Murphy's Law

Things tend to go wrong at the worst possible moment, when you least expect it. Sometimes we see that something is about to go wrong, patch it quickly, and push the real solution into the near future. And every time we take that bet, every time we choose postponing an issue over solving it thoroughly, it comes back and bites us in the ass.

Such a thing happened today. Our main Open Directory (OD) at the office had been having replica issues for a while, and two days ago the bugger finally gave out. File sharing started either blocking users from logging in or giving them access to the wrong folder because of duplicate user IDs. A quick reboot later everything was solved, and I decided to leave it be and take a decent look at it next week, along with an upgrade to 10.8.4 and some other fixes. And guess what: today, half an hour before calling it a day, the OD finally crashed while editing one of the replicas… And it didn’t just crash, it decided to show up empty.

A reboot, a sudo slapconfig destroy and a restore later, combined with some terminal magic, the server is finally back online, with a couple of replicas running smoothly alongside it. It only took me 5 hours, a lot of frustration and some crossed fingers. But I managed to fix it without any user losing data or needing to reset a password, so aside from the few colleagues who read this blog, no one should be any the wiser.

I’m not writing this because I want to cry foul and blame the crash on myself or someone else, or because I want to show off how fast I was able to fix it. No, I’m writing this because today I learned three things:

  1. If you plan to fix it, fix it, and don't hope you can let it roll for a few weeks. It'll probably crash.
  2. OS X is such a beautiful system. Doing what I did today shouldn't be possible. But somehow it was, and everything was working fine again in a very short time.
  3. I love my job.

Tomorrow will be a day of cleaning up the last pieces, and bringing the other servers back up to date, but our main set is up and running again.

Do. Or do not. There is no try.