Spolsky gives the excellent example of server downtime in comparison to airplane crashes:
"Measuring the number of minutes of downtime per year does not predict the number of minutes of downtime you'll have the next year. It reminds me of commercial aviation today: the NTSB has done such a great job of eliminating all the common causes of crashes that nowadays, each commercial crash they investigate seems to be a crazy, one-off, black-swan outlier."

As time goes on, an increasing proportion of problems derive from rare events. The high-frequency events at the hump of the frequency distribution get "swept out", along with any early-occurring rare problems from the tails. This frees up attention for the truly bizarre and unforeseeable events. Note that this doesn't work well if your Black Swans are catastrophic. If you get hit by a civilization-ending asteroid or a sudden business-ending event, you have little opportunity to learn from the experience (and little benefit from doing so), but this kind of problem is thankfully extremely rare in our collective history.
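To make the "swept out" effect concrete, here is a rough sketch under assumptions that are mine, not Spolsky's: there are 200 hypothetical failure causes with Zipf-like rates, each cause strikes at random (Poisson) times, and each is fixed for good the first time it occurs. The rates, the cutoff for "rare", and the function name are all made up for illustration.

```python
import math

def rare_share(rates, t, rare_cutoff):
    """Expected share of incidents at time t that come from 'rare' causes."""
    # Assumption: a cause with rate r is still unfixed at time t with
    # probability exp(-r * t), so it contributes r * exp(-r * t) to the
    # expected incident rate at that moment.
    live = [(r, r * math.exp(-r * t)) for r in rates]
    total = sum(c for _, c in live)
    rare = sum(c for r, c in live if r < rare_cutoff)
    return rare / total

# Hypothetical spread of causes: a few common ones and a long tail of rare ones.
rates = [1.0 / k for k in range(1, 201)]
times = (0, 5, 20, 80)

shares = [rare_share(rates, t, rare_cutoff=0.05) for t in times]
for t, s in zip(times, shares):
    print(f"t={t:>3}: {s:.0%} of expected incidents come from rare causes")
```

Under these assumptions the share of incidents caused by rare events only goes up over time, because the common causes are the first to be hit and fixed. The model says nothing about catastrophic events, where there is no "next time" to learn from.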