As a nursing home consultant and her company’s top mentor for clients that have run afoul of their states’ regulatory commissions, my wife travels nearly every week of the year. As a result, I have a ton of second-hand experience with airline delays and passengers being stuck in an airport far from home…especially when she flies one particular airline, which shall remain anonymous (but which rhymes with “You Guess Stair Ways”).
So when Susan Carey at The Wall Street Journal reported last week on the recent spate of computer systems outages among U.S. air carriers, my initial reaction was, “So what else is new?”
But as I read more about these issues, I began to notice that they seem to bear a significant resemblance to the financial systems outages that occurred back in February and March. This resemblance became even more pronounced when Carey pointed to a major culprit behind the outages, noting:
The problems, often arcane, can be caused by bad hardware, corrupted software, the failure of backup systems to kick in or human error. Electric power supplies can go on the fritz, and so can telecom networks connecting internal airline operations with airports and data centers.
Experts say the disruptions often occur when an airline or technology vendor is performing maintenance, installing an upgrade or making a major technology transfer.
Haven’t we been here before?
The June computer outages Carey wrote about resulted in hundreds of cancelled flights, at least twice as many delays and tens of thousands of passengers, my wife included, stuck in airports and feeling very inconvenienced. And while some airlines, like Alaska Airlines, move quickly to compensate passengers for their inconvenience, one has to wonder how these already put-off customers would feel if they knew their hassle had been brought on by a potentially avoidable problem.
As Carey noted, many of the problems stem from existing software that has been upgraded, customized or overlaid with new versions. Power supply issues notwithstanding, the vulnerabilities and instability of old software should not be the reason tens of thousands of people across the country are stuck eating airport food or sleeping overnight on benches.
Truth is, software failures like the ones experienced by the airlines have become all too commonplace across industries. We treat news of software failures as though they were inevitable, almost expected, and, particularly with an industry like the airlines, we accept the apologies that are offered because we have no alternative.
But why? When exactly did we decide that software failure was an unavoidable part of business and an acceptable excuse to leave us stranded hundreds or even thousands of miles from home?
As we’ve said in this space in the past about other industries, the airlines need to do a better job of assessing the structural quality of software before it is deployed, rather than waiting for it to fail, fixing the problem and then apologizing for it. It’s not like they don’t know what causes poor software quality.
A quick application of automated analysis and measurement to diagnose the structural quality and health issues within the application software of the airlines’ systems would go much further toward making the skies a friendlier place to fly…and get my wife home faster, too!