How Structural Defects Cause Mayhem – A Software Engineer’s Point of View

by Arka Chakraborty

Software engineering practices have changed a great deal over the past couple of decades. CIOs and software architects are now as concerned about the nonfunctional aspects of software applications (robustness, security, maintainability, transferability, etc.) as they are about the functional ones. While those concerns are strategic in nature, in this article I want to share my own experience - an engineer's view of how badly written code causes software engineering mayhem.

It was back in 2008. Having just graduated, I joined one of the most prestigious organizations as a software engineer. My first assignment was to work on a business-critical IT system - a system used by more than 10,000 users on a daily basis. A critical defect in this system was a major pain point for all the stakeholders (from the engineers to the CIO) because it directly impacted the business. A few days into my first assignment, I overheard from my colleagues that the system was not in great shape and that users had recently reported multiple performance glitches. I could sense the tension all around and wondered whether being assigned to the project was a punishment!

One month down the line, when I was still trying to get a handle on things, I came across the monster - a priority one defect in production. I finally learned what a priority one defect means. It was complete chaos. People were running around everywhere. Leads, managers, and architects were on never-ending calls. People tried hard to fix the issue for a couple of days with no luck. It could not be reproduced in the test environment. The situation was scary. One evening a couple of days later, three colleagues and I were called into my business unit head's cabin. He just said:

“There is a production defect. I want you to fix that. I am dividing you into two teams. One will work the normal shift and transfer the knowledge to the other team, which will work at night. The only way to get out of it is by fixing the issue.” Unfortunately, I was chosen to work at night, and I can’t really describe what a nightmare it was for me.

Five days of round-the-clock shifts later, we finally figured out the issue. It was due to bad code. Somebody had written a database query inside a loop, which caused a database lock, and only in a very rare case. We had to change the entire logic, which is why the fix took so long (5 days is a very long time for a production P1 defect). Subsequently, we worked very hard and were able to fix all the performance issues on time; most of them were due to bad coding practices. The system became stable within 6 months or so. That was the start of my professional career.

Over the past decade, a lot of people across the globe have experienced similar horrors in their careers. This underscores why nonfunctional improvement has become one of the most important topics of discussion among software engineering practitioners. Fortunately, things have improved. There are now software intelligence tools on the market that can identify such system-level structural defects. These tools, along with helping CIOs protect and improve their IT infrastructure, are surely making life more comfortable for all software engineering practitioners.

Arka Chakraborty, Global Product Marketing Manager
A global product manager, strategic marketing leader and an IIM Calcutta alumnus, Arka has vast expertise in end-to-end product management on a global scale.