When dealing with Software Analysis and Measurement benchmarking, people's behavior generally falls into one of two categories: either they happily compare any result with any other, or they object to any comparison whatsoever. As is often the case, there is no sensible middle ground.
Some of the most common reasons for objecting to comparing SAM results are that the applications under comparison rely on different technology stacks, that the results come from different releases of the measurement platform, and that the human contextual input behind each measurement differs.
All of the above reasons make perfect sense, and one has to be well aware of such situations. However, they are no grounds for dismissing any possibility of comparison altogether.
A well-designed measurement model can help overcome these challenges. Indeed, with a measurement model that aligns with a Goal-Question-Metric approach, the three levels act like a built-in clutch.
How so? Because the Metrics required to answer each Question (the questions one asks to measure how well each Goal is achieved) can differ from one technology stack to another, or evolve from one release of the measurement platform to the next, without invalidating the results. The same is true of the number and nature of the Questions themselves.
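To make the idea concrete, here is a minimal sketch in Python of what such a three-level model could look like. The `Goal`, `Question`, and `Metric` classes and the rule names are hypothetical illustrations, not CAST's actual model; the point is that the set of Metrics under a Question can vary by technology without changing the Goal-level structure.

```python
from dataclasses import dataclass, field

@dataclass
class Metric:
    name: str
    compliant: int  # objects that pass the rule's check
    checked: int    # objects the rule applies to

@dataclass
class Question:
    text: str
    metrics: list[Metric] = field(default_factory=list)

@dataclass
class Goal:
    name: str
    questions: list[Question] = field(default_factory=list)

# The Metrics feeding a Question may differ per technology stack
# (or per release of the platform) without changing the Goal's shape.
changeability_cobol = Goal("Changeability", [
    Question("Is control flow kept simple?", [
        Metric("avoid-deep-PERFORM-nesting", compliant=180, checked=200),
    ]),
])

changeability_jee = Goal("Changeability", [
    Question("Is control flow kept simple?", [
        Metric("avoid-deep-method-nesting", compliant=900, checked=1000),
        Metric("avoid-excessive-polymorphism", compliant=45, checked=60),
    ]),
])
```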
Likewise, with a measurement model that uses a compliance ratio to consolidate and aggregate results (as opposed to a raw count of non-conformities), the statistical processing itself acts as a built-in clutch.
At the Question level, the number of contributing Metrics can differ: capabilities differ from one technology to another, and so do the sources of issues, hence a different number of Metrics. As long as the Question is properly answered by that smaller or larger set of Metrics, the difference is normal and acceptable.
At the Goal level, the number of contributing Questions can likewise differ: the difference is inherent to the technology and is therefore normal and acceptable.
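Continuing the hypothetical sketch above, a ratio-based aggregation (illustrative numbers, not CAST's actual formulas) shows why different Metric and Question counts remain comparable: every level is normalized to a score between 0 and 1.

```python
def metric_ratio(m: Metric) -> float:
    # Compliance ratio for one rule: share of checked objects that pass.
    return m.compliant / m.checked if m.checked else 1.0

def question_score(q: Question) -> float:
    # Averaging ratios (instead of summing violations) absorbs the fact
    # that one technology may need more Metrics to answer the same Question.
    return sum(metric_ratio(m) for m in q.metrics) / len(q.metrics)

def goal_score(g: Goal) -> float:
    # Likewise at the Goal level: more or fewer Questions, same 0-to-1 scale.
    return sum(question_score(q) for q in g.questions) / len(g.questions)

print(f"COBOL changeability: {goal_score(changeability_cobol):.2f}")  # 0.90
print(f"JEE changeability:   {goal_score(changeability_jee):.2f}")    # averages 0.90 and 0.75
```

The flat averaging is deliberately naive; a real platform would weight its metrics, but the normalization argument stays the same.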
For example, there is no way to over-complexify COBOL code with coding and architectural practices related to object-oriented capabilities, while there are many ways to do so in JEE. This comparison might seem unfair to JEE, which is assessed against more rules and technical criteria than COBOL, but the difference is inherent to their respective capabilities, and a fair assessment of transferability or changeability risk must take it into account.
As this assessment result will guide resource-allocation decisions, it is critical to know that a given JEE component carries not only some regular algorithmic, SQL, or XYZ complexity, but also an additional load of complexity due to excessive polymorphism or XYZ.
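Under the same assumptions, a simple drill-down over the sketch above shows how one could surface the metrics that drag a component down, separating ordinary complexity from the technology-specific load:

```python
def worst_metrics(goal: Goal, threshold: float = 0.8) -> list[tuple[str, float]]:
    # Flag the rules whose compliance ratio falls below a chosen threshold,
    # worst first, to show where the extra complexity load comes from.
    flagged = [
        (m.name, metric_ratio(m))
        for q in goal.questions
        for m in q.metrics
        if metric_ratio(m) < threshold
    ]
    return sorted(flagged, key=lambda pair: pair[1])

print(worst_metrics(changeability_jee))
# [('avoid-excessive-polymorphism', 0.75)] -- an OO-specific load, absent in COBOL
```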
New releases of the measurement system are designed to deliver more accurate assessment results. However, added accuracy will impact results. The use of a compliance ratio can help limit that impact, insofar as it has to be limited.
Why not rely on a raw count of violations?
With a raw count, the number of violations would keep increasing regardless of the actual quality level: any new rule turns previously invisible quality issues into visible ones, even when the compliance ratio for the new rule is better than the rest of the pack.
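To make this concrete, here is a small worked example with made-up numbers, contrasting the raw count with the overall compliance ratio when a new release adds a rule:

```python
# Each rule is a (compliant, checked) pair; numbers are illustrative only.
release_1 = [
    (800, 1000),  # 200 violations, ratio 0.80
    (450, 500),   # 50 violations, ratio 0.90
]
# Release 2 adds a rule whose own compliance is better than the rest.
release_2 = release_1 + [(290, 300)]  # 10 violations, ratio ~0.97

def total_violations(rules):
    return sum(checked - compliant for compliant, checked in rules)

def overall_compliance(rules):
    total_compliant = sum(compliant for compliant, _ in rules)
    total_checked = sum(checked for _, checked in rules)
    return total_compliant / total_checked

print(total_violations(release_1), total_violations(release_2))  # 250 260: raw count rises
print(f"{overall_compliance(release_1):.3f} -> {overall_compliance(release_2):.3f}")
# 0.833 -> 0.856: the ratio actually improves
```

The raw count punishes the new, well-respected rule; the ratio credits it.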
This last area is perhaps the most delicate, because the ability to compare relies on the assumption that, while human contextual input differs, it is set fairly.
To expect fairness may seem naïve, but much of the measurement process already relies on some amount of fairness. For example, what are the true boundaries of the application you're measuring? Nowadays, with cross-application principles, the definition is blurred, and it would be easy to omit part of an application to hide some facts from management.
One could also define a target Architecture that works in one's own best interest, but that would mean entering flawed configuration data into the system, data that anyone can review.
One could likewise declare libraries as vetted so that security-related vulnerabilities go unreported, improving assessment results, but that again would mean entering flawed configuration data.
Yes, there can be true differences in the way multiple applications are assessed. But this is no reason to dismiss benchmarking altogether; what matters is not hiding the differences and their impact.
As Dr. Bill Curtis stated during the CISQ (http://www.it-cisq.org/) Seminar in Berlin on June 19th, when explaining the key factors of a clever productivity analysis: always inspect the data, investigate outliers and extreme values, and always question the results.
In other words, always use your brain.