What is Software Composition Analysis (SCA)?
Today, organizations rarely write 100% original code when building custom software applications. Developers regularly use open source software (OSS) components and third-party frameworks that are freely available on the internet to speed up development and reduce time to market. In fact, over 70% of software applications use open source components.[1]
However, using open source software introduces serious risks into software applications, including:
- Common Vulnerabilities & Exposures (or CVEs) that create a security risk
- Intellectual Property (IP) and open source licensing requirements that create legal risks
- Obsolete software components that create operational risks
To put this into context, 60% of enterprise codebases contain at least one security vulnerability due to third-party components, and 41% of GitHub projects are published under licenses that present potential legal risks.[2] Historically, organizations tried to manage this risk manually, using spreadsheets and documents to track the open source components their developers used. This quickly becomes unmanageable, especially in organizations with hundreds or thousands of applications that may use tens of thousands of open source components. The need to automate this analysis and management gave rise to a category of open source risk management products called Software Composition Analysis (SCA).
Software Composition Analysis (SCA) Defined and How it Works
Definition: A category of software products that analyze custom-built software applications to detect embedded open-source software and automatically identify security, licensing, and operational risks.
SCA products typically work as follows:
- An analysis engine automatically scans through software source code and all the associated build artifacts used to compile a custom software application.
- The engine detects OSS components and the version of each in use, identifying the “composition” of the software application. This data is often logged in a database to maintain a catalog of the OSS in use.
- This catalog is matched against data sources to identify specific metadata about each component detected such as known security vulnerabilities in the component, the licensing requirements for using the component, the age of the component, and more.
- All this data is then presented to end users in various forms that may include report documents, online dashboards, and other formats.
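The steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any particular SCA product works: the `name==version` manifest format, the `CATALOG` dictionary standing in for external data sources, and every component name, identifier, and date in it are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    name: str
    version: str


# Hypothetical metadata catalog standing in for external data sources.
# Every entry, including the placeholder CVE-style identifier, is made up.
CATALOG = {
    ("examplelib", "1.2.0"): {
        "vulnerabilities": ["CVE-0000-0001"],
        "license": "MIT",
        "released": 2016,
    },
    ("otherlib", "3.4.1"): {
        "vulnerabilities": [],
        "license": "GPL-3.0-only",
        "released": 2023,
    },
}


def detect_components(manifest_text: str) -> list[Component]:
    """Scan a manifest and return the pinned 'name==version' entries."""
    components = []
    for raw in manifest_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if "==" not in line:
            continue  # blank or unpinned lines are ignored in this sketch
        name, version = line.split("==", 1)
        components.append(Component(name.strip().lower(), version.strip()))
    return components


def enrich(components: list[Component]) -> list[dict]:
    """Match each detected component against the metadata catalog."""
    findings = []
    for component in components:
        meta = CATALOG.get((component.name, component.version))
        findings.append(
            {"component": component, "known": meta is not None, **(meta or {})}
        )
    return findings


def report(findings: list[dict]) -> list[str]:
    """Render one human-readable line per component."""
    lines = []
    for f in findings:
        c = f["component"]
        if not f["known"]:
            lines.append(f"{c.name} {c.version}: no metadata found")
            continue
        vulns = ", ".join(f["vulnerabilities"]) or "none"
        lines.append(
            f"{c.name} {c.version}: license={f['license']}, "
            f"vulnerabilities={vulns}, released={f['released']}"
        )
    return lines
```

Calling `report(enrich(detect_components(text)))` on a manifest string yields one summary line per pinned component. A real engine would also inspect build artifacts and binaries rather than a single manifest file.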
SCA in Organizations Today (Typical Use Cases and Scenarios)
SCA is utilized in many enterprises today by various divisions and teams including the Chief Information Officer (CIO) organization, the Chief Technology Officer (CTO) organization, Chief Enterprise Architects (EA), Open Source Governance & Compliance teams, Chief IP / Compliance officers, Chief Information Security Officers (CISO), and more. There are several common use cases including:
- Security Risk Management (vulnerability management): SCA products typically identify and report known security vulnerabilities (CVEs) that are tracked in the National Vulnerability Database (NVD). Some SCA products will also reference additional public sources of vulnerability data beyond the NVD or even have their own proprietary knowledgebase of vulnerability data.
- IP / Legal Compliance (open source licensing): Although most OSS components are free to download and use, that does not mean they come without strings attached. Most OSS components include a license that outlines how the component can legally be used and the terms for avoiding software copyright infringement. Some licenses are “permissive” and have minor legal requirements. Other licenses are more “restrictive” and have strict legal requirements on how the component can be used. An example of a potentially risky licensing requirement is “copyleft”, which comes in multiple forms including “weak copyleft” and “strong copyleft”. Licenses with copyleft language may require that if the component is used in a piece of software, then the entire resulting piece of software and its source code must be shared publicly as open source software. For an organization such as an Independent Software Vendor (ISV) that develops commercial software for sale to its clients, copyleft can present significant legal risk and challenges with open source compliance. There are hundreds of different open source licenses, each with its own legal requirements. SCA products are often used to detect all the licenses embedded in the OSS components used in an organization's software applications and to evaluate the associated legal requirements. Some SCA products also help interpret the legal requirements of different open source licenses and recommend ways to address the risks to help ensure open source compliance.
- Obsolescence (deprecated components): Open source software is developed and maintained publicly by a community of developers. It is usually published in popular open source repositories such as GitHub, Maven, PyPI, NuGet, and many others. Some popular OSS components have a very active community with multiple versions released every year. At the other end of the spectrum, some components may no longer be actively supported by the community, for example because a competing component has become more popular. These deprecated components may become obsolete, and an organization that continues to depend on an obsolete component runs significant operational risk, as newer versions are unlikely to be released. SCA products are often used to discover potentially obsolete components that are no longer supported so that organizations can decide whether to transition to a component that is more actively maintained.
- Technology Due Diligence: A common practice prior to a corporate Merger & Acquisition (M&A) transaction is to perform technical due diligence on the target firm to better understand the state of its information technology and any potential risks to the deal based on this information. Advisory firms are often hired to perform this technical due diligence as they specialize in this area. More and more technology due diligence is now incorporating an SCA assessment as part of the process to identify security, legal, and obsolescence risks within the software of a target firm that could have a financial impact on the M&A transaction. SCA products are often used, especially those that can be deployed rapidly and provide accurate results in as little as a few days due to the typically aggressive timelines of these technical due diligence assessments.
- Software Bill of Materials (SBOM): Although more of an output format than a full use case, the creation of a Software Bill of Materials (SBOM) is a common scenario for SCA that supports open source software governance requirements. SBOMs are detailed catalogs of all the open source components, and their associated attributes, included in a software application. Software vendors are often required to deliver an updated SBOM along with any new version of an application they deliver to a client. Some public sector entities, such as the US Federal Government, require vendors to include an SBOM with any software delivered to one of their agencies. SCA products are used to generate updated and accurate SBOMs in various formats to meet the needs of software sellers and buyers. Some SCA products support industry standard formats (such as CycloneDX and SPDX) that enable more automated consumption of the SBOM by other systems and tools.
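To make the SBOM use case more concrete, here is a minimal sketch that turns a component list into a CycloneDX-shaped JSON document. It assumes the components are PyPI packages (hence the `pkg:pypi/...` package URLs), hard-codes spec version 1.5, and uses made-up component names. The output has not been validated against the CycloneDX schema, and real tools typically add far more detail, such as hashes, suppliers, and document metadata.

```python
import json


def build_sbom(components):
    """Build a minimal CycloneDX-style document.

    `components` is an iterable of (name, version, license_id) tuples.
    Treating every component as a PyPI package is an assumption of this sketch.
    """
    return {
        "bomFormat": "CycloneDX",
        "specVersion": "1.5",
        "version": 1,
        "components": [
            {
                "type": "library",
                "name": name,
                "version": version,
                # Package URL identifying the component in its ecosystem
                "purl": f"pkg:pypi/{name.lower()}@{version}",
                "licenses": [{"license": {"id": license_id}}],
            }
            for name, version, license_id in components
        ],
    }


# Hypothetical components, for illustration only
sbom = build_sbom([
    ("examplelib", "1.2.0", "MIT"),
    ("otherlib", "3.4.1", "GPL-3.0-only"),
])
sbom_json = json.dumps(sbom, indent=2)
```

Because the result is plain JSON with a predictable structure, another system can load it with `json.loads` and iterate over `components` without any manual interpretation, which is the main benefit of a standard format over a spreadsheet or document.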
SCA Technologies and Tools
The capabilities of different SCA products available on the market today vary quite dramatically. However, there are common features that are expected to be foundational to any viable SCA product such as:
- Automated analysis of software source code, dependency files, and other code artifacts
- Automated detection of indirect references to OSS components and transitive dependencies within code and artifacts
- Identification of CVEs existing in the codebase
- Identification of the legal licenses in use within detected OSS components
- SBOM generation
- Reporting of the SCA data in various formats
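Detecting transitive dependencies, one of the foundational features listed above, amounts to walking a dependency graph. The sketch below uses a breadth-first search over a hypothetical, hand-written graph; in practice the graph would come from lock files or package metadata, and all package names here are invented.

```python
from collections import deque

# Hypothetical dependency graph: each key depends directly on the listed packages.
DEPENDENCIES = {
    "app": ["web-framework", "json-parser"],
    "web-framework": ["http-core", "template-engine"],
    "http-core": ["tls-lib"],
    "template-engine": [],
    "json-parser": [],
    "tls-lib": [],
}


def transitive_dependencies(graph, root):
    """Return {dependency: shortest path from root} for everything root pulls in."""
    paths = {}
    queue = deque([(root, [root])])
    while queue:
        node, path = queue.popleft()
        for dep in graph.get(node, []):
            if dep == root or dep in paths:
                continue  # already reached; this also guards against cycles
            paths[dep] = path + [dep]
            queue.append((dep, paths[dep]))
    return paths


paths = transitive_dependencies(DEPENDENCIES, "app")
indirect = sorted(name for name, path in paths.items() if len(path) > 2)
```

Keeping the path to each dependency, rather than only its name, lets a report explain why an indirect component such as `tls-lib` is present: it arrives through `web-framework` and `http-core`, even though the application never references it directly.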
More innovative SCA products include advanced capabilities such as:
- Rapid deployment technology that can generate results across many applications in a few days
- Proprietary component databases that can contain upwards of billions of component records
- Additional vulnerability data sources beyond the NVD
- Automated recommendations on where to focus attention across application portfolios, prioritizing the most business-critical applications with the most serious risks
- Automated and instant notifications when new risks are detected within an OSS component
- Automated recommendations on safer component versions to adopt when critical CVEs are detected
- An open source license manager that automatically interprets the legal requirements of open source licenses
- SBOM generation in multiple formats for different types of users including Excel, Word, PowerPoint, XML, CycloneDX, and others
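As an illustration of the "safer component version" recommendation listed above, the following sketch picks the nearest newer release that has no known vulnerabilities. It assumes plain dotted numeric versions and a hypothetical set of vulnerable versions; real ecosystems use richer version schemes (pre-releases, build tags), and a real product would also weigh compatibility and license changes.

```python
def parse_version(text):
    """Turn '1.10.0' into (1, 10, 0) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def recommend_upgrade(current, available, vulnerable):
    """Return the nearest version newer than `current` that is not in
    `vulnerable`, or None if no such version exists."""
    current_key = parse_version(current)
    candidates = sorted(
        (
            v
            for v in available
            if parse_version(v) > current_key and v not in vulnerable
        ),
        key=parse_version,
    )
    return candidates[0] if candidates else None


# Hypothetical release history and vulnerability data
releases = ["1.2.0", "1.2.1", "1.10.0", "2.0.0"]
affected = {"1.2.0", "1.2.1"}
suggestion = recommend_upgrade("1.2.0", releases, affected)
```

Choosing the nearest safe version rather than the latest one is a deliberate design choice in this sketch: a smaller version jump is usually less likely to introduce breaking changes. Comparing parsed tuples instead of raw strings matters because, as text, "1.10.0" would sort before "1.2.1".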
[2] Computerworld and CAST Research