The Software Package Data Exchange (SPDX) specification defines an open standard for communicating information about software components. SPDX is used to create Software Bills of Materials (SBOMs), encapsulate licensing and copyright details, and provide package metadata such as version identifiers and references to known vulnerabilities.
SPDX was originally designed over a decade ago as a way to help developers comply with open-source licenses. Since then it’s been extended with new capabilities for describing dependency trees and issuing SBOMs. SPDX was catapulted to global attention in September 2021 when it was recognized as an international standard for software supply chain documentation, published as ISO/IEC 5962:2021.
Who Creates SPDX?
SPDX is a standalone project hosted by the Linux Foundation, a non-profit consortium. The current standard has been authored and supported by interested parties from across the software industry. The list includes big names like Google and Microsoft as well as the developers of adjacent tools such as Anchore and Snyk.
SPDX is the culmination of these vendors’ experience of managing, documenting, and maintaining software supply chains at scale. The project is open to outside participation though – any individual or company that works with SPDX artifacts can join the monthly general meeting or subscribe to the mailing list.
What Is SPDX For?
SPDX is an industry-standard way to describe software packages and their dependencies. It’s meant to work across vendors, programming languages, and frameworks. The standard reduces the effort involved in producing an SBOM for a project by abstracting the differences between package formats and increasing interoperability across the ecosystem.
SBOMs are still a new concept to many developers. A natural place to start is with the package manager manifests and lockfiles you already have. Yet simply copying and pasting your package.json into a documentation file isn’t a resilient solution. What happens when you’ve also got a composer.json for your backend and a requirements.txt for that standalone Python component? Now you’ve got three independent package sources to audit, check for license compliance, and match against vulnerability lists.
SPDX provides a unified way to structure, store, and query this information. The specification defines an unambiguous format for communicating any software package’s metadata. The key here is in that “software package” vocabulary: SPDX isn’t tied to npm packages or Ruby gems, and instead takes a higher-level view of the development landscape.
An SPDX package definition includes data such as the component’s name, version, author, and license. Packages can reference other packages to define their dependency trees. The format also supports file-level data to describe a package’s content.
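As a rough illustration of that structure, the following Python sketch assembles a minimal SPDX-style JSON document containing two packages and one dependency relationship. The field names follow the SPDX 2.3 JSON serialization as best understood here; the package names, supplier, namespace URL, and helper functions are all hypothetical, and a real document should be checked with an SPDX validator.

```python
import json
from datetime import datetime, timezone


def make_package(spdx_id, name, version, supplier, license_id):
    """Return one package entry using SPDX 2.3 JSON field names."""
    return {
        "SPDXID": spdx_id,
        "name": name,
        "versionInfo": version,
        "supplier": supplier,
        "licenseDeclared": license_id,
        # NOASSERTION is the SPDX value for "no claim is being made".
        "licenseConcluded": "NOASSERTION",
        "downloadLocation": "NOASSERTION",
        "filesAnalyzed": False,
    }


def make_document(name, namespace, packages, dependencies):
    """Assemble a document; `dependencies` is a list of (from_id, to_id) pairs."""
    created = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    # The document DESCRIBES its first package, which is treated as the root.
    relationships = [{
        "spdxElementId": "SPDXRef-DOCUMENT",
        "relationshipType": "DESCRIBES",
        "relatedSpdxElement": packages[0]["SPDXID"],
    }]
    for source_id, target_id in dependencies:
        relationships.append({
            "spdxElementId": source_id,
            "relationshipType": "DEPENDS_ON",
            "relatedSpdxElement": target_id,
        })
    return {
        "spdxVersion": "SPDX-2.3",
        "dataLicense": "CC0-1.0",
        "SPDXID": "SPDXRef-DOCUMENT",
        "name": name,
        "documentNamespace": namespace,
        "creationInfo": {"created": created, "creators": ["Tool: example-script"]},
        "packages": packages,
        "relationships": relationships,
    }


if __name__ == "__main__":
    # Hypothetical packages, used purely for illustration.
    app = make_package("SPDXRef-app", "example-app", "1.0.0",
                       "Organization: Example Corp", "MIT")
    lib = make_package("SPDXRef-lib", "example-lib", "2.4.1",
                       "Organization: Example Corp", "Apache-2.0")
    doc = make_document("example-app-sbom", "https://example.com/spdx/example-app-1.0.0",
                        [app, lib], [("SPDXRef-app", "SPDXRef-lib")])
    print(json.dumps(doc, indent=2))
```

Nothing in the document says which package manager either package came from, which is the point: the same shape works whether the dependency was installed with npm, Composer, or pip.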
Creating SPDX-Compatible SBOMs
Tooling around SPDX is still emerging. A prominent community project is the OSS Review Toolkit (ORT), which can generate SPDX metadata from most common package manager formats. The Open SBOM Generator is an alternative that’s more narrowly focused on SPDX-compatible SBOM generation.
You’ll also find SPDX support in many popular dependency list generators and vulnerability scanners. Once you’ve got an SPDX-formatted file, it can be fed into other tools and libraries to conduct further analysis.
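To give a sense of what that further analysis can look like, here is a minimal sketch that reads an SPDX JSON document using only the Python standard library, then groups packages by declared license and lists each package’s direct dependencies. It assumes the SPDX 2.3 JSON field names; the sample document, its package names, and the two helper functions are invented for illustration and are not part of any official SPDX library.

```python
import json
from collections import defaultdict

# A small, hypothetical SBOM in SPDX JSON form (trimmed to the fields used here).
SAMPLE_SBOM = """
{
  "spdxVersion": "SPDX-2.3",
  "SPDXID": "SPDXRef-DOCUMENT",
  "name": "example-app-sbom",
  "packages": [
    {"SPDXID": "SPDXRef-app", "name": "example-app", "versionInfo": "1.0.0", "licenseDeclared": "MIT"},
    {"SPDXID": "SPDXRef-http", "name": "example-http", "versionInfo": "2.4.1", "licenseDeclared": "Apache-2.0"},
    {"SPDXID": "SPDXRef-json", "name": "example-json", "versionInfo": "0.9.3", "licenseDeclared": "MIT"}
  ],
  "relationships": [
    {"spdxElementId": "SPDXRef-DOCUMENT", "relationshipType": "DESCRIBES", "relatedSpdxElement": "SPDXRef-app"},
    {"spdxElementId": "SPDXRef-app", "relationshipType": "DEPENDS_ON", "relatedSpdxElement": "SPDXRef-http"},
    {"spdxElementId": "SPDXRef-http", "relationshipType": "DEPENDS_ON", "relatedSpdxElement": "SPDXRef-json"}
  ]
}
"""


def packages_by_license(doc):
    """Group "name@version" strings under each declared license identifier."""
    grouped = defaultdict(list)
    for pkg in doc.get("packages", []):
        license_id = pkg.get("licenseDeclared", "NOASSERTION")
        grouped[license_id].append(f'{pkg["name"]}@{pkg.get("versionInfo", "unknown")}')
    return {license_id: sorted(names) for license_id, names in grouped.items()}


def direct_dependencies(doc):
    """Map each package name to the names it directly DEPENDS_ON."""
    names = {pkg["SPDXID"]: pkg["name"] for pkg in doc.get("packages", [])}
    deps = defaultdict(list)
    for rel in doc.get("relationships", []):
        if rel.get("relationshipType") != "DEPENDS_ON":
            continue
        source = names.get(rel["spdxElementId"])
        target = names.get(rel["relatedSpdxElement"])
        if source and target:
            deps[source].append(target)
    return dict(deps)


if __name__ == "__main__":
    document = json.loads(SAMPLE_SBOM)
    print(packages_by_license(document))
    print(direct_dependencies(document))
```

Because the input is plain JSON, the same functions would work on a file produced by any SPDX-capable generator, provided it uses the JSON serialization rather than tag-value or RDF.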
The SPDX Project is developing SPDX client libraries for Java, Python, Go, and JavaScript. These will make it easier to consume SPDX data within your own applications. There’s also a website that offers SPDX parsing, validation, and comparison functions.
Why Does All This Matter?
Prominent supply chain attacks have exposed the fragility of modern software development. It’s too easy to end up with sprawling dependency trees referencing hundreds or even thousands of distinct packages. A vulnerability in any of these could threaten your application, even if the code you’ve written yourself is watertight.
The challenges of software supply chains were directly referenced by the U.S. executive order on cybersecurity (Executive Order 14028) in May 2021. Now it’s imperative that open-source maintainers, downstream users, and software-driven organizations work to provide more visibility into the components that make up the world’s software. A common format for producing, sharing, and analyzing SBOMs is an important part of the solution.
Having SPDX accredited as an ISO standard gives the industry a collective reference point. It’s a specification that could reasonably be mandated as a required deliverable for future software projects. Government agencies, regulated industries, and security-conscious clients might demand SPDX artifacts so they can be consistent in how they audit and secure their supply chains.
Cross-industry SPDX adoption would let organizations be more confident in the security of new software they procure. Identifying whether systems are impacted by critical zero-day vulnerabilities would be a case of consulting the SBOM, instead of manually inspecting the various package manager formats used in a project.
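A minimal sketch of that kind of lookup is shown below. It assumes each system’s SBOM has already been loaded into a dictionary that follows the SPDX JSON layout, and it matches on exact version strings only; the system names, the package name, and the affected_systems helper are hypothetical. Real advisories usually describe version ranges, which would need a proper version comparison.

```python
def affected_systems(sboms, package_name, bad_versions):
    """Map each affected system name to the matching package versions it ships."""
    hits = {}
    for system, doc in sboms.items():
        versions = sorted(
            pkg.get("versionInfo", "")
            for pkg in doc.get("packages", [])
            if pkg.get("name") == package_name
            and pkg.get("versionInfo") in bad_versions
        )
        if versions:
            hits[system] = versions
    return hits


if __name__ == "__main__":
    # Hypothetical SBOMs for three systems, trimmed to the fields used here.
    sboms = {
        "billing-api": {"packages": [
            {"name": "example-logger", "versionInfo": "2.14.0"},
            {"name": "example-http", "versionInfo": "2.4.1"},
        ]},
        "storefront": {"packages": [
            {"name": "example-logger", "versionInfo": "2.17.1"},
        ]},
        "reporting": {"packages": [
            {"name": "example-json", "versionInfo": "0.9.3"},
        ]},
    }
    # Versions named in a hypothetical security advisory.
    print(affected_systems(sboms, "example-logger", {"2.14.0", "2.14.1"}))
```

The query never needs to know whether example-logger arrived via npm, Composer, or pip, because the SBOM has already normalized that detail away.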
While the earliest versions of the standard have been around for a decade, only recently has the move towards automated SBOMs, compliance, and dependency indexing begun to gain momentum. Becoming an ISO standard might prove to be the tipping point where developers adopt SPDX at scale.
Conclusion
SPDX is now an internationally recognized way to document software supply chains. It’s a format that’s designed from scratch to express the relationships between software packages. It can be used to produce a bill of materials, validate license compliance, and determine authorship and ownership.
With scrutiny of software supply chains only getting more intense, the SPDX ecosystem is something developers, operations teams, and even legal teams can expect to see a lot more of over the next few years. It provides a way to navigate the relationships between software components and the constraints that are placed around their use.
Having a standardized way of providing this information gives the industry something specific to aim for when creating SBOMs and auditing supply chains. SPDX addresses the ambiguity that’s prevailed to date, offering a tested and robust cataloging format to replace organization-specific package indexes and spreadsheets.