Go Beyond Discovering the Package Manager for Your SBOM – The New Stack



Shripad Nadgowda

Shripad is a senior technical staff member at IBM Research. He is passionate about driving research innovations that bring differentiating capabilities to the cloud. His current areas of research include DevSecOps, development tools, and basically anything related to container security.

The security and integrity of the software supply chain is a fundamental requirement of any overall cybersecurity assessment. The first step in securing the software supply chain is the ability to provide a complete, accurate, and verifiable record of every dependency that goes into building a software deliverable, commonly referred to as a software bill of materials (SBOM).

As a report from the National Telecommunications and Information Administration (NTIA) notes, without an SBOM, a lack of transparency about the contributors, composition and functionality of software systems contributes significantly to cybersecurity risks and increases the costs of development, procurement and maintenance.

There are already a number of open source and commercial tools made available to meet the need, and well-defined specification standards are being established. Some software vendors and developers have started to integrate SBOM generation and distribution as part of their software delivery pipeline. So what are we talking about in this article?

Today, SBOM generation techniques are largely limited to discovering dependencies that are managed by, and can be queried through, package managers (for example, with pip list or dpkg -l), or those that are explicitly recorded in package manifests, like package-lock.json.
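To make the manifest-based style of discovery concrete, here is a minimal sketch of reading dependency names and versions from a package-lock.json. It assumes the npm lockfile layout (versions 2 and 3) in which a top-level "packages" object is keyed by install path; the function name and the sample data are hypothetical and are not taken from any particular SBOM tool.

```python
import json


def packages_from_lock(lock_text):
    """Map package name -> version from an npm lockfile (v2/v3 layout assumed)."""
    lock = json.loads(lock_text)
    found = {}
    for path, meta in lock.get("packages", {}).items():
        if not path:
            continue  # the "" entry describes the root project itself
        # Keep only the part after the last "node_modules/" so nested installs
        # resolve to the package name (a later duplicate overwrites an earlier one).
        name = path.split("node_modules/")[-1]
        found[name] = meta.get("version")
    return found


# Hypothetical lockfile content, for illustration only.
SAMPLE_LOCK = """
{
  "name": "demo",
  "lockfileVersion": 3,
  "packages": {
    "": {"name": "demo", "version": "0.1.0"},
    "node_modules/left-pad": {"version": "1.3.0"},
    "node_modules/@scope/util": {"version": "2.0.1"}
  }
}
"""

deps = packages_from_lock(SAMPLE_LOCK)
print(deps)
```

Anything that was not installed through the package manager never appears in such a manifest, however carefully it is parsed.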

Developers, on the other hand, are not limited to installing dependencies through package managers. In some cases, the required software dependencies are not available through any package manager. For example, some software is distributed as a precompiled binary that developers can simply wget, while other software is distributed as source code in a tar.gz archive that developers build and install with make && make install.

Laura Luan

Laura is a Software Engineer at IBM Research working on IT resource management and automation. Her current area of interest is cloud security compliance, specifically tools for automated software inventory, application of best practices, and integration into the DevSecOps process.

This is a critical gap in ensuring the completeness of our SBOM. We couldn’t find any reliable open source tool that solves this problem, so we decided to create one: orion.

First of all, with orion we are not trying to duplicate existing SBOM generation tools. Orion therefore does not discover any dependencies managed by package managers. Instead, it discovers the dependencies installed by means other than package managers, which makes it complementary to the existing SBOM generation tools. Additionally, at this time, orion focuses only on build patterns for microservices applications.

That said, now let’s take a closer look at what Orion does.

Orion is available as a CLI that can be run locally or in any CI pipeline, as follows:

For microservices, a software deliverable is typically a container image created through a recipe defined in a Dockerfile. The Dockerfile allows developers to express different models and strategies to build their applications.

Therefore, orion first analyzes the Dockerfile. During this analysis, it looks for commands such as wget, curl, git clone, tar, etc., which indicate the inclusion of third-party dependencies, and creates an intermediate “trace” object.
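The following is a rough sketch of what this kind of Dockerfile analysis could look like, not orion's actual implementation. It assumes a simplified Dockerfile (shell-form RUN instructions, no heredocs or variable substitution), and the command list, the trace fields and the sample Dockerfile are all hypothetical. As a simple heuristic, a `tar ... -C <dir>` in the same RUN instruction is taken as the extraction location for the downloads in that instruction.

```python
import re
import shlex

# Commands treated as hints of a non-package-manager dependency (assumed list).
FETCH_COMMANDS = {"wget", "curl", "git"}
URL_PATTERN = re.compile(r"^(https?|git)://")


def join_continuations(dockerfile_text):
    """Merge lines ending in a backslash into single logical lines."""
    logical, buffer = [], ""
    for raw in dockerfile_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if line.endswith("\\"):
            buffer += line[:-1] + " "
        else:
            logical.append(buffer + line)
            buffer = ""
    if buffer:
        logical.append(buffer.strip())
    return logical


def collect_traces(dockerfile_text):
    """Return one trace dict per download command found in RUN instructions."""
    traces = []
    for line in join_continuations(dockerfile_text):
        instruction, _, rest = line.partition(" ")
        if instruction.upper() != "RUN":
            continue
        found, destination = [], None
        # Split a shell chain such as `a && b; c | d` into individual commands.
        for command in re.split(r"&&|\|\||;|\|", rest):
            try:
                tokens = shlex.split(command)
            except ValueError:
                continue  # unbalanced quotes; skipped in this sketch
            if not tokens:
                continue
            tool = tokens[0]
            if tool == "tar" and "-C" in tokens[:-1]:
                destination = tokens[tokens.index("-C") + 1]
            if tool not in FETCH_COMMANDS:
                continue
            if tool == "git" and tokens[1:2] != ["clone"]:
                continue
            urls = [t for t in tokens[1:] if URL_PATTERN.match(t)]
            if urls:
                found.append({"tool": tool, "url": urls[0]})
        for trace in found:
            trace["destination"] = destination
            traces.append(trace)
    return traces


# Hypothetical Dockerfile, for illustration only.
SAMPLE = """\
FROM ubuntu:22.04
RUN wget https://example.com/tool-1.2.3.tar.gz \\
    && tar -xzf tool-1.2.3.tar.gz -C /opt/tool
RUN git clone https://example.com/lib.git /src/lib
RUN apt-get install -y curl
"""

traces = collect_traces(SAMPLE)
for trace in traces:
    print(trace)
```

Note that the `apt-get install` line produces no trace: package-manager installs are deliberately left to existing SBOM tools.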

The trace basically contains the provenance information for each dependency, for example, the mapping of the download URL to its untar location in the image. At this point we have a record of all software dependencies. But we are still missing one critical detail that the SBOM requires for each of them: a key that uniquely identifies it.

Fig. 1: Sample Dockerfile template

In some cases, as shown in Fig. 1, these dependencies carry their version in the download URL or file name. But this is not a reliable or consistent technique. Therefore, for all dependencies, we hash the contents of their files, and the hash serves as the key.
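As a minimal sketch of a content-derived key, the helper below hashes a file in chunks. SHA-256 is an assumption here, since the article does not say which hash function orion uses, and the function name is hypothetical.

```python
import hashlib
import os
import tempfile


def content_key(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, read in fixed-size chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Usage: hash a small temporary file standing in for a downloaded dependency.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"abc")
key = content_key(tmp.name)
os.remove(tmp.name)
print(key)
```

Because the key depends only on the bytes of the file, two downloads of the same artifact yield the same key regardless of the URL or file name they came from.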

So when should the key be calculated? One option would be to download these pre-built dependencies while computing the trace and compute the key from the resulting file(s).

However, in some cases developers download dependencies using latest or stable version tags, which might resolve differently during the actual build. Therefore, orion requires a reference to the final built image, which it uses to discover the final unique key. Finally, orion outputs the report in SPDX format.
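To illustrate how one discovered dependency might be expressed in SPDX, here is a hedged sketch that builds a single package entry using property names from the SPDX 2.x JSON schema. It is not orion's actual output layout, which the article does not show; the function name, the sample URL and the placeholder checksum are hypothetical.

```python
import json
import re


def spdx_package(name, download_url, sha256_hex):
    """Build one SPDX 2.x style package entry for a downloaded dependency."""
    # SPDX identifiers allow only letters, digits, "." and "-".
    safe = re.sub(r"[^A-Za-z0-9.-]", "-", name)
    return {
        "SPDXID": f"SPDXRef-Package-{safe}",
        "name": name,
        "downloadLocation": download_url,
        "filesAnalyzed": False,
        "checksums": [{"algorithm": "SHA256", "checksumValue": sha256_hex}],
        # License and copyright are not determined in this sketch.
        "licenseConcluded": "NOASSERTION",
        "licenseDeclared": "NOASSERTION",
        "copyrightText": "NOASSERTION",
    }


# Placeholder values, for illustration only.
record = spdx_package("tool", "https://example.com/tool-1.2.3.tar.gz", "0" * 64)
print(json.dumps(record, indent=2))
```

The download URL comes from the Dockerfile trace, while the checksum would be computed from the file as it exists in the final image.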

Currently, another important detail is not yet fully supported: the discovery of licenses for these dependencies. Again, this is because there is no standard discovery technique that covers the different ways these dependencies can be hosted, although we are slowly adding support for a few hosting platforms.

It’s worth mentioning an approach we considered at the start: “Can we just build this SBOM from an image, without the Dockerfile?” It seemed doable, as images are usually built up with a new independent layer for each Dockerfile operation.

But after looking at different build patterns, we realized the limitations of this approach, especially when developers use multi-stage builds or squash all layers of the image to optimize space.

Another drawback of such an approach is that an image layer carries no information about where the software in it came from. That source information can easily be found in the Dockerfile. So we decided to take a two-pronged approach, with Dockerfile analysis for collecting traces and image analysis for mapping each trace to its final artifact ID.

Our mission with this project is to enable complete and accurate SBOM generation for microservices applications. Orion, in its current state, is just our first step in this direction. We therefore welcome feedback and comments from everyone to help make the way our software is built more transparent and accountable.

Photo by Thirdman from Pexels.


