September 25th | Open Source Gardening | Live Stream

:wave: Hello everyone!

We’re back with the Anchore Open Source team running a live stream to discuss issues, pull requests, and roadmap planning for our SBOM and vulnerability tools.

:alarm_clock: Starts at 2025-09-25T19:00:00Z for about an hour.

Expect engineering and project management discussions, plus a bit of GitHub issue gardening on Syft, Grype, and the rest of the family.

Join us today for a relaxed, educational, and productive live stream.

Topics

Hello everyone,

Here’s a summary of the topics discussed during our Open Source Gardening stream on September 25th. The team included Dan, Alan, Keith, Chris, and Will. We went through several issues marked with the needs-discussion label.

Syft #2155: Command output to give more information on what catalogers look for

The stream started with a discussion about improving Syft’s output to help users understand what each cataloger is looking for and what it can find. This would help users troubleshoot why a specific package might not be appearing in their SBOM.

  • The Problem: Users are often unsure which files Syft’s catalogers are inspecting to identify software packages. For example, they might know MySQL is in their container, but if the binary is named with a capital ‘M’ (MySQL), Syft’s binary cataloger might miss it.
  • Discussion:
    • Keith explained that most catalogers work by searching for files that match specific glob patterns (e.g., package.json, requirements.txt).
    • However, some catalogers are more complex. The binary cataloger, for example, uses both file path globs and regular expressions to find version strings inside the files. Other catalogers, like the Go binary cataloger, identify files based on their MIME type rather than a filename pattern.
    • The team considered adding a new command like syft catalogers list <cataloger-name> to display this information. However, the information for different catalogers (e.g., the RPM cataloger vs. the binary cataloger) is structured so differently that creating a consistent and user-friendly output would be challenging.
  • Outcome: Will suggested an alternative approach: enhancing Syft’s logging. At a debug or trace log level, Syft could output messages stating what each cataloger is looking for (e.g., “Java cataloger considering files matching glob *.jar”). This would provide the necessary information for troubleshooting without changing the user interface or API. The team agreed this was a promising path. Keith will add a comment to the issue summarizing this proposal.
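To make the logging proposal concrete, here is a minimal Python sketch of the idea (Syft itself is written in Go, and this is not its code). The cataloger names, glob patterns, and the select_files helper are all hypothetical, chosen only to illustrate glob-based file selection with a debug message per pattern. Matching is case-sensitive, which reproduces the MySQL example above.

```python
import fnmatch
import logging

logger = logging.getLogger("cataloger-sketch")

# Hypothetical cataloger names and glob patterns, for illustration only.
CATALOGER_GLOBS = {
    "java": ["*.jar", "*.war"],
    "javascript": ["package.json"],
    "python": ["requirements.txt"],
    "binary": ["mysql"],
}


def select_files(paths):
    """Return {cataloger: [matching paths]}, logging every glob considered."""
    selected = {}
    for cataloger, globs in CATALOGER_GLOBS.items():
        for pattern in globs:
            # This is the kind of message the proposal would emit at debug level.
            logger.debug("%s cataloger considering files matching glob %s",
                         cataloger, pattern)
        selected[cataloger] = [
            path for path in paths
            # Match on the file name only; fnmatchcase is case-sensitive.
            if any(fnmatch.fnmatchcase(path.rsplit("/", 1)[-1], g) for g in globs)
        ]
    return selected


if __name__ == "__main__":
    logging.basicConfig(level=logging.DEBUG)
    print(select_files(["/app/lib/joda-time.jar", "/usr/bin/MySQL"]))
```

With debug logging enabled, a user could see that the binary pattern is the lowercase mysql and understand why /usr/bin/MySQL was skipped.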

Syft #2095: Support Built-Using field for deb packages

Next, the team revisited an issue concerning the Built-Using field in Debian packages. This field lists other packages that were used to build the main package, which is particularly relevant for statically linked binaries (like those from Go or Rust) where the code from these build-time dependencies is included in the final product.

  • The Goal: The idea is to represent these Built-Using entries as distinct packages in the SBOM, as they represent software that is present in the container.
  • Discussion:
    • The team confirmed that the entries in the Built-Using field often correspond to other package names within the Debian/Ubuntu repositories. Chris verified this during the stream by running apt-cache show, and Alan found a corresponding golang-go.crypto package in the Ubuntu archives.
    • A key challenge is handling potential duplication. For example, if a Debian package contains a Go binary, the Go binary cataloger might find all the Go modules, and the Debian cataloger might find similar information via the Built-Using field. The team wants to avoid creating confusing or duplicate entries.
    • It’s also crucial to create the correct Package URL (PURL) for these new packages. This ensures that when Grype scans the SBOM, it matches vulnerabilities from the correct data source (e.g., the Debian security advisories, not the Rust crates.io advisory database).
  • Outcome: The team agreed that more research is needed to understand the exact nature of the Built-Using field and how to best model it in the SBOM. The issue’s label was changed from needs-discussion to needs-investigation.
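As a rough illustration of the PURL point, the sketch below parses a Built-Using value and builds a deb-type PURL for each entry. It assumes the "name (= version)" entry syntax described in Debian Policy. The function names, the default "debian" namespace, and the sample versions are made up for this example, and the percent-encoding is simplified. It does not reflect how Syft will model the field, which is still under investigation.

```python
import re
from urllib.parse import quote

# One Built-Using entry looks like: "source-name (= version)"
_ENTRY = re.compile(
    r"^\s*(?P<name>[a-z0-9][a-z0-9+.\-]+)\s*\(\s*=\s*(?P<version>[^)\s]+)\s*\)\s*$"
)


def parse_built_using(value):
    """Split a Built-Using field value into (name, version) tuples."""
    entries = []
    for raw in value.split(","):
        if not raw.strip():
            continue
        match = _ENTRY.match(raw)
        if match is None:
            raise ValueError(f"unrecognized Built-Using entry: {raw!r}")
        entries.append((match["name"], match["version"]))
    return entries


def deb_purl(name, version, namespace="debian"):
    """Build a deb-type PURL so a scanner matches against distro advisories."""
    return f"pkg:deb/{namespace}/{quote(name, safe='')}@{quote(version, safe='')}"


if __name__ == "__main__":
    # Sample value with made-up versions.
    field = "golang-go.crypto (= 1.2.3-1), golang-1.19 (= 1.19.8-2)"
    for name, version in parse_built_using(field):
        print(deb_purl(name, version))
```

The important design choice is the pkg:deb type: it ties the entry to the distribution's package and advisory data rather than to a language ecosystem such as Go modules or Rust crates.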

Syft #3071: Dependency graph of sboms generated with syft is incomplete due to missing root node

This discussion focused on an issue where SBOMs generated by Syft can be a “forest” of dependency trees rather than a single tree with one root. This causes problems for downstream tools like Dependency-Track, which expect a single root component to render the dependency graph correctly.

  • The Problem: When Syft scans a container image, it finds many top-level packages (e.g., RPMs, Python packages) that don’t have a parent-child relationship with each other. This lack of a single root node makes it difficult for some tools to process the SBOM.
  • Discussion:
    • The team agreed that the most logical “root” for an image scan is the image itself. Will suggested that the image should be represented as a component in the SBOM, and all packages found within it should be its direct dependencies.
    • Keith noted that this could be modeled in CycloneDX by making the scanned source (the image) a top-level component and nesting all found packages underneath it. This effectively represents a CONTAINS relationship.
    • Chris cautioned that this is an “over-approximation” of a dependency relationship. For instance, an application in an image doesn’t truly “depend on” bash, even though bash is contained within the image. However, given the limitations of the CycloneDX format (which primarily models dependencies), this seems to be the most practical approach.
  • Outcome: To make an informed decision, the team needs to better understand the user’s experience within Dependency-Track. Keith volunteered to set up an instance of Dependency-Track, import a Syft-generated SBOM, and analyze the results. This will help determine the best way to structure the output. The issue was moved to needs-investigation.
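One possible shape for the "image as root" idea, sketched in Python as a plain dictionary in CycloneDX JSON layout: the image becomes metadata.component, and a single dependencies entry points from it to every package found. The with_image_root helper, the bom-ref scheme, and the sample packages are assumptions for illustration. The team has not settled on this structure, and Keith's nesting variant would look different.

```python
import json


def with_image_root(image_name, image_version, packages):
    """Build a minimal CycloneDX-style document whose single root is the image."""
    root_ref = f"image:{image_name}@{image_version}"  # hypothetical bom-ref scheme
    components = [
        {
            "type": "library",
            "bom-ref": pkg["purl"],
            "name": pkg["name"],
            "version": pkg["version"],
            "purl": pkg["purl"],
        }
        for pkg in packages
    ]
    return {
        "bomFormat": "CycloneDX",
        "specVersion": "1.5",
        "version": 1,
        "metadata": {
            "component": {
                "type": "container",
                "bom-ref": root_ref,
                "name": image_name,
                "version": image_version,
            }
        },
        "components": components,
        # The over-approximation discussed above: the image "depends on"
        # everything it contains, which gives the graph exactly one root.
        "dependencies": [
            {"ref": root_ref, "dependsOn": [c["bom-ref"] for c in components]}
        ],
    }


if __name__ == "__main__":
    sample = [
        {"name": "bash", "version": "5.2.15", "purl": "pkg:deb/debian/bash@5.2.15"},
        {"name": "requests", "version": "2.31.0", "purl": "pkg:pypi/requests@2.31.0"},
    ]
    print(json.dumps(with_image_root("example-image", "1.0", sample), indent=2))
```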

Syft #4113: debug docker images are running as non-root user

The team discussed an inconsistency in the user configuration of the official Syft and Grype Docker images.

  • The Problem: For security reasons, the default user for the Docker images was changed from root to a non-root user. This change broke common CI/CD workflows that rely on mounting the Docker socket or sharing cache directories, which require root permissions. While the :latest tags were reverted to using root, the :debug tags were left as non-root. This is problematic because the :debug images (which include a shell) are specifically for CI environments (like GitLab CI) where users often cannot change the running user and need root.
  • Discussion: The team acknowledged the user frustration caused by the back-and-forth changes. The goal is to provide a consistent and predictable experience.
  • Outcome: A clear path forward was decided:
    • The standard :latest and :debug tags will both run as root for maximum compatibility with CI systems.
    • New, explicit tags (:nonroot and :debug-nonroot) will be created for users who prefer or require a non-root execution environment.
    • Chris took on the task of implementing this change, and the issue was moved to the ready state.

Syft #4187: add java packages to java archive metadata

The final discussion was about a feature request to add more granular information about the contents of Java Archive (JAR) files to the SBOM.

  • The Goal: The request is to list the Java namespaces (e.g., org.joda.time) found inside a JAR file as part of the package’s metadata. The user wants this information to correlate output from performance profilers (which might flag a specific Java class) back to the JAR file that contains it in the Syft-generated SBOM.
  • Discussion:
    • The team quickly clarified a key point of confusion: the request is about Java’s internal “packages” (namespaces), not nested software “Packages” (like JARs within JARs).
    • A major concern was the potential for this to significantly increase the size of the SBOM, as a single JAR can contain hundreds of Java namespaces.
    • Chris drew a parallel to Go, where the team made a deliberate decision not to list all of Go’s internal package imports for similar reasons (SBOM bloat and potential user confusion). They decided that the Go Module is the correct level of granularity for a software package. The team questioned why they should treat Java differently.
    • Keith suggested a more generic solution: instead of adding Java-specific metadata, perhaps Syft could be enhanced to optionally list the entire file tree inside any archive (JAR, ZIP, etc.). This would provide the requested information in a more versatile way.
  • Outcome: The team felt that more input was needed, particularly from Alex Goodman, who has deep expertise in the Java catalogers. The issue will remain under needs-discussion and will be revisited in a future session.
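For readers unfamiliar with what the request would collect, here is a small Python sketch that derives Java package names (namespaces) from the .class entries of a JAR, using only the standard zipfile module. The java_packages_in_jar function is hypothetical and is not part of Syft. Skipping everything under META-INF/ is a simplification of this sketch.

```python
import io
import zipfile


def java_packages_in_jar(jar):
    """Return the sorted Java package names that contain .class files.

    `jar` may be a path or a binary file-like object.
    """
    packages = set()
    with zipfile.ZipFile(jar) as archive:
        for name in archive.namelist():
            # Only compiled classes count; skip META-INF (a simplification).
            if not name.endswith(".class") or name.startswith("META-INF/"):
                continue
            directory, _, _ = name.rpartition("/")
            if directory:
                packages.add(directory.replace("/", "."))
    return sorted(packages)


if __name__ == "__main__":
    # Build a tiny in-memory archive with made-up entries.
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("META-INF/MANIFEST.MF", "Manifest-Version: 1.0\n")
        archive.writestr("org/joda/time/DateTime.class", b"")
        archive.writestr("org/joda/time/format/ISODateTimeFormat.class", b"")
    buffer.seek(0)
    print(java_packages_in_jar(buffer))
```

Even this toy archive yields two namespaces for a single JAR, which hints at how quickly such metadata could grow for real libraries.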