July 31st | Open Source Gardening | Live Stream

This is a technical summary of a live stream where we discussed improvements to the open-source tools Syft and Grype. The session focused on triaging and investigating user-submitted issues from GitHub.

Announcements: URL and Installer Improvements

The session began with a discussion of recent improvements to the installation process for Syft and Grype.

  • New CDN for Install Scripts (2:12): The team addressed reliability issues with the installer script URLs. Previously, the scripts were hosted on raw.githubusercontent.com, which sometimes resulted in download failures. To resolve this, the installer scripts are now hosted on a more reliable, shorter URL fronted by Anchore’s own Content Delivery Network (CDN), using Cloudflare R2 for storage.

  • Sudo Command Addition (3:15): The installation command provided in the project’s README now includes sudo. This change was made because installing to /usr/local/bin typically requires elevated privileges, and adding sudo makes the copy-and-paste instructions work correctly for most users out of the box.


Issue Triage: Python Circular Dependencies (Syft #4105)

The team’s first deep dive was into a user-reported issue regarding a circular dependency in a Python project using Poetry.

  • Problem Description (6:05): A user opened Syft Issue #4105, reporting that when scanning a project with a poetry.lock file, Syft generates an SBOM where a package (build) is listed as a dependency of itself. This creates a circular reference that can cause problems for downstream tooling that consumes the SBOM. An NTIA validator, for instance, flags this as a warning.

  • Investigation and Root Cause (9:06):

    • Upon inspecting the user’s poetry.lock file, the team identified the source of the self-reference.

    • The build package defines a package.extras section for handling optional dependencies. Within this section, the test extra explicitly lists the build package itself as a dependency.

    • The team concluded that Syft was correctly interpreting the poetry.lock file. The file does, in fact, define a self-referential dependency for a specific context (testing).

  • Discussion and Proposed Solution (14:30):

    • The core issue is a loss of nuance. The SBOM graph correctly shows that build depends on build, but it loses the context that this dependency only exists for the “test” extra.

    • Currently, Syft does not have a global configuration to exclude test or development dependencies, so it includes them by default.

    • Short-term: Chris planned to respond to the user on GitHub, explaining the findings and asking for more details about the specific problems the circular dependency causes in their workflow.

    • Long-term: The team discussed enhancing Syft’s internal data model to capture this extra context (e.g., the dependency type or scope) on the relationship “edge” itself. This would allow for a more accurate representation in Syft’s native format and potentially in other SBOM formats like SPDX, which has mechanisms for defining dependency scope (e.g., TEST_DEPENDENCY_OF).
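
To make the "context on the edge" idea concrete, here is a minimal Python sketch. It is not Syft's data model or code. The LOCK_PACKAGES dictionary is a hypothetical, simplified stand-in for a parsed poetry.lock entry, and the Edge class and helper names are invented for illustration. The sketch builds dependency edges that remember which extra introduced them, then reports self-references together with that context.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Hypothetical, simplified stand-in for one parsed [[package]] entry of a
# poetry.lock file. A real lock file carries many more fields.
LOCK_PACKAGES = [
    {
        "name": "build",
        "dependencies": {"packaging": ">=19.1"},
        "extras": {"test": ["build[uv,virtualenv]", "pytest (>=6.2.4)"]},
    },
]


@dataclass(frozen=True)
class Edge:
    """A dependency relationship that keeps its scope (illustrative only)."""
    source: str
    target: str
    extra: Optional[str]  # None means the dependency is unconditional


_NAME = re.compile(r"^[A-Za-z0-9][A-Za-z0-9._-]*")


def requirement_name(spec: str) -> str:
    """Extract and normalize the package name from a requirement string."""
    match = _NAME.match(spec.strip())
    if match is None:
        raise ValueError(f"cannot parse requirement: {spec!r}")
    # Treat runs of '-', '_' and '.' as equivalent and compare case-insensitively.
    return re.sub(r"[-_.]+", "-", match.group(0)).lower()


def edges_for(pkg: dict) -> list:
    """Build edges for one package, tagging extras-only edges with the extra."""
    source = requirement_name(pkg["name"])
    edges = [Edge(source, requirement_name(dep), None)
             for dep in pkg.get("dependencies", {})]
    for extra, specs in pkg.get("extras", {}).items():
        edges.extend(Edge(source, requirement_name(s), extra) for s in specs)
    return edges


edges = [edge for pkg in LOCK_PACKAGES for edge in edges_for(pkg)]
self_loops = [edge for edge in edges if edge.source == edge.target]

for edge in self_loops:
    print(f"{edge.source} depends on itself (extra: {edge.extra})")
```

With the scope stored on the edge, a consumer could choose to drop or relabel extras-only self-references instead of treating them as unconditional cycles.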


Issue Triage: Missing OS License (Syft #4099)

Next, the team addressed an issue regarding missing license information for the base operating system in a container image.

  • Problem Description (39:05): A user filed Syft Issue #4099, pointing out that when scanning a Google Distroless container, the resulting SBOM component that represents the OS (distroless-debian) has a null license field. The user suggested it should be Apache 2.0, as that is the license for the Distroless project itself.

  • Investigation and Discussion (40:02):

    • The team clarified that Syft creates a “synthetic” package to represent the OS during SBOM generation; it is not treated like a standard package found by the catalogers. The license detection logic, therefore, isn’t automatically applied to it.

    • A key challenge discussed is that an operating system distribution does not have a single license. It is an aggregation of thousands of components, each with its own license (e.g., the Linux kernel is GPL, systemd is LGPL, etc.). The license of the “recipe” (like the Distroless GitHub project) is not the same as the license of the final artifact it produces.

  • Resolution/Next Steps (48:45):

    • The team concluded that the user’s primary problem is likely that their tooling requires every component in an SBOM to have a non-null license field.

    • They decided the best course of action is to reply to the user to clarify their exact expectation. Is the request to find a single, definitive license for the OS, or is the underlying issue that Syft should be cataloging all the individual packages within the OS distribution, each with its respective license?
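
The kind of check the user's tooling presumably performs can be sketched in a few lines of Python. The SBOM shape below (a "components" list whose entries have "name" and "licenses" keys) is a simplified, hypothetical structure chosen for illustration, not the exact output of Syft or any specific SBOM format, and the package names and license values are sample data.

```python
import json

# Hypothetical, simplified SBOM document used only for illustration.
SBOM_JSON = """
{
  "components": [
    {"name": "distroless-debian", "type": "operating-system", "licenses": null},
    {"name": "base-files", "type": "library",
     "licenses": [{"id": "GPL-2.0-or-later"}]}
  ]
}
"""


def components_missing_license(sbom: dict) -> list:
    """Return names of components whose license list is null, absent, or empty."""
    return [
        component["name"]
        for component in sbom.get("components", [])
        if not component.get("licenses")
    ]


missing = components_missing_license(json.loads(SBOM_JSON))
print("components without a license:", missing)
```

A check like this fails on the synthetic OS entry even when every individually cataloged package carries a license, which is why the team wanted to confirm what the user actually expects.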