June 26th | Open Source Gardening | Live Stream

:wave: Hello again everyone!

We’re back with the Anchore Open Source team running a live stream to discuss issues, pull requests, and future roadmap planning in our SBOM and vulnerability tools.

:alarm_clock: Starts at 2025-06-26T19:00:00Z for about an hour.

Expect engineering and project management discussions, plus a bit of GitHub issue gardening on Syft, Grype, and the rest of the family.

Join us today for a relaxed, educational, and productive live stream.

Topics

Here’s a summary of the topics the Anchore Open Source team discussed during our Open Source Gardening live stream on June 26th. We covered four issues, touching on SPDX location data, OS detection, resource cleanup, and dependency scan depth.

Grype #2738: package location is not available in the json result

The team kicked off with a complex, multi-faceted issue regarding missing package location information in Grype’s JSON output when scanning an SPDX SBOM.

The discussion highlighted a fundamental challenge: SPDX can be a “lossy” format. When Syft creates an SPDX SBOM, it includes rich location data (evidence of where a package was found). However, it encodes some of this information in a comment field with the text “evident by” because there isn’t a perfect field for it in the SPDX specification. When Grype consumes an SBOM, it doesn’t know how to parse this custom comment back into the structured locations field, leading to the information being lost in translation.

The conversation then explored whether this parsing logic should be added to Grype or if it should be handled upstream in Syft. The team debated the semantic meaning of a “location”:

  • In Syft, it typically means “evidence for this package was found at this path” (e.g., a package manager database file).
  • In Grype, it’s useful to show the user all relevant file paths, including the actual vulnerable binary, not just the package database.

Outcome:
The team concluded that the best approach is to enhance Syft’s SPDX parser to intelligently find and populate the locations field from common relationship types found in SPDX SBOMs generated by other tools. This would make Syft’s representation of third-party SBOMs more complete, which would then naturally be inherited by Grype. Alex Goodman will write a summary of the proposed solution on the GitHub issue.
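As a rough illustration of the idea (not Syft's actual parser), the sketch below walks the relationships in an SPDX 2.x JSON document and collects file paths per package. The field names (packages, files, relationships, spdxElementId, relatedSpdxElement, relationshipType, comment) come from the SPDX JSON format. The choice of CONTAINS as a "common relationship type", the "evident" marker match, and the sample data are all assumptions made for the example.

```python
from collections import defaultdict

# Relationship types treated as "this package is backed by this file".
# CONTAINS is a standard SPDX 2.x type; using only this one is an assumption.
FILE_RELATIONSHIP_TYPES = {"CONTAINS"}

# Assumed marker based on the comment text mentioned in the discussion.
# The exact wording Syft writes may differ.
EVIDENCE_MARKER = "evident"


def recover_locations(spdx_doc):
    """Map each package SPDXID to the file paths its relationships point at."""
    file_names = {f["SPDXID"]: f["fileName"] for f in spdx_doc.get("files", [])}
    package_ids = {p["SPDXID"] for p in spdx_doc.get("packages", [])}

    locations = defaultdict(list)
    for rel in spdx_doc.get("relationships", []):
        source = rel.get("spdxElementId")
        target = rel.get("relatedSpdxElement")
        # Only package -> file relationships can yield a location.
        if source not in package_ids or target not in file_names:
            continue
        rel_type = rel.get("relationshipType", "")
        comment = rel.get("comment", "").lower()
        is_evidence = rel_type == "OTHER" and EVIDENCE_MARKER in comment
        if rel_type in FILE_RELATIONSHIP_TYPES or is_evidence:
            path = file_names[target]
            if path not in locations[source]:
                locations[source].append(path)
    return dict(locations)


# Made-up sample document; IDs, paths, and comment text are illustrative only.
SAMPLE = {
    "packages": [{"SPDXID": "SPDXRef-Package-zlib", "name": "zlib"}],
    "files": [
        {"SPDXID": "SPDXRef-File-apk-db", "fileName": "/lib/apk/db/installed"},
        {"SPDXID": "SPDXRef-File-libz", "fileName": "/lib/libz.so.1"},
    ],
    "relationships": [
        {
            "spdxElementId": "SPDXRef-Package-zlib",
            "relatedSpdxElement": "SPDXRef-File-apk-db",
            "relationshipType": "OTHER",
            "comment": "evident by: package database entry",
        },
        {
            "spdxElementId": "SPDXRef-Package-zlib",
            "relatedSpdxElement": "SPDXRef-File-libz",
            "relationshipType": "CONTAINS",
        },
    ],
}

print(recover_locations(SAMPLE))
```

In this sample both the package database and the shared library end up as locations for the package, which covers the two meanings of "location" the team debated.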

Syft #3955: Overriding / skipping OS detection

Next, the team discussed a feature request to allow users to override or skip OS detection in Syft. This is useful in cases where a user is scanning a directory that doesn’t contain the necessary OS release files, but they know what the OS is and want to provide that information manually.

This issue brought up a deeper architectural point about how Syft is designed. Currently, OS detection runs as a separate initial task, but this information isn’t efficiently passed to subsequent “catalogers”. As a workaround, individual catalogers that need OS information often re-run the detection themselves. While the detection is fast, this is not an ideal design.

The user’s proposed solution involved a breaking change to a core interface, which the team wanted to avoid.

Outcome:
The team agreed that the user’s goal is valid. The plan is to explore adding a new flag (e.g., --distro) to allow users to specify the distribution, making the feature opt-in and avoiding any breaking changes. The issue has been moved to the “ready” column for a team member to pick up.
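To make the opt-in behavior concrete, here is a minimal sketch of "use the override if given, otherwise detect". It is not Syft code: the `name@version` override format, the `Distro` type, and all function names are hypothetical. Only the os-release keys (ID, VERSION_ID) are taken from the standard os-release file format.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Distro:
    name: str
    version: str


def parse_os_release(text: str) -> Optional[Distro]:
    """Read ID and VERSION_ID from os-release style KEY=value lines."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Values may be wrapped in single or double quotes.
        fields[key.strip()] = value.strip().strip("\"'")
    if "ID" not in fields:
        return None
    return Distro(fields["ID"], fields.get("VERSION_ID", ""))


def parse_override(value: str) -> Distro:
    """Parse a hypothetical 'name@version' override string."""
    name, _, version = value.partition("@")
    if not name:
        raise ValueError(f"invalid distro override: {value!r}")
    return Distro(name, version)


def resolve_distro(override: Optional[str], os_release_text: Optional[str]) -> Optional[Distro]:
    """Prefer the user's override; fall back to detection only when none is given."""
    if override:
        return parse_override(override)
    if os_release_text is not None:
        return parse_os_release(os_release_text)
    return None


# Directory scan with no os-release file, but the user knows the distro.
print(resolve_distro("debian@12", None))
# No override: detection runs as before.
print(resolve_distro(None, 'ID=alpine\nVERSION_ID="3.20.0"\n'))
```

Because the override is only consulted when the user supplies it, existing behavior stays unchanged. Resolving the distro once and handing the result to each cataloger would also avoid the repeated detection described above.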

Syft #3985: Clean up downloaded image from daemons

This discussion centered on a pull request for stereoscope (the library Syft uses for container image analysis) to automatically clean up container images that it pulls to a local Docker daemon. Currently, if Syft needs to pull an image to scan it, that image remains in the local cache, requiring manual cleanup.

The main question was whether this cleanup should be the default behavior or opt-in.

  • Will Murphy made a strong case that deleting resources without explicit user consent is surprising and violates the principle of least astonishment. He drew a parallel to the docker run command, which requires an --rm flag to clean up the container after it exits.
  • Dan Nurmi added that cleanly reversing a docker pull is complicated due to shared image layers, and a naive deletion could unintentionally break other images.
  • The team also acknowledged that the Docker daemon can be a shared, multi-tenant resource, and automatically deleting an image could interfere with other processes.

Sidebar: Dan also pointed out that users who want this ephemeral behavior might be better served by using Syft’s registry: source (e.g., syft registry:alpine:latest). This method pulls image data directly from the registry without ever involving a local Docker daemon and cleans up after itself automatically.

Outcome:
There was a unanimous decision that the cleanup behavior must be opt-in. The team settled on adding a flag, similar to docker run --rm, to enable this. The feature will also be designed to only remove images that Syft itself pulled.
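The two rules in this outcome (opt-in, and only remove what was pulled for the scan) can be sketched as follows. This is an illustration under stated assumptions, not stereoscope's implementation: `FakeDaemon` is an in-memory stand-in for a real daemon, and `image_for_scan` and its `cleanup` parameter are hypothetical names. The shared-layer problem Dan raised is not modeled here.

```python
from contextlib import contextmanager


class FakeDaemon:
    """In-memory stand-in for a container daemon's image store (illustration only)."""

    def __init__(self, images=()):
        self.images = set(images)

    def has(self, ref):
        return ref in self.images

    def pull(self, ref):
        self.images.add(ref)

    def remove(self, ref):
        self.images.discard(ref)


@contextmanager
def image_for_scan(daemon, ref, cleanup=False):
    """Make `ref` available for a scan.

    The image is removed afterwards only if the caller opted in AND this
    call performed the pull. Images that were already present are left alone.
    """
    pulled_here = not daemon.has(ref)
    if pulled_here:
        daemon.pull(ref)
    try:
        yield ref
    finally:
        if cleanup and pulled_here:
            daemon.remove(ref)


daemon = FakeDaemon({"alpine:latest"})

# Pre-existing image: never removed, even with cleanup enabled.
with image_for_scan(daemon, "alpine:latest", cleanup=True):
    pass

# Freshly pulled image with cleanup enabled: removed after the scan.
with image_for_scan(daemon, "debian:12", cleanup=True):
    pass

# Freshly pulled image without the opt-in: stays in the cache, as today.
with image_for_scan(daemon, "busybox:latest"):
    pass

print(sorted(daemon.images))
```

Recording whether the image was present before the pull is what keeps the cleanup from touching images other users or processes on a shared daemon rely on.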

Syft #3968: CLI Option to Control Scan Depth

The final topic was a feature request to control the “depth” of dependency scanning, specifically to allow users to generate an SBOM with only direct dependencies and exclude transitive ones. The user cited compliance use cases as the motivation.

This sparked a fascinating discussion on the nuances of dependency relationships.

  • The team questioned the compliance angle, noting that transitive dependencies are still part of the final deployed artifact and therefore still represent a security responsibility.
  • A more practical reason for this feature is to reduce the size of the generated SBOM. Alex Goodman compared it to the existing options for excluding file metadata, which some users disable to keep their SBOMs lean.
  • Dan highlighted the core challenge: the concept of “direct” vs. “transitive” is only clear when there is a single, well-defined root application. In a container image, there are often multiple “roots” (e.g., the OS packages, a Python application, and a Node.js utility could all be considered separate dependency trees), making it ambiguous what a “direct” dependency is.
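The ambiguity in the last point can be shown with a toy dependency graph. Everything below is hypothetical: the edges, the package names, and the idea of modeling the OS as a single root node are made up for illustration and do not reflect how Syft represents relationships.

```python
# Hypothetical dependency edges for one container image: parent -> children.
EDGES = {
    "os": ["openssl", "zlib"],
    "python-app": ["requests"],
    "requests": ["urllib3"],
    "node-util": ["lodash"],
    "openssl": ["zlib"],
}


def find_roots(edges):
    """Roots are nodes that nothing else depends on."""
    children = {child for deps in edges.values() for child in deps}
    return sorted(node for node in edges if node not in children)


def direct_dependencies(edges, roots):
    """Direct dependencies are the immediate children of the chosen roots."""
    return sorted({child for root in roots for child in edges.get(root, [])})


roots = find_roots(EDGES)
print(roots)

# With a single, well-defined root the answer is unambiguous.
print(direct_dependencies(EDGES, ["python-app"]))

# With every root included, the "direct" set depends on which roots count.
print(direct_dependencies(EDGES, roots))
```

Note that zlib is a direct dependency of one root and a transitive dependency through openssl at the same time, so a "direct only" filter has to decide which view wins.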

Outcome:
The team concluded that they need more clarification from the user to fully understand the use case. The request is more complex than it first appears. Alex plans to follow up on the issue to ask for concrete examples of the desired output for different scan targets (like a container image vs. a JAR file) to better understand the user’s goal.