May 8th | Open Source Gardening | Live Stream

:wave: Hello everyone!

We’re back with the Anchore Open Source team running a live stream to discuss issues, pull requests, and future roadmap planning in our SBOM and vulnerability tools.

:alarm_clock: Starts at 2025-05-08T19:00:00Z for about an hour.

Expect engineering and project management discussions, plus a bit of GitHub issue gardening on Syft, Grype, and the rest of the family.

Join us today for a relaxed, educational, and productive live stream.

Topics

Hey everyone!

Here’s a quick rundown of what the Anchore Open Source Team discussed during our live “Open Source Gardening” session on May 8th. We went through a few issues and pull requests for Syft and Grype.


Consider using AI tools (e.g. CodeRabbit) for PR reviews

  • Discussion: Alan Pope (popey) brought this issue up for wider discussion because not everyone was present on the previous week’s stream, when it was first touched upon. The issue, filed by a community member, suggests using an AI tool like CodeRabbit for first-pass reviews of pull requests. The idea isn’t to replace human review but to augment it.
  • Team Insights & Concerns:
    • Chris Phillips (spiffcs) had previously voiced concerns that relying on an AI tool might lead to contributors being less conscientious, knowing an LLM would pick things up. There’s also the aspect of LLMs potentially being used to create the code in the first place, leading to LLMs reviewing LLM-generated code.
    • Alex Goodman (wagoodman) had volunteered to research integrating CodeRabbit with a toy repository to test its capabilities. This was also tied to another task of potentially swapping out Dependabot for Renovate Bot, both of which would benefit from testing in a non-production environment first. This discussion about testing in a toy repo happened during an internal team standup, not on the previous live stream.
    • Alan mentioned he might try CodeRabbit (which is free for open source projects) on one of his personal “nonsense” open source projects to see how it works.
  • Outcome: Alex Goodman (wagoodman) will add a comment to the GitHub issue summarizing the plan to test CodeRabbit in a toy repository first. This will allow the team to evaluate its usefulness before considering it for core Anchore projects.

False positive for PHP pecl extension redis

  • Discussion: Will Murphy (willmurphyscode) brought this older Grype false positive up for discussion.
  • Technical Deep Dive - CPEs and Matching:
    • Will explained the core problem lies with how NVD (National Vulnerability Database) uses CPEs (Common Platform Enumeration) for naming software. CPEs often don’t specify the language ecosystem or package format, dating back to a time when software was more monolithic (e.g., “Microsoft Windows” or “Sun Solaris”).
    • The issue arises when a product name is generic. For example, “Docker by Docker Inc.” might have a CPE like cpe:a:docker:docker. However, a Python client for Docker, also often named “docker” and potentially attributed to “Docker” as the vendor (if the metadata is pulled that way by Syft), could incorrectly get matched with vulnerabilities for the Docker runtime itself.
    • This specific issue is about a PHP pecl extension for Redis. Syft identifies it as a package named “redis”, and the package ends up with a generated CPE like cpe:a:redis:redis (vendor: redis, product: redis). Grype then matches that CPE against vulnerabilities for Redis the server, not the PHP extension, leading to false positives.
    • Grype developers prefer to match against GitHub Security Advisories (GHSAs) for language ecosystems (npm, RubyGems, etc.) because GHSAs are more specific and avoid this CPE ambiguity. However, GHSAs do not cover PHP pecl native extensions.
    • Therefore, for these PHP extensions, Grype falls back to its least preferred matching method: CPE matching against NVD, which leads to this class of false positives.
  • Proposed Solution/Discussion Points:
    • Will proposed considering special casing for extremely well-known CPEs like cpe:a:mysql:mysql or cpe:a:redis:redis. The idea is that if NVD lists such a generic CPE (vendor and product are the same, all other fields are wildcards), it’s almost certainly referring to the main server product, not one of many client libraries in various languages.
    • The question is whether Grype should assume that a non-binary package (like a PHP extension) should not match against these “canonical” server CPEs.
    • Alex Goodman (wagoodman) pointed out that Syft does have a mechanism (a hardcoded list) for excluding certain language ecosystem packages from getting overly generic CPEs. For example, there’s a rule: “do not make a Python package called redis with the CPE redis:redis”. This is because pip install redis is common, and these false positives were frequent.
    • It was realized that this existing exclusion mechanism in Syft (found in syft/syft/pkg/cataloger/internal/cpegenerate/candidate_by_package_type.go) is likely the correct place to fix this. The team just needs to add an entry for PHP pecl packages, specifically to prevent php-pecl packages named redis from being assigned the generic redis:redis CPE.
  • Side-Bar on Syft’s CPE Generation & Future Ideas:
    • The team discussed how Syft sometimes has to “guess” CPEs when metadata is poor.
    • There’s an ongoing idea to have Syft use a database at runtime for more accurate identification, perhaps a key-value lookup of JAR digests to actual Maven records, to reduce guessing.
    • A similar idea could be a list of known problematic/overly general CPEs that Syft should handle more intelligently.
    • Alex also mentioned the challenge of CPEs where the target software field in NVD is a wildcard (*). If Syft generates a CPE with a specific target software (e.g., php), Grype currently still matches it if the NVD record has a wildcard there. There was a debate during GrypeDB v6 development about whether a more specific query (from Syft) should narrow results even if the database record (NVD) is broader. The current behavior is that it does match.
  • Outcome:
    • The team concluded that the existing CPE exclusion mechanism in Syft is the right place to address this specific false positive for the PHP Redis extension.
    • Will Murphy is to investigate if adding an exclusion rule (similar to the Python one for Redis) in Syft for PHP pecl packages named ‘redis’ resolves the issue.
    • The issue will likely be marked as a “good first issue” once the path forward is confirmed, as it involves adding an entry to an existing list.
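
The exclusion idea discussed above can be sketched in Go, the language Syft is written in. This is a minimal illustration and not Syft’s actual code: the type names, the `_project` vendor guess, and the `genericExclusions` table are all hypothetical stand-ins, and the real list lives in the file the team identified.

```go
package main

import (
	"fmt"
	"strings"
)

// pkgType is a hypothetical stand-in for Syft's package type identifiers.
type pkgType string

const (
	pythonPkg  pkgType = "python"
	phpPeclPkg pkgType = "php-pecl"
	binaryPkg  pkgType = "binary"
)

// candidate is a simplified vendor/product pair; real CPEs carry more fields.
type candidate struct {
	Vendor, Product string
}

// genericExclusions lists, per package type, the package names that must not
// receive the generic "<name>:<name>" candidate. Entries are illustrative.
var genericExclusions = map[pkgType]map[string]bool{
	pythonPkg:  {"redis": true},
	phpPeclPkg: {"redis": true}, // the kind of entry discussed for this issue
}

// candidatesFor guesses vendor/product pairs from a package name and drops
// the self-named pair when the package type and name are on the exclusion list.
func candidatesFor(t pkgType, name string) []candidate {
	name = strings.ToLower(name)
	guesses := []candidate{
		{Vendor: name, Product: name},
		{Vendor: name + "_project", Product: name}, // assumed second guess
	}
	if !genericExclusions[t][name] {
		return guesses
	}
	var kept []candidate
	for _, c := range guesses {
		if c.Vendor == name && c.Product == name {
			continue // skip the generic candidate that collides with the server
		}
		kept = append(kept, c)
	}
	return kept
}

func main() {
	fmt.Println(candidatesFor(binaryPkg, "redis"))  // keeps redis:redis
	fmt.Println(candidatesFor(phpPeclPkg, "redis")) // redis:redis is dropped
}
```

Keying the table on package type as well as name means a binary package named redis can still match the server CPE, while the language-ecosystem packages with the same name cannot.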

Possible false positive detection - CVE-2022-1271 - gzip - Ubuntu 22.04

  • Discussion: Alan Pope brought up this issue. Alex Goodman immediately recognized it.
  • Problem: This issue was related to how Syft handled symlinks, particularly with the “usr merge” in some Linux distributions (where /bin might be a symlink to /usr/bin or vice-versa). Syft wasn’t correctly following symlinks in parent directories when associating files with their Debian packages. This meant a binary like gzip might not be correctly identified as being owned by the gzip Debian package if its path involved such a symlink.
  • Outcome: This specific symlink issue in Syft (where a package claims a path that, due to symlinks in parent directories, wasn’t being correctly resolved to the real path) was fixed a few weeks ago.
    • However, Alex mentioned that fixing this revealed an inverse case: when two packages exist, one writing the actual binary and another writing a symlink to that binary. With the fix, Syft might now report both packages as owning the same underlying file. The ideal behavior is that only the package providing the actual binary (not the symlink) should be considered the owner. This inverse case still needs to be addressed.
    • For the original issue, Alex will confirm it’s fixed by the previous Syft update and then create a PR to bump the Syft version in Grype, which should close this Grype issue.
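
Both cases above can be illustrated with a small Go sketch. It is not Syft’s implementation: `buildOwnerIndex`, the package names, and the throwaway directory tree are hypothetical, and skipping claims whose final path component is a symlink is just one possible way to get the “ideal behavior” described for the inverse case.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// ownerIndex maps a resolved (real) file path to the packages that own it.
type ownerIndex map[string][]string

// buildOwnerIndex resolves every claimed path to its real location.
// Claims whose final component is itself a symlink are skipped, so only the
// package that ships the actual file is recorded as its owner.
func buildOwnerIndex(claims map[string][]string) (ownerIndex, error) {
	idx := ownerIndex{}
	for pkg, paths := range claims {
		for _, p := range paths {
			// Lstat resolves symlinked parents but not the last component.
			info, err := os.Lstat(p)
			if err != nil {
				return nil, err
			}
			if info.Mode()&os.ModeSymlink != 0 {
				continue // this package only ships a link, not the file
			}
			resolved, err := filepath.EvalSymlinks(p)
			if err != nil {
				return nil, err
			}
			idx[resolved] = append(idx[resolved], pkg)
		}
	}
	return idx, nil
}

// demo builds a throwaway usr-merged tree and returns the owners recorded
// for the real gzip binary.
func demo() ([]string, error) {
	root, err := os.MkdirTemp("", "usrmerge")
	if err != nil {
		return nil, err
	}
	defer os.RemoveAll(root)

	if err := os.MkdirAll(filepath.Join(root, "usr", "bin"), 0o755); err != nil {
		return nil, err
	}
	binary := filepath.Join(root, "usr", "bin", "gzip")
	if err := os.WriteFile(binary, []byte("stub"), 0o755); err != nil {
		return nil, err
	}
	// bin -> usr/bin, as on usr-merged systems.
	if err := os.Symlink("usr/bin", filepath.Join(root, "bin")); err != nil {
		return nil, err
	}
	// A second (made-up) package that only ships a link to the binary.
	link := filepath.Join(root, "usr", "bin", "gz-alias")
	if err := os.Symlink("gzip", link); err != nil {
		return nil, err
	}

	idx, err := buildOwnerIndex(map[string][]string{
		"gzip":     {filepath.Join(root, "bin", "gzip")}, // claimed via the symlinked parent
		"gz-alias": {link},
	})
	if err != nil {
		return nil, err
	}
	resolved, err := filepath.EvalSymlinks(binary)
	if err != nil {
		return nil, err
	}
	return idx[resolved], nil
}

func main() {
	owners, err := demo()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(owners)
}
```

Here the gzip package claims the file through the symlinked bin directory, yet the index is keyed by the real path under usr/bin, and the link-only package is not listed as an owner.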

GraalVM VM and Oracle JDK - Syft Issue 3762

  • Discussion: Alan Pope presented this Syft issue. The issue report includes a link to a PR in the user’s own fork, not against the main Anchore Syft repository.
  • Problem: It appears to be a version parsing problem in Syft related to GraalVM. The user’s PR seems to implement changes to correctly detect the GraalVM version.
  • Outcome: Alex Goodman will comment on the user’s PR in their fork, suggesting they open it against the upstream anchore/syft repository if they are ready for it to be reviewed and potentially merged.

The team decided to wrap up as the triage board was clear of immediately actionable items for the stream.

Thanks for tuning in!