Right now, when encoding an NPM package in CycloneDX (or Syft JSON), Syft will make a name like @octokit/core for @octokit/core - npm. However, other tools, like npx -y @cyclonedx/cyclonedx-npm, will write this component with the name core and the group @octokit. (Both tools make the correct PURL.)
This causes issues because when Syft decodes the cdx JSON, it doesn’t know to rebuild the compound package name @octokit/core, and we end up with an NPM package just called core, which doesn’t tell us anything and is wrong. (This in turn causes incorrect matching in Grype, e.g. @jridgewell/gen-mapping incorrectly attributed GHSA-8rmg-jf7p-4p22 · Issue #1886 · anchore/grype · GitHub).
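To make the failure concrete, here is a minimal Go sketch of what goes wrong on decode. It is not Syft's actual decoder: the component struct only mirrors the CycloneDX group and name fields, and decodeComponent and naiveName are hypothetical helpers that stand in for any import step that reads name and ignores group.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// component mirrors only the CycloneDX component fields relevant here.
type component struct {
	Group string `json:"group,omitempty"`
	Name  string `json:"name"`
}

// decodeComponent parses a single CycloneDX-style component object.
func decodeComponent(raw string) (component, error) {
	var c component
	err := json.Unmarshal([]byte(raw), &c)
	return c, err
}

// naiveName is a hypothetical import step that uses the name field alone
// and drops the group, which is the behaviour described above.
func naiveName(c component) string {
	return c.Name
}

func main() {
	// Shape written by tools that split the scope into the group field.
	raw := `{"type":"library","group":"@octokit","name":"core"}`

	c, err := decodeComponent(raw)
	if err != nil {
		panic(err)
	}
	// The scope is present in the document but lost by the naive step.
	fmt.Printf("group in SBOM: %q, package name after import: %q\n", c.Group, naiveName(c))
}
```

The scope is still in the document, so the information needed to rebuild @octokit/core is there; it just has to be read.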
One of the general tenets of Syft’s SBOM exporting is that we should be able to do a cycle of export / import / export and end up with the same SBOM we started with. In order to keep this tenet, should we make Syft always write @foo/bar style npm packages using name + group in CycloneDX JSON?
(This discussion was previously started at Syft is dropping the "group" field from imported CycloneDX · Issue #1202 · anchore/syft · GitHub where it stalled out years ago, so I wanted to restart it and see if the new venue would help.)
It’s also worth asking about how JARs should be described, since they also have a group property.
Right now, running syft -o cyclonedx-json spring-core-5.3.0.jar produces a component in the SBOM with a PURL like pkg:maven/org.springframework/spring-core@5.3.0 but no group field. This also seems incorrect.
I think one of the outcomes of this discussion might be a list of package types that use the group field in CycloneDX.
Here’s a link to the CycloneDX spec on the subject: CycloneDX v1.6 JSON Reference
It seems to me to clearly imply that the Group ID for JARs should be in there:
The grouping name or identifier. This will often be a shortened, single name of the company or project that produced the component, or the source package or domain name. Whitespace and special characters should be avoided. Examples include: apache, org.apache.commons, and apache.org.
Besides NPM and Maven, are there other package managers that are namespaced into groups?
I think there shouldn’t be a problem with the encode-decode if we do a specific thing for a specific ecosystem. In other words: for NPM we know that a name of the form @some/thing is a group and name. We know that for Java, Maven has a groupId and an artifactId, which correlate to these CycloneDX fields. So we can special case these package types to extract the group on the way out and just reverse it on the way in, right? I don’t really see how it’s any more contentious than that.
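For the NPM case, the split-on-export and join-on-import idea could look roughly like the sketch below. The function names splitNpmName and joinNpmName are made up for illustration and are not Syft APIs; the only assumption is that a scoped npm name has the form @scope/name.

```go
package main

import (
	"fmt"
	"strings"
)

// splitNpmName splits a scoped npm name such as "@octokit/core" into a
// CycloneDX-style group ("@octokit") and name ("core"). Unscoped or
// malformed names are returned unchanged with an empty group.
func splitNpmName(full string) (group, name string) {
	if strings.HasPrefix(full, "@") {
		// Require a non-empty scope before the slash and a non-empty name after it.
		if i := strings.Index(full, "/"); i > 1 && i < len(full)-1 {
			return full[:i], full[i+1:]
		}
	}
	return "", full
}

// joinNpmName reverses splitNpmName on the way in.
func joinNpmName(group, name string) string {
	if group == "" {
		return name
	}
	return group + "/" + name
}

func main() {
	for _, full := range []string{"@octokit/core", "lodash"} {
		group, name := splitNpmName(full)
		fmt.Printf("%q -> group=%q name=%q -> %q\n", full, group, name, joinNpmName(group, name))
	}
}
```

Because the join is the exact inverse of the split, an export / import / export cycle gives back the same package name. A Maven version would map groupId and artifactId to group and name in the same way, with no string splitting needed.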