What are the plans for metadata in the Syft format?

SPDX and CycloneDX have fields for passing in additional metadata. My organisation faces similar problems that it wants to solve with metadata, for example “tagging” the team that created the SBOM. What are the plans for solving this in the Syft format? Could a “Tags” area (as often seen with cloud providers) be something to consider?

Would these tags just be arbitrary strings and get put in a list somewhere in the Syft SBOM and have no other purpose? Do these map to any specific fields that SPDX or CDX defines or would this be strictly additional metadata provided by the syft user?

Hey Keith,

In our case these would be key-value pairs. For example, when you create the SBOM from an OCI image, there could be fields such as “registry” and “SHA”, or something like “teamname”. I took a quick look at CycloneDX and there seem to be properties that suit this use case: Extended Use Case: Extensibility through CycloneDX Properties | CycloneDX. For Syft I would love to have something similar, with the ability to specify these via the CLI when creating my SBOM.
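To make the request a bit more concrete, here is a minimal Python sketch of how repeated `key=value` arguments could be turned into a property map. This is only an illustration of the idea: the function name `parse_properties` and the example values are hypothetical, and nothing here reflects an existing Syft option.

```python
def parse_properties(pairs):
    """Parse 'key=value' strings into a dict.

    Hypothetical helper: splits on the first '=' only, so values may
    themselves contain '='. Malformed entries and duplicate keys are rejected.
    """
    props = {}
    for pair in pairs:
        key, sep, value = pair.partition("=")
        key = key.strip()
        if not sep or not key:
            raise ValueError(f"expected key=value, got {pair!r}")
        if key in props:
            raise ValueError(f"duplicate key {key!r}")
        props[key] = value
    return props


# Made-up values, as they might arrive from a repeated command-line option.
example = parse_properties([
    "teamname=platform",
    "registry=registry.example.com",
    "sha=sha256:0123abcd",
])
```

Rejecting duplicate keys is just one possible choice; letting the last value win would be equally easy to implement.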

Thanks for your fast response!

@kzantow sorry to be that guy, but can I push this thread again? I would love this feature/property in the Syft JSON.

Not a problem! I still don’t quite understand what the best thing to do is here, but I would definitely like to find a solution for you.

On the surface this seems related to other requests to specify parts of the primary “package” created for the SBOM. Syft has a slightly different data model here: there is a Source that represents the thing you scanned, but in conversion to other formats (SPDX and CycloneDX) this gets converted to a package. I’m a proponent of having a package in the Syft data model for this, too. I think we’ve been moving in that direction with --source-name and --source-version, and with similar requests. At least to me, it would make sense if these just populated a Source package object. If we did add a package as a property of the source, there could be some type of package metadata, even if it was just a map of string → string, which would give a spot to store arbitrary properties. But maybe this is all unnecessary, and the best option is to keep things simple by adding a property map in the same or a similar spot as source-name and source-version.
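As a rough illustration of the first shape described above (a package hanging off the source, carrying a string → string map), here is a small Python sketch. The class and field names (`SourcePackage`, `Source`, `properties`) are assumptions made up for this example and are not Syft's actual data model, which is written in Go.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class SourcePackage:
    """Hypothetical package describing the scanned thing itself."""
    name: str = ""
    version: str = ""
    # Arbitrary user-supplied metadata, string -> string.
    properties: Dict[str, str] = field(default_factory=dict)


@dataclass
class Source:
    """Hypothetical stand-in for the source that was scanned."""
    id: str
    package: Optional[SourcePackage] = None


# Made-up values: name and version play the role of what
# --source-name and --source-version supply today.
src = Source(
    id="example-source",
    package=SourcePackage(
        name="my-app",
        version="1.2.3",
        properties={"teamname": "platform"},
    ),
)
```

The simpler alternative would drop `SourcePackage` and put `properties` directly next to the name and version on the source.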

One of the issues is that we are already using CDX properties to output Syft data that can’t be expressed elsewhere, and SPDX 2.x does not have a spot for arbitrary properties (other than the comment field, which is often overloaded for this purpose). But maybe that’s not a problem if you’re using the CDX format with this known limitation (or SPDX 3, once that is implemented). So we would need to make sure that, whatever the solution is, we can output all the data in meaningful ways.
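To sketch what the output side might look like, the Python example below maps a property dict to CycloneDX-style `name`/`value` entries and, for SPDX 2.x, serializes it into a string that could go in a comment field. The `user:` prefix (to keep user-supplied names apart from properties Syft already emits) and the JSON-in-comment encoding are both assumptions for illustration, not how Syft behaves.

```python
import json


def to_cyclonedx_properties(props, prefix="user:"):
    """Return a list of {"name", "value"} objects, the shape CycloneDX properties use.

    The prefix is a hypothetical namespace to avoid clashing with other properties.
    """
    return [
        {"name": f"{prefix}{key}", "value": value}
        for key, value in sorted(props.items())
    ]


def to_spdx2_comment(props):
    """Serialize properties as JSON text for an SPDX 2.x comment field (assumed encoding)."""
    return json.dumps(props, sort_keys=True)


# Made-up user-supplied properties.
props = {"teamname": "platform", "registry": "registry.example.com"}
cdx_props = to_cyclonedx_properties(props)
spdx_comment = to_spdx2_comment(props)
```

Sorting the keys keeps the output stable between runs, which helps when SBOMs are diffed or checked into version control.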

I think what we need here is a concrete proposal to move this forward. I think this would be a great topic for our weekly livestream!

Hey @henrysachs – I couldn’t find an existing issue to track this request, so I added one: Add arbitrary name-value pairs to SBOM · Issue #3734 · anchore/syft · GitHub, feel free to add any additional context or feedback there!

The general consensus is: this sounds like something we can move forward with, based on the general implementation details in the issue. I can’t say that we will be able to pick this one up in the near term, but we can always help shepherd a pull request.

And if I missed an existing issue about it, please do let me know so I can figure out how to clean up GH :slight_smile:

Hey, thanks for creating the issue. I liked the livestream, and thanks for the shoutout too :smiley:
As I said in the issue, I would like to help with this, but I think there are a lot of places in the code that I haven’t seen yet and would need to touch when implementing this, so I would probably have a lot of questions :slight_smile: