In a Syft-generated SBOM we see licenses as links (https://www.apache.org/licenses/LICENSE-2.0.txt)

Hi All,

I observed URLs in the format below for Maven components. Is this valid? Does the user have to click on the link to find out the license?

{
      "cpe": "cpe:2.3:a:byte-buddy:byte-buddy:1.14.11:*:*:*:*:*:*:*",
      "name": "byte-buddy",
      "purl": "pkg:maven/net.bytebuddy/byte-buddy@1.14.11",
      "type": "library",
      "group": "net.bytebuddy",
      "bom-ref": "pkg:maven/net.bytebuddy/byte-buddy@1.14.11?package-id=da0d5c43d15755fc",
      "version": "1.14.11",
      "licenses": [
        {
          "license": {
            "name": "https://www.apache.org/licenses/LICENSE-2.0.txt"
          }
        }

Thanks in advance, Anvitha

Thanks for the question.

Looking at the Maven Repository entry for byte-buddy, I see this:

Further, in the upstream repo for byte-buddy, I note the licenses section of the POM has the following:

    <licenses>
        <license>
            <name>Apache License, Version 2.0</name>
            <url>https://www.apache.org/licenses/LICENSE-2.0.txt</url>
            <distribution>repo</distribution>
            <comments>A business-friendly OSS license</comments>
        </license>
    </licenses>

So it looks like we’re pulling the URL from the package. I don’t know if that’s intentional or not, and will let a Syft developer speak to that.
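To make the name/URL distinction concrete, here is a minimal Python sketch that reads the licenses section quoted above and returns the name and URL separately. It is only an illustration, not how Syft does it: the function name pom_licenses is hypothetical, and a full pom.xml normally declares an XML namespace that this fragment-only sketch does not handle.

```python
import xml.etree.ElementTree as ET

# The <licenses> fragment quoted above, copied verbatim.
POM_LICENSES = """
<licenses>
    <license>
        <name>Apache License, Version 2.0</name>
        <url>https://www.apache.org/licenses/LICENSE-2.0.txt</url>
        <distribution>repo</distribution>
        <comments>A business-friendly OSS license</comments>
    </license>
</licenses>
"""


def pom_licenses(xml_text):
    """Return one {"name", "url"} dict per <license> element (hypothetical helper)."""
    root = ET.fromstring(xml_text.strip())
    return [
        {"name": lic.findtext("name"), "url": lic.findtext("url")}
        for lic in root.iter("license")
    ]


if __name__ == "__main__":
    for entry in pom_licenses(POM_LICENSES):
        print(entry["name"], "->", entry["url"])
```

In this fragment the human-readable name and the URL are separate elements, so a tool reading the POM could report either one.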


Can anyone from the development team please look into this issue?

The same issue seems to be present in Syft JSON. I downloaded a few JARs into a test directory and ran the following command:

syft dir:. -o json | jq '.artifacts[] | { name: .name, licenses: .licenses }'

This prints Syft’s raw JSON output for the licenses it found. Here’s the output:

{
  "name": "byte-buddy",
  "licenses": [
    {
      "value": "https://www.apache.org/licenses/LICENSE-2.0.txt",
      "spdxExpression": "",
      "type": "declared",
      "urls": [],
      "locations": [
        {
          "path": "/byte-buddy-1.14.11.jar",
          "accessPath": "/byte-buddy-1.14.11.jar",
          "annotations": {
            "evidence": "primary"
          }
        }
      ]
    }
  ]
}
{
  "name": "commons-io",
  "licenses": [
    {
      "value": "https://www.apache.org/licenses/LICENSE-2.0.txt",
      "spdxExpression": "",
      "type": "declared",
      "urls": [],
      "locations": [
        {
          "path": "/commons-io-2.16.1.jar",
          "accessPath": "/commons-io-2.16.1.jar",
          "annotations": {
            "evidence": "primary"
          }
        }
      ]
    }
  ]
}
{
  "name": "rxjava",
  "licenses": [
    {
      "value": "Apache-2.0",
      "spdxExpression": "Apache-2.0",
      "type": "concluded",
      "urls": [],
      "locations": [
        {
          "path": "/rxjava-3.1.9.jar",
          "accessPath": "/rxjava-3.1.9.jar",
          "annotations": {
            "evidence": "primary"
          }
        }
      ]
    }
  ]
}

It seems like the first two JARs, byte-buddy and commons-io, both have a URL in the value field, whereas rxjava has Apache-2.0 in the value field, which seems more correct.

There’s quite a bit of variation in how different JARs represent metadata like licenses internally; it seems likely that we’re mixing up URL and Value fields on some JARs. We will investigate and see if we can improve Syft’s behavior here.
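For anyone who wants to check their own scan results for the same symptom, here is a rough Python sketch that flags licenses whose value looks like a URL. It assumes the Syft JSON layout shown in the output above (artifacts, then licenses, then value); the helper names looks_like_url and url_valued_licenses are made up for this example.

```python
import json
from urllib.parse import urlparse


def looks_like_url(value):
    """True if the string parses as an http(s) URL with a host part."""
    parsed = urlparse(value)
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


def url_valued_licenses(syft_doc):
    """Return (artifact name, license value) pairs where the value is a URL."""
    hits = []
    for artifact in syft_doc.get("artifacts", []):
        for lic in artifact.get("licenses") or []:
            value = lic.get("value", "")
            if looks_like_url(value):
                hits.append((artifact.get("name"), value))
    return hits


# Trimmed-down sample using values from the output above.
SAMPLE = json.loads("""
{
  "artifacts": [
    {"name": "byte-buddy",
     "licenses": [{"value": "https://www.apache.org/licenses/LICENSE-2.0.txt",
                   "spdxExpression": "", "type": "declared"}]},
    {"name": "rxjava",
     "licenses": [{"value": "Apache-2.0",
                   "spdxExpression": "Apache-2.0", "type": "concluded"}]}
  ]
}
""")

if __name__ == "__main__":
    for name, value in url_valued_licenses(SAMPLE):
        print(f"{name}: license value is a URL: {value}")
```

To run it against a real scan, save the output of syft dir:. -o json to a file and load that file with json.load instead of using SAMPLE.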


I think I’ve found what’s different about these JARs.

In byte-buddy and in commons-io, we see the following line in the META-INF/MANIFEST.MF file:

Bundle-License: https://www.apache.org/licenses/LICENSE-2.0.txt

But Syft expects that field to be the license name.

In rxjava, the manifest has no Bundle-License field (unzip -p rxjava-3.1.9.jar META-INF/MANIFEST.MF doesn’t print one), so Syft falls back to parsing the license from META-INF/LICENSE, which is a copy of the Apache 2.0 license, and Syft parses it correctly.
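If you want to check a JAR yourself, here is a small Python sketch that reads the main section of META-INF/MANIFEST.MF and reports whether Bundle-License is missing, a URL, or something else. It is not Syft’s implementation, only a way to reproduce the observation; the function names read_manifest_headers and classify_bundle_license are hypothetical.

```python
import io
import zipfile


def read_manifest_headers(jar):
    """Return the main-section headers of META-INF/MANIFEST.MF as a dict.

    `jar` may be a file path or a file-like object.
    """
    with zipfile.ZipFile(jar) as zf:
        text = zf.read("META-INF/MANIFEST.MF").decode("utf-8", errors="replace")
    headers = {}
    last_key = None
    for line in text.splitlines():
        if not line.strip():
            break  # a blank line ends the main section
        if line.startswith(" ") and last_key is not None:
            # Continuation line: manifests wrap long values, with each
            # wrapped line starting with a single space.
            headers[last_key] += line[1:]
        elif ":" in line:
            key, _, value = line.partition(":")
            last_key = key.strip()
            headers[last_key] = value.strip()
    return headers


def classify_bundle_license(jar):
    """Return ("missing" | "url" | "other", value) for the Bundle-License header."""
    value = read_manifest_headers(jar).get("Bundle-License")
    if value is None:
        return ("missing", None)
    if value.startswith(("http://", "https://")):
        return ("url", value)
    return ("other", value)


def make_test_jar(manifest_text):
    """Build an in-memory JAR holding only a manifest, for demonstration."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("META-INF/MANIFEST.MF", manifest_text)
    buf.seek(0)
    return buf


if __name__ == "__main__":
    jar = make_test_jar(
        "Manifest-Version: 1.0\r\n"
        "Bundle-License: https://www.apache.org/licenses/LICENSE-2.0.txt\r\n\r\n"
    )
    print(classify_bundle_license(jar))
```

Passing a real path such as byte-buddy-1.14.11.jar instead of the in-memory test JAR should show the same Bundle-License line that unzip -p prints.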

Since this is happening for a few JARs, I’ll open an issue to enhance this parsing.


Hi @anvitha_haviligi, I’ve filed a bug report for this: "Syft sometimes reports URL for license value when scanning JARs with a URL in `Bundle-License` field of manifest" (Issue #3186 in anchore/syft on GitHub).

You can use that GitHub issue to see when this is resolved.


Thanks @willmurphy !

I’ll close this thread here so the conversation can carry on in the issue, now that we have one.

Thank you for reporting, @anvitha_haviligi.
