Hi All,
I observed license URLs in the format below for Maven components. Is this valid? Does the user have to click on the link to find out the license?
{
  "cpe": "cpe:2.3:a:byte-buddy:byte-buddy:1.14.11:*:*:*:*:*:*:*",
  "name": "byte-buddy",
  "purl": "pkg:maven/net.bytebuddy/byte-buddy@1.14.11",
  "type": "library",
  "group": "net.bytebuddy",
  "bom-ref": "pkg:maven/net.bytebuddy/byte-buddy@1.14.11?package-id=da0d5c43d15755fc",
  "version": "1.14.11",
  "licenses": [
    {
      "license": {
        "name": "https://www.apache.org/licenses/LICENSE-2.0.txt"
      }
    }
  ]
}
Thanks in advance, Anvitha
popey
September 2, 2024, 11:41am
Thanks for the question.
Looking at the Maven Repository entry for byte-buddy, I see this:
Further, on the upstream repo for byte-buddy I note the licenses section has the following:
<licenses>
  <license>
    <name>Apache License, Version 2.0</name>
    <url>https://www.apache.org/licenses/LICENSE-2.0.txt</url>
    <distribution>repo</distribution>
    <comments>A business-friendly OSS license</comments>
  </license>
</licenses>
So it looks like we’re pulling the URL from the package. I don’t know if that’s intentional or not, and will let a Syft developer speak to that.
Can anyone from the development team please look into this issue?
The same issue seems to be present in Syft JSON. I downloaded a few JARs into a test directory and ran the following command:
syft dir:. -o json | jq '.artifacts[] | { name: .name, licenses: .licenses }'
This prints Syft’s raw JSON output for the licenses it found. Here’s the output:
{
  "name": "byte-buddy",
  "licenses": [
    {
      "value": "https://www.apache.org/licenses/LICENSE-2.0.txt",
      "spdxExpression": "",
      "type": "declared",
      "urls": [],
      "locations": [
        {
          "path": "/byte-buddy-1.14.11.jar",
          "accessPath": "/byte-buddy-1.14.11.jar",
          "annotations": {
            "evidence": "primary"
          }
        }
      ]
    }
  ]
}
{
  "name": "commons-io",
  "licenses": [
    {
      "value": "https://www.apache.org/licenses/LICENSE-2.0.txt",
      "spdxExpression": "",
      "type": "declared",
      "urls": [],
      "locations": [
        {
          "path": "/commons-io-2.16.1.jar",
          "accessPath": "/commons-io-2.16.1.jar",
          "annotations": {
            "evidence": "primary"
          }
        }
      ]
    }
  ]
}
{
  "name": "rxjava",
  "licenses": [
    {
      "value": "Apache-2.0",
      "spdxExpression": "Apache-2.0",
      "type": "concluded",
      "urls": [],
      "locations": [
        {
          "path": "/rxjava-3.1.9.jar",
          "accessPath": "/rxjava-3.1.9.jar",
          "annotations": {
            "evidence": "primary"
          }
        }
      ]
    }
  ]
}
It seems like the first two JARs, byte-buddy and commons-io, both have a URL in the value field, whereas rxjava has Apache-2.0 in the value field, which seems more correct.
There’s quite a bit of variation in how different JARs represent metadata like licenses internally; it seems likely that we’re mixing up URL and Value fields on some JARs. We will investigate and see if we can improve Syft’s behavior here.
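For anyone who wants to check their own scan for the same symptom, here is a minimal Python sketch that flags artifacts whose license value looks like a URL. It assumes the Syft JSON layout shown in the output above (a top-level artifacts list whose entries have name and licenses[].value). The function name artifacts_with_url_licenses is hypothetical and is not part of Syft.

```python
import json


def artifacts_with_url_licenses(syft_json_text):
    """Return (artifact name, license value) pairs where the value looks like a URL.

    Assumes the Syft JSON shape shown in this thread:
    {"artifacts": [{"name": ..., "licenses": [{"value": ...}, ...]}, ...]}
    """
    doc = json.loads(syft_json_text)
    flagged = []
    for artifact in doc.get("artifacts", []):
        for lic in artifact.get("licenses", []):
            value = lic.get("value", "")
            # A license name or SPDX ID should not start with an http(s) scheme.
            if value.lower().startswith(("http://", "https://")):
                flagged.append((artifact.get("name"), value))
    return flagged


if __name__ == "__main__":
    import sys

    # Reads a Syft JSON document from standard input.
    for name, value in artifacts_with_url_licenses(sys.stdin.read()):
        print(f"{name}: {value}")
```

If the script is saved as, say, find_url_licenses.py (a made-up file name), piping the output of syft dir:. -o json into it should list the affected packages.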
I think I’ve found what’s different about these JARs.
In byte-buddy and in commons-io, we see the following line in the META-INF/MANIFEST.MF file:
Bundle-License: https://www.apache.org/licenses/LICENSE-2.0.txt
But Syft expects that field to be the license name.
In rxjava, there is no such field in the manifest (unzip -p rxjava-3.1.9.jar META-INF/MANIFEST.MF doesn’t print any Bundle-License), so Syft falls back to parsing the license from META-INF/LICENSE, which is a copy of the Apache 2.0 license, and Syft parses it correctly.
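To reproduce this check across a directory of JARs, the manifest inspection can be sketched in Python. This is only an illustration of the distinction described above, not how Syft itself parses manifests. The function names are hypothetical, and the sketch assumes the Bundle-License value is either a bare URL or a bare name. Real manifests may carry richer values that it does not handle.

```python
import zipfile


def parse_manifest_main_section(text):
    """Parse the main section of a JAR manifest into a dict keyed by lowercase header name."""
    headers = {}
    last_key = None
    for line in text.splitlines():
        if line == "":
            if headers:
                break  # a blank line ends the main section
            continue
        if line.startswith(" ") and last_key is not None:
            # Continuation line: long values wrap, and wrapped lines start with one space.
            headers[last_key] += line[1:]
            continue
        key, sep, value = line.partition(":")
        if not sep:
            continue  # not a "Name: value" header, skip it
        last_key = key.strip().lower()
        headers[last_key] = value.lstrip()
    return headers


def classify_bundle_license(manifest_text):
    """Return 'absent', 'url', or 'name' for the manifest's Bundle-License header."""
    value = parse_manifest_main_section(manifest_text).get("bundle-license")
    if value is None:
        return "absent"
    if value.lower().startswith(("http://", "https://")):
        return "url"
    return "name"


def read_manifest(jar_path):
    """Read META-INF/MANIFEST.MF from a JAR, which is a ZIP archive."""
    with zipfile.ZipFile(jar_path) as jar:
        return jar.read("META-INF/MANIFEST.MF").decode("utf-8")
```

Calling classify_bundle_license(read_manifest("byte-buddy-1.14.11.jar")) should return "url" given the manifest line quoted above, while a manifest with no Bundle-License header, as described for rxjava, would return "absent".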
Since this is happening for a few JARs, I’ll open an issue to enhance this parsing.
popey
September 3, 2024, 1:41pm
Thanks, @willmurphy!
I’ll close this thread here so the conversation can carry on in the issue, now that we have one.
Thank you for reporting, @anvitha_haviligi.