From reading the code, I understand that the version is parsed from the manifest file, and the Implementation-Version attribute is present in this JAR's manifest. Still, Syft is trying to parse the version from the JAR file name. Is this normal?
I downloaded the JAR from here and renamed the file.
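For anyone who wants to confirm what the manifest actually contains before comparing it with Syft's output, here is a minimal Python sketch that reads the main section of META-INF/MANIFEST.MF from a JAR. It is only an illustration, not Syft's parser; the helper names and the demo JAR contents are made up for the example.

```python
import io
import zipfile
from typing import Dict


def read_manifest_main_attributes(jar_bytes: bytes) -> Dict[str, str]:
    """Return the main-section attributes of META-INF/MANIFEST.MF."""
    with zipfile.ZipFile(io.BytesIO(jar_bytes)) as jar:
        raw = jar.read("META-INF/MANIFEST.MF").decode("utf-8", errors="replace")

    attrs: Dict[str, str] = {}
    last_key = None
    for line in raw.splitlines():
        if line == "":
            break  # a blank line ends the main section
        if line.startswith(" ") and last_key is not None:
            # A line starting with one space continues the previous value.
            attrs[last_key] += line[1:]
            continue
        key, sep, value = line.partition(":")
        if sep:
            last_key = key.strip()
            attrs[last_key] = value.strip()
    return attrs


def build_demo_jar() -> bytes:
    """Build a tiny in-memory JAR (hypothetical contents) for the demo."""
    manifest = (
        "Manifest-Version: 1.0\r\n"
        "Implementation-Title: demo-lib\r\n"
        "Implementation-Version: 2.4.1\r\n"
        "\r\n"
    )
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as jar:
        jar.writestr("META-INF/MANIFEST.MF", manifest)
    return buf.getvalue()


attrs = read_manifest_main_attributes(build_demo_jar())
print(attrs["Implementation-Version"])  # 2.4.1
```

To inspect a real file, pass the result of open(path, "rb").read() instead of the demo bytes.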
@kzantow can provide better details than I can, but in general, identifying JARs is challenging because Java doesn’t have a single, widely-used standard way of encoding metadata like package name and version into a JAR file.
Right now, Syft tries to get info about a JAR from the following places:
I think what you’re seeing on that JAR is that we’re falling back to the filename. Based on a little poking around, you might have found a bug where we’re skipping a manifest value we shouldn’t be.
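To make the filename fallback concrete, here is a rough Python sketch of splitting a JAR filename into a name and a version. The regular expression and the function name are assumptions made for illustration; Syft's real pattern is different and handles more cases.

```python
import re
from pathlib import PurePosixPath
from typing import Optional, Tuple

# Assumed convention: "<name>-<version>", where the version starts with a digit.
_NAME_VERSION = re.compile(r"^(?P<name>.+?)-(?P<version>\d[\w.+-]*)$")


def parse_jar_filename(path: str) -> Tuple[str, Optional[str]]:
    """Return (name, version) guessed from the filename; version may be None."""
    stem = PurePosixPath(path).name
    for ext in (".jar", ".war", ".ear"):
        if stem.endswith(ext):
            stem = stem[: -len(ext)]
            break
    match = _NAME_VERSION.match(stem)
    if match:
        return match["name"], match["version"]
    return stem, None


print(parse_jar_filename("libs/demo-lib-2.4.1.jar"))  # ('demo-lib', '2.4.1')
print(parse_jar_filename("renamed.jar"))              # ('renamed', None)
```

The second call shows why a renamed JAR is a useful test: the filename no longer carries a version, so any version reported has to come from somewhere else.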
@kzantow it seems incorrect that selectName checks the filepath before checking the manifest:
Syft attempts to get the identity of a JAR file in various ways, as @willmurphy noted above; selectName is what you would want to look at.
When I’ve looked at this in the past, I’ve definitely found it a bit convoluted and hard to follow. I wonder if it would help to write the identification rules in English or as a decision tree, refine them as much as necessary, and make sure that the behavior matches. For example: if we find a pom.xml, what do we do? What about with or without a pom.properties, or when we find multiple pom.xml files? IIRC, we don’t use information from a pom.xml if there’s no corresponding pom.properties. But is that right? I don’t really think so, but how do we describe exactly what to do across all the different JARs we find in the wild?
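One possible way to write such rules down so they can be reviewed and tested is an ordered rule table, sketched below in Python. Everything here is hypothetical: the class names, the rule wording, and the actions are placeholders for discussion, and the "ignore pom.xml" entry only encodes the behavior recalled above, not verified Syft behavior.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class JarEvidence:
    """What was found inside a JAR (hypothetical summary type)."""
    pom_xml_count: int
    pom_properties_count: int


@dataclass(frozen=True)
class Rule:
    description: str
    applies: Callable[[JarEvidence], bool]
    action: str


# First matching rule wins; the order is part of the specification.
RULES: List[Rule] = [
    Rule("no pom.xml at all",
         lambda e: e.pom_xml_count == 0,
         "use other sources (manifest, filename)"),
    Rule("pom.xml without any pom.properties",
         lambda e: e.pom_properties_count == 0,
         "ignore pom.xml (recalled behavior, questioned in this thread)"),
    Rule("multiple pom.xml files",
         lambda e: e.pom_xml_count > 1,
         "UNDECIDED"),
    Rule("one pom.xml with pom.properties",
         lambda e: True,
         "use pom metadata"),
]


def decide(evidence: JarEvidence) -> Rule:
    """Return the first rule that applies to the evidence."""
    for rule in RULES:
        if rule.applies(evidence):
            return rule
    raise AssertionError("the final rule always applies")


print(decide(JarEvidence(pom_xml_count=1, pom_properties_count=0)).action)
# ignore pom.xml (recalled behavior, questioned in this thread)
```

Keeping the rules as data makes the open questions visible (the UNDECIDED entry) and lets each row become a test case.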
We have a specific test case that implementation title does not override filename:
I don’t understand why yet, but I will keep looking. @kzantow do you know why? This check seems backwards to me, but the test case has been in place for years.
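To make the ordering question concrete, here is a simplified Python sketch contrasting the two precedence orders under discussion. It is not the real selectName, which has more branches; the function names and the sample manifest values are invented for illustration.

```python
from typing import Mapping, Optional


def select_name_filename_first(
    filename_name: Optional[str], manifest: Mapping[str, str]
) -> Optional[str]:
    """Ordering described in this thread: the filename wins if it yields a name."""
    if filename_name:
        return filename_name
    return manifest.get("Implementation-Title")


def select_name_manifest_first(
    filename_name: Optional[str], manifest: Mapping[str, str]
) -> Optional[str]:
    """Alternative ordering: prefer the manifest, fall back to the filename."""
    title = manifest.get("Implementation-Title")
    if title:
        return title
    return filename_name


# Hypothetical manifest with a free-form title, as many projects use.
manifest = {"Implementation-Title": "Demo Library", "Implementation-Version": "2.4.1"}
print(select_name_filename_first("renamed", manifest))  # renamed
print(select_name_manifest_first("renamed", manifest))  # Demo Library
```

Neither result is obviously right for a renamed JAR, which may be part of why the existing test pins the filename-first behavior.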
I know the manifest file is incredibly unreliable, as there is no real consistency in how it is used between projects. We do have some specific logic for the Apache Maven Bundle Plugin, because there was a documented spec that those projects seemed to follow, but otherwise it can be very inconsistent. I suspect that if you were to just change this behaviour now, you’d introduce many false positives that were previously prevented.
At a minimum, you’d need to re-run the testing against all of the latest Maven Central artifacts to see what gets better or worse. I think @Christopher_Phillips had some scripts for this once, when we were changing some purl generation and package deduplication behaviour.
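As a rough idea of what such a before/after comparison could look like, here is a small Python sketch that diffs the identities reported for the same artifacts by two runs. It is not the script mentioned above; the data shapes and sample entries are assumptions for illustration only.

```python
from typing import Dict, Optional, Tuple

# (name, version) as reported for one artifact; version may be missing.
Identity = Tuple[str, Optional[str]]


def compare_runs(
    before: Dict[str, Identity], after: Dict[str, Identity]
) -> Dict[str, Tuple[Identity, Identity]]:
    """Return artifacts present in both runs whose identity changed."""
    changed = {}
    for artifact in sorted(before.keys() & after.keys()):
        if before[artifact] != after[artifact]:
            changed[artifact] = (before[artifact], after[artifact])
    return changed


# Hypothetical results keyed by artifact path.
before = {
    "a/demo-lib-2.4.1.jar": ("demo-lib", "2.4.1"),
    "b/renamed.jar": ("renamed", None),
}
after = {
    "a/demo-lib-2.4.1.jar": ("demo-lib", "2.4.1"),
    "b/renamed.jar": ("Demo Library", "2.4.1"),
}

for path, (old, new) in compare_runs(before, after).items():
    print(path, old, "->", new)
# b/renamed.jar ('renamed', None) -> ('Demo Library', '2.4.1')
```

Each changed entry would still need a human or a ground-truth list to decide whether the change is an improvement or a new false positive.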