I recently scanned some EC2 machines with Grype and got a lot of vulnerabilities related to Python.
Some of them were found in Python 2.7 packages that were already on the machine (as is, without any action by a user).
I wonder whether Grype can tell if a package was installed via yum or pip, or whether it has an option to filter out packages that the user didn't install themselves (something like the base layer of an image), since a user can't actually fix the base distribution of an EC2 instance.
Hi @TimBrown1611,
Thanks for the question!
Syft reports this information: if you look at the -o json output of syft packages, it reports whether a given package is RPM, or PyPI, or whatever. You could configure Syft to turn off certain catalogers, e.g. the RPM cataloger, and then use syft -o json | grype to get a scan of the remaining packages. You can read more in the docs on package cataloger selection.
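If you'd rather not change cataloger settings, a rough alternative is to filter the JSON after the fact. The following is only a sketch: it assumes the Syft JSON document has a top-level artifacts list whose entries carry a type field (values such as rpm or python), and the function name drop_package_types is made up for illustration.

```python
import json


def drop_package_types(sbom, excluded_types):
    """Return a copy of a Syft JSON document without artifacts of the given types.

    Assumption: packages live under the top-level "artifacts" key and each
    one has a "type" field such as "rpm" or "python".
    """
    kept = [
        artifact
        for artifact in sbom.get("artifacts", [])
        if artifact.get("type") not in excluded_types
    ]
    # Shallow copy: every other top-level field is carried over unchanged.
    return {**sbom, "artifacts": kept}


def filter_sbom_file(in_path, out_path, excluded_types):
    """Read a Syft JSON file, drop the excluded package types, write the result."""
    with open(in_path, encoding="utf-8") as handle:
        sbom = json.load(handle)
    with open(out_path, "w", encoding="utf-8") as handle:
        json.dump(drop_package_types(sbom, excluded_types), handle)
```

Calling filter_sbom_file("sbom.json", "sbom-no-rpm.json", {"rpm"}) would leave only the non-RPM packages in the second file. The file names here are placeholders.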
You could also configure Grype to ignore certain packages or certain versions of certain packages. The Grype README has a section on how to do this.
Can you tell me a bit about your use case? Are you worried about vulnerabilities in an application you’re deploying to EC2? Or are end users installing things on EC2-based workstations after the initial AMI is deployed and you want to keep track of vulnerabilities in there? I ask because, if you’re deploying an application, it might make sense to run Grype against the application you’re about to deploy rather than the whole instance.
Thanks for the answer.
If you launch an EC2 instance and scan it with Grype (without installing anything), you will find that it has vulnerabilities from Python 2.7 (the pillow and pyyaml packages, for example). I assume a regular user can't do much with this kind of information, since it comes from the base AMI.
I would like to know whether it is possible to separate AMI vulnerabilities from the rest, as is planned for image base layers in the future.
Right now, Syft does not have a good way to say that packages in the base AMI don't count. We have anchore/syft issue #292, "Compare two SBOMs", which is a request to allow Syft to compare two SBOMs. With that, you could scan a base VM with nothing installed after the AMI, then scan another VM, and subtract the two SBOMs. Then you could pass the resulting SBOM, which would represent only the packages in the VM in question that weren't in the AMI, into Grype and do a scan normally.
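Until something like that exists in Syft, the subtraction could be approximated by hand. This is a minimal sketch, not a Syft feature: it assumes both SBOMs are Syft JSON documents with a top-level artifacts list, and it treats two packages as the same when name, version and type all match. The names package_key and subtract_sbom are hypothetical.

```python
def package_key(artifact):
    """Identity used to compare packages across two SBOMs (an assumption)."""
    return (artifact.get("name"), artifact.get("version"), artifact.get("type"))


def subtract_sbom(instance_sbom, base_sbom):
    """Return instance_sbom without the packages that also appear in base_sbom."""
    base_keys = {package_key(a) for a in base_sbom.get("artifacts", [])}
    remaining = [
        artifact
        for artifact in instance_sbom.get("artifacts", [])
        if package_key(artifact) not in base_keys
    ]
    # Other top-level fields are copied as-is; entries elsewhere in the
    # document that point at removed packages are not cleaned up here.
    return {**instance_sbom, "artifacts": remaining}
```

A package that was upgraded after the AMI was launched has a different version, so it stays in the result and would still be scanned. Whether Grype accepts a hand-edited document like this without complaint is something to verify before relying on it.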
I have a couple of thoughts here.
First, whether these packages are a security concern isn’t really affected by whether you have a good way to remove them. The packages are actually present on the instance, and depending on what is happening on the host, vulnerabilities in those packages could be exploited.
Second, this is one of the reasons people like to deploy applications in a container, e.g. a Docker image: A working Ubuntu or Amazon Linux instance that you can log into and use has a ton of packages that, for example, a node web app just doesn’t need at all. You can shrink the set of packages to scan and vulnerabilities to patch by putting the application in a Docker image and scanning that. (Really, a Docker image is sort of like what you’re asking for: A bundle of only those dependencies you need to add to a plain Linux host.)
Third, you could research making custom AMIs, though I admit I don’t know much about that.
Fourth, if you really want to ignore everything in the base image, you could write a script that basically does syft -o json dir:/ > /some/path/sbom.json on a plain EC2 instance (nothing installed after the AMI), then parses the resulting JSON file to get a list of the packages you don't care about, and writes that list to a Grype config so that all vulnerabilities affecting those packages are ignored. I think this is basically what you're asking for. The script might be a bit of work, but the main reason not to do this is the first point above: the packages are still there, and still vulnerable. Attackers don't care whether they "don't count" because it's hard to change the base AMI.
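To give an idea of what such a script might look like, here is a rough sketch. It assumes the Syft JSON has a top-level artifacts list with name, version and type fields, and that Grype ignore rules can match on package name, version and type as described in the Grype README. The function name build_ignore_config is made up, and the YAML is written by hand to avoid extra dependencies.

```python
import json


def build_ignore_config(base_sbom):
    """Build Grype config text that ignores every package in the base SBOM."""
    seen = set()
    lines = []
    for artifact in base_sbom.get("artifacts", []):
        key = (artifact.get("name"), artifact.get("version"), artifact.get("type"))
        # Skip incomplete entries and exact duplicates.
        if None in key or key in seen:
            continue
        seen.add(key)
        name, version, pkg_type = key
        # json.dumps yields a double-quoted string, which is also valid YAML.
        lines.append("  - package:")
        lines.append(f"      name: {json.dumps(name)}")
        lines.append(f"      version: {json.dumps(version)}")
        lines.append(f"      type: {json.dumps(pkg_type)}")
    if not lines:
        return "ignore: []\n"
    return "ignore:\n" + "\n".join(lines) + "\n"


def write_ignore_config(sbom_path, config_path):
    """Read the base SBOM from sbom_path and write the ignore rules to config_path."""
    with open(sbom_path, encoding="utf-8") as handle:
        base_sbom = json.load(handle)
    with open(config_path, "w", encoding="utf-8") as handle:
        handle.write(build_ignore_config(base_sbom))
```

Matching on version as well as name means a package that is later upgraded is no longer ignored, which limits how much the ignore list can hide over time.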
Sorry if this answer is really long and not very helpful - I’ve been thinking about this problem for a while. I think what I would do in your situation is bundle your app in a Docker image, scan the Docker image with Grype, and then run the Docker container on your hosting provider of choice.
All this assumes that the EC2 instance is running an application and is not being used as a workstation. If it is a workstation, my answer might be a little different.