Grype: scan large SBOMs in batches

I’ve raised a feature request which I think is worth discussing:

For large SBOMs with lots of CVEs, I think we should consider a way to scan them in batches and save intermediate results to disk, so that we don’t need a lot of memory.

What do you think?

Is there any indication of what is using memory? You can run Grype in profiling mode using the GRYPE_DEV_PROFILE=mem env var, I think, and get a pprof file that will provide some insight into where allocations happen, etc. I don’t really see a lot in Grype that should inherently be using large amounts of memory if it’s reading an SBOM and not performing a Syft scan.
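
For reference, a rough sketch of what that could look like (the env var is the one suggested above; the name of the profile file that gets written out is an assumption on my part and may differ by Grype version):

  # run the scan with the in-process memory profiler enabled
  GRYPE_DEV_PROFILE=mem grype sbom:sbom.json -c config.yaml -o json > results.json

  # inspect the resulting pprof file with the Go toolchain, e.g. list the
  # functions responsible for the largest allocations
  go tool pprof -top mem.pprof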

I’ll try and send you the relevant files.
I saw this issue - Command terminated by signal 9 due to OOM (Out of Memory) · Issue #1509 · anchore/grype · GitHub -
and it seems related (but it’s closed).
Can you please re-open it? I think I’m having a similar issue when using Grype in a task.

I’ve tried to run Grype on an SBOM (which I can’t share) on an EC2 instance (with the profiling configuration).
We scan an SBOM file with this command:
grype sbom:sbom.json -c config.yaml -o json

1st attempt:
16 GB RAM, 4 CPUs → exit code 137 (killed, likely OOM)

2nd attempt:
32 GB RAM, 8 CPUs → success

Here is the file.

BTW, the env var doesn’t work for me; the setting only takes effect when I put it inside the config file.
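
In case it helps anyone else, this is presumably the config-file equivalent, assuming the usual mapping of GRYPE_DEV_PROFILE to a dev.profile config key (the exact key names are an assumption on my part):

  # config.yaml
  dev:
    profile: mem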

More information:

The Grype result size is 1.8 GB.
The SBOM size is ~140 MB.

I can’t share the SBOM, but I wonder if I can filter the results at runtime (maybe keep only High/Critical severities, maybe remove duplicate CVEs).
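
As a possible interim workaround, the JSON report can be post-processed, for example with jq. This is only a sketch and assumes the usual layout of grype -o json, where matches live under .matches[] with a .vulnerability.severity field:

  # keep only High and Critical matches from an existing Grype JSON report
  jq '.matches |= map(select(.vulnerability.severity == "High" or .vulnerability.severity == "Critical"))' results.json > filtered.json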

Here is an example of the top packages by number of vulnerabilities (a sketch of how to derive this from the report follows the list):
3971 linux-modules-5.3.0-1017-aws
3971 linux-image-5.3.0-1017-aws
3971 linux-aws-5.3-headers-5.3.0-1017
3960 linux-modules-5.3.0-1019-aws
3960 linux-image-5.3.0-1019-aws
3960 linux-aws-5.3-headers-5.3.0-1019
3956 linux-modules-5.3.0-1028-aws
3956 linux-modules-5.3.0-1023-aws
3956 linux-image-5.3.0-1028-aws
3956 linux-image-5.3.0-1023-aws
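
For context, counts like the ones above can be derived from the JSON report, assuming the standard .matches[].artifact.name layout of grype -o json:

  # count matches per package name, largest first
  jq -r '.matches[].artifact.name' results.json | sort | uniq -c | sort -rn | head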

I’m afraid that even after this PR is merged - fix upstream match for linux-.*-headers-.* by barnuri · Pull Request #2320 · anchore/grype · GitHub -
the file will still be big, since we include the ignored matches.
So I have a few suggestions here:

  1. Add an option to filter out ignored matches from the output to reduce the file size (see the jq sketch after this list for a possible interim workaround).
  2. Don’t keep the ignored vulnerabilities in memory, since there can be thousands of results.
  3. Add support for more complex ignore rules, so we won’t need to merge changes into Grype (and each user can define their own ignores).
  4. Filter out vulnerabilities by severity (at runtime).
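
Along the lines of suggestion 1, a post-processing sketch that shrinks an already-written report (this assumes the ignored matches sit under a top-level ignoredMatches key in the grype -o json output, which may differ by version):

  # drop the ignored matches from an existing Grype JSON report
  jq 'del(.ignoredMatches)' results.json > trimmed.json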