Scanning large SBOMs in batches with Grype

I’ve raised a feature request that I think is worth discussing:

For large SBOMs with lots of CVEs, we might consider a way to scan in batches and write intermediate results to disk, so that Grype doesn’t need a lot of memory.

What do you think?

Is there any indication of what is using the memory? You can run Grype in profiling mode using the GRYPE_DEV_PROFILE=mem env var, I think, and get a pprof file that will provide some insight into where allocations happen, etc… I don’t really see a lot in Grype that should inherently be using large amounts of memory if it’s reading an SBOM and not performing a Syft scan.
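For reference, a profiling run could look something like the following; treat the profile file name (mem.pprof here) as an assumption, since the exact path Grype writes may differ by version:

GRYPE_DEV_PROFILE=mem grype sbom:sbom.json -o json > results.json
go tool pprof -top mem.pprof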

I’ll try and send you the relevant files.
I saw this issue - Command terminated by signal 9 due to OOM (Out of Memory) · Issue #1509 · anchore/grype · GitHub
and it seems related (but it is closed).
Can you please re-open it? I think I’m having a similar issue when using Grype in a task.

I’ve tried to run Grype on an SBOM (which I can’t share) on an EC2 instance (with the profiling configuration). We scan the SBOM file with this command:
grype sbom:sbom.json -c config.yaml -o json

1st attempt:
16 GB RAM, 4 CPUs → exit code 137 (killed, likely by the OOM killer)

2nd attempt:
32 GB RAM, 8 CPUs → success

here is the file.

BTW, the env var doesn’t work for me; the profiling only takes effect when I put the setting inside the config file.

more information:
(image attached)

The Grype result size is 1.8 GB.
The SBOM size is ~140 MB.

I can’t share the SBOM, but I wonder if I can filter the results at runtime (maybe keep only High/Critical, maybe remove duplicate CVEs).

This is an example of the top packages by number of vulnerabilities:
3971 linux-modules-5.3.0-1017-aws
3971 linux-image-5.3.0-1017-aws
3971 linux-aws-5.3-headers-5.3.0-1017
3960 linux-modules-5.3.0-1019-aws
3960 linux-image-5.3.0-1019-aws
3960 linux-aws-5.3-headers-5.3.0-1019
3956 linux-modules-5.3.0-1028-aws
3956 linux-modules-5.3.0-1023-aws
3956 linux-image-5.3.0-1028-aws
3956 linux-image-5.3.0-1023-aws

I’m afraid that even after this PR is merged - fix upstream match for linux-.*-headers-.* by barnuri · Pull Request #2320 · anchore/grype · GitHub -

the file will still be big, since we include the ignored matches.
So I have a few suggestions here:

  1. Add an option to filter out ignored matches, to reduce the file size.
  2. Don’t keep the ignored vulnerabilities in memory, since there can be thousands of results.
  3. Add support for more complex ignore rules, so we won’t need to merge changes into Grype (and each user can define their own ignores).
  4. Filter vulnerabilities by severity at runtime (a possible post-processing workaround is sketched below).
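As a stopgap until something like that exists in Grype itself, a small post-processing step over the JSON report could approximate suggestions 1 and 4. This is only a rough sketch based on my reading of the report shape - the field names used here (matches, ignoredMatches, vulnerability.severity) are assumptions, not confirmed Grype schema:

package main

import (
	"encoding/json"
	"os"
)

// Reads a Grype JSON report on stdin, drops ignored matches, keeps only
// High/Critical matches, and writes the smaller report to stdout.
func main() {
	var report map[string]json.RawMessage
	if err := json.NewDecoder(os.Stdin).Decode(&report); err != nil {
		panic(err)
	}

	// drop the ignored matches entirely
	delete(report, "ignoredMatches")

	// keep the raw JSON of each match, decoding only enough to read severity
	var matches []json.RawMessage
	if err := json.Unmarshal(report["matches"], &matches); err != nil {
		panic(err)
	}
	keep := make([]json.RawMessage, 0, len(matches))
	for _, m := range matches {
		var probe struct {
			Vulnerability struct {
				Severity string `json:"severity"`
			} `json:"vulnerability"`
		}
		if err := json.Unmarshal(m, &probe); err != nil {
			panic(err)
		}
		if probe.Vulnerability.Severity == "High" || probe.Vulnerability.Severity == "Critical" {
			keep = append(keep, m)
		}
	}

	filtered, err := json.Marshal(keep)
	if err != nil {
		panic(err)
	}
	report["matches"] = filtered

	if err := json.NewEncoder(os.Stdout).Encode(report); err != nil {
		panic(err)
	}
}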

As discussed in the community chat,
I’ve added the file here - scan a large sbom in batches · Issue #2357 · anchore/grype · GitHub

@wagoodman @kzantow

Thanks @TimBrown1611!

A quick look at this shows that… the Grype scan uses approximately 1 GB and the JSON writing uses over 3 GB!

I don’t think batching the results will help here, since they all need to be aggregated into the final JSON report.

… and if we look, it’s using almost 1 GB just for indenting?

My first suggestion is to just disable pretty JSON somewhere… I thought compact JSON was the default, but alas it is not. It would be interesting to see whether disabling the indent helps noticeably by updating this to:

enc.SetIndent("", "")
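For context, here’s a minimal standalone illustration of the standard-library behavior in question (not Grype’s actual presenter code): a json.Encoder writes compact output unless SetIndent is called with a non-empty indent, and pretty-printing both inflates the report and adds work per encoded token.

package main

import (
	"encoding/json"
	"os"
)

func main() {
	// a toy stand-in for a report document
	doc := map[string]any{
		"matches": []map[string]string{
			{"id": "CVE-2024-0001", "severity": "High"},
		},
	}

	enc := json.NewEncoder(os.Stdout)
	// SetIndent("", "") disables indentation, so the encoder emits compact
	// JSON; SetIndent("", " ") would pretty-print instead, growing the output
	// and re-indenting every token.
	enc.SetIndent("", "")
	if err := enc.Encode(doc); err != nil {
		panic(err)
	}
}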

Thanks for your response @kzantow!
Is it configurable?

Moreover, please notice the difference between the SBOM and the Grype result sizes; at a quick look, the output contains lots of duplicate CVEs which in my opinion could be removed.

It is not currently configurable. We made this the default and configurable in Syft, but it looks like not in Grype. Someone will need to port that change over to Grype.

There are lots of things we can do to improve memory usage and other efficiencies like deduplicating results, but it’s important to understand where the problems are first so we can spend our time adjusting the tools as appropriate.

OK, I can try to open a PR in Grype if similar behavior is already implemented in Syft :slight_smile:

FYI I’m referring to the “pretty” option: syft/cmd/syft/internal/options/format.go at main · anchore/syft · GitHub
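To give a rough idea of the shape of that port, here is a sketch only - the presenter type, field names, and package layout below are assumptions for illustration, not Grype’s actual code. The idea is the same as Syft’s option: a pretty flag that gates the call to SetIndent.

// Hypothetical sketch of a "pretty" option for a JSON presenter.
package presenter

import (
	"encoding/json"
	"io"
)

type Presenter struct {
	document any  // the report document to serialize
	pretty   bool // when true, emit indented ("pretty") JSON
}

func (p Presenter) Present(output io.Writer) error {
	enc := json.NewEncoder(output)
	if p.pretty {
		enc.SetIndent("", " ")
	}
	// when SetIndent is not called, the encoder writes compact JSON
	return enc.Encode(p.document)
}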

Hi!
Please take a look at -

It would be really helpful.

Did you run Grype with this change and profiling enabled to get a pprof file, to see if it actually uses less memory?

I couldn’t check the memory, but I can say the file size was reduced by 30%.

What were the original and new file sizes? Was this 1.8 GB → 1.26 GB? That seems good, though it doesn’t align directly with the memory usage. Getting the memory profile will help us understand whether that change did any good.

Hi @kzantow!

I added 2 files to the GitHub issue, one with the fix and one without.
The final output size was reduced by 15%.

Please let me know if any other details are needed :slight_smile:

It looks like it was a red herring – it does not appear to have helped overall memory usage:


If I’m reading this right: json.Present is still using 4 GB.

But printing JSON in compact form is still a good change to reduce file sizes; I’ll leave more feedback on the PR.

OK, sounds good.
However, I’m still facing the same issues. Unfortunately I can’t share the Grype results file or the SBOM, but if you scan any default AWS machine you will see the duplicate CVEs that make the results very large (and, in my opinion, without any additional value).
So I would like to think about other solutions to reduce the memory :slight_smile: