Improvements to scanning whole machine

This thread is for discussing ways of making Syft better at scanning file systems that represent a whole machine, not just a source repo or a container image. (For example, SSHing into a VM and running syft /.)

This has come up a few times, for example:

I think there are a few aspects to consider here:

  1. Old assumptions: when Syft scans an image, it looks for evidence that packages have been installed. When it scans a directory, it looks for evidence that a dependency on a package has been declared. When scanning a VM, however, it probably makes sense to look for installed packages instead of, or in addition to, declared dependencies.
  2. Performance: building an in-memory index of a small Docker container is fine, but indexing a whole VM file system might need a prohibitive amount of memory. Also, right now the entire SBOM is held in memory at once, which runs into the same limit.
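
To make the two points above concrete, here is a rough sketch of what a streaming, batch-oriented walk could look like. It is not how Syft works today and uses none of Syft's APIs. The names `classify`, `iter_evidence`, `batched`, `SKIP_DIRS`, and `DECLARED_NAMES` are all made up for illustration, and the matching rules are toy assumptions rather than real cataloger logic. The idea is only that each file is classified as "installed" or "declared" evidence as it is visited, and results are handed off in bounded batches, so neither a full file index nor the full result set has to sit in memory.

```python
import os
from itertools import islice
from typing import Iterable, Iterator, List, Optional, Tuple, TypeVar

T = TypeVar("T")

# Assumption: pseudo-filesystems that are not worth scanning on a Linux host.
SKIP_DIRS = {"/proc", "/sys", "/dev"}

# Assumption: a toy set of file names that declare dependencies.
DECLARED_NAMES = {"requirements.txt", "go.mod", "package-lock.json"}


def classify(path: str) -> Optional[str]:
    """Return "declared", "installed", or None for a path (toy rules only)."""
    parent, name = os.path.split(path)
    if name in DECLARED_NAMES:
        return "declared"
    # An installed Python distribution keeps METADATA inside a *.dist-info directory.
    if name == "METADATA" and parent.endswith(".dist-info"):
        return "installed"
    return None


def iter_evidence(root: str) -> Iterator[Tuple[str, str]]:
    """Lazily yield (kind, path) pairs; nothing is indexed up front."""
    for dirpath, dirnames, filenames in os.walk(root, followlinks=False):
        # Prune in place so os.walk never descends into skipped directories.
        dirnames[:] = [d for d in dirnames
                       if os.path.join(dirpath, d) not in SKIP_DIRS]
        for name in filenames:
            path = os.path.join(dirpath, name)
            kind = classify(path)
            if kind is not None:
                yield kind, path


def batched(items: Iterable[T], size: int) -> Iterator[List[T]]:
    """Group a lazy stream into lists of at most `size` items."""
    it = iter(items)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch
```

Something like `for batch in batched(iter_evidence("/"), 500): ...` would then process and write out one batch at a time, so peak memory depends on the batch size rather than on the size of the machine being scanned.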

This thread exists to discuss the whole question of how to make Syft and Grype work as well when pointed at a computer’s whole filesystem as they do when pointed at an image or source repo, rather than discussing small aspects of the performance or config in individual issues.


This thread is also related to improving VM scanning: Grype scans in batches large SBOMs - #5 by TimBrown1611 🙂
