For a long time Syft was purely a static analysis tool and did not fetch data from the network to determine package information to be included. This has been changing over time, and as of today Syft is able to search for licenses from remote sources for JavaScript, Golang, and Maven along with the ability to resolve additional required information from these online sources. Because of this, we all thought it was a good time to circle back to the original idea of having a multi-level configuration to enable all features together with a single flag.
With our CLI tools, we generally try to make flags easy to comprehend, with as little ambiguity as possible, and keep the terms short, only deviating from these tenets when necessary.
I created a PR to add a --use-network flag, but after some reflection, and after starting to port the same behavior to Grype, I realized that scoping "network" strictly to cataloging is a bit wrong, since there are other network features such as checking for application updates or downloading a Grype database. So, should these functions also fall under the same flag? And what about customizing the Syft scan based on what a user is most interested in? Say the user scans a lot of Java and JavaScript, but only cares about enabling Maven lookups to improve the quality of the Java results and doesn’t want the performance penalty of looking up remote licenses for the JavaScript packages.
After some team discussion, we narrowed the choices for a boolean flag down to two: --network, which would also apply to other network features, or --remote-enrichment, which would apply only to the network features that search remote sources to enrich the package data with information that is not otherwise present, such as licenses.
But neither of these boolean-only options allows for specifying individual elements. Those would still have to be specified with environment variables or config files, using things like SYFT_GOLANG_SEARCH_REMOTE_LICENSES=false to disable a single specific thing a user might want to disable.
Another approach is to follow, to some degree, the pattern we established with the catalogers: allowing multiple --network flags to be specified, with modifier directives in case a user wants to disable some of these. For example:
syft --network
could mean: enable all network features, or:
syft --network=all,-golang
could mean: enable all network features except golang. This has some drawbacks: the names don’t necessarily match cataloger names and would only loosely, in an ad hoc way, match the names we’ve used for the cataloger categories of the same types, and we might want to support aliases, like golang and go.
And a final example:
grype --network=none,db
could mean: do no network operations except those related to the database.
A final caveat: it looks like the current flags system would allow us to give a []string flag a default value, such as all, when no additional information is provided. In other words, syft --network could be made to enable all network features. HOWEVER, this does not let values be specified without the =, so syft --network all does NOT work, whereas syft --network=all DOES work.
So the questions are:
- is allowing multiple too complicated?
- do we care enough about specifying individual network features, or disabling all network activity, to require an =, unlike most of our other flags?
- would requiring a second term be okay? e.g. --network all, --network on, or --network enable
- should the flag not apply to general network operations, but only data enrichment?
- other thoughts?