Request for discussion: logging levels and when to use them

Syft, Grype, Grant, and other tools have several log levels available: error, warning, info, debug, and trace. The goal of this discussion is to help everyone (contributors, maintainers, people reading logs) know what is meant by each logging level and when to use them.

Also, when we’ve gotten feedback about logs, it’s usually that the logs are too noisy. There are also lots of places in the code where a given function gets called thousands of times (e.g. once per file in the image, or once per file matching a specific glob in the image), and it’s especially important not to spam logs in those hot paths.
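One way to keep a hot path quiet is to count events inside the loop and log a single summary afterwards. Here is a minimal sketch of that idea; the `skipSummary` type, the reason strings, and the file names are all hypothetical and are not taken from Syft or Grype.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// skipSummary (hypothetical) tallies skipped files by reason so that a hot
// loop can emit one log line at the end instead of one line per file.
type skipSummary struct {
	counts map[string]int // reason -> number of files skipped for that reason
}

// add records one skipped file for the given reason.
func (s *skipSummary) add(reason string) {
	if s.counts == nil {
		s.counts = make(map[string]int)
	}
	s.counts[reason]++
}

// String renders the tally with reasons in sorted order, so the output is stable.
func (s *skipSummary) String() string {
	reasons := make([]string, 0, len(s.counts))
	total := 0
	for r, n := range s.counts {
		reasons = append(reasons, r)
		total += n
	}
	sort.Strings(reasons)
	parts := make([]string, len(reasons))
	for i, r := range reasons {
		parts[i] = fmt.Sprintf("%s: %d", r, s.counts[r])
	}
	return fmt.Sprintf("skipped %d files (%s)", total, strings.Join(parts, ", "))
}

func main() {
	files := []string{"a.txt", "b.bin", "c.bin"} // stand-in for files in an image
	var summary skipSummary
	for _, f := range files {
		if strings.HasSuffix(f, ".bin") {
			summary.add("binary") // no log call inside the loop
			continue
		}
	}
	// One summary line after the loop; in a real tool this would go to the logger.
	fmt.Println(summary.String())
}
```

The per-file detail is lost at the default verbosity, which is the point; it could still be emitted at trace level for anyone who needs it.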

So, to get to the point: what do the log levels mean?

  1. error - the requested action cannot be completed. For example, you asked Syft to scan a directory but there is no directory at that path. Usually the program should end.
  2. warn - the operation can continue, but the results will likely be wrong. For example, you ran Grype on an image, but Grype doesn’t recognize the distro, so OS packages can’t be compared to the distro vendor’s vulnerability data, and Grype will probably miss things.
  3. info - useful information that doesn’t affect correctness.
  4. debug - useful for debugging - this would be helpful to someone trying to figure out why some data was / wasn’t included in the SBOM, like, “there’s no Group ID in these pom properties, so it’s time to start looking in the manifest file.”
  5. trace - trace the execution of the program, for even finer grained debugging.

What do you all think? The above is meant to describe an ideal state, rather than what’s done today. (Fun fact, as of this writing there are only 2 log.Info calls in Syft.)

Here’s an interesting issue about logging that people might read as part of the discussion: “Display warnings even when `-v` is not passed and no tty is present” (anchore/grype issue #2180).

I think Grype makes a lot of log.Info calls, while Syft doesn’t, which is interesting.

Some initial thoughts:

  • error is meant to capture truly fatal conditions. Note that errors returned from failing functions all the way up to main will probably be logged there at the error level, so low-level functions don’t need to use the error level regularly.
  • we shouldn’t warn unless there is an action the user can take to correct what the message refers to