hello!
I would like to know if there is a way to identify base layers given a .tar file (in general, and if this is planned to be developed in syft in the future).
thanks!
I want to make sure I understand the question.
It sounds like you have a .tar file that was made by docker save my-image > some-file.tar or a similar process? That is, you have a .tar file that represents an OCI image?
By base layer, I think you mean you have a Dockerfile like this:
FROM alpine:latest
RUN apk add ...
COPY . /app
...
And you want to distinguish between things that were added by FROM alpine:latest and things that were added after.
Have I understood the request correctly?
hi @willmurphy
correct
I’m trying to identify packages that were taken from the base layer
However, as a first phase, maybe just mark the base layers in the “layers” field of the SBOM.
We talked about this question at about minute 52 on https://www.youtube.com/live/9l-3UV9wjLk?t=3109s
thanks! @willmurphy
So I’m guessing Syft currently needs to scan twice to find which packages were added in a given layer, but I still don’t understand how we can identify which layers are the base layers.
I think the challenging thing here is that the “base layer” is not necessarily one layer. Let’s look at a specific example:
FROM redhat/ubi9-micro:9.5-source
COPY ./foo /foo
Build it and save it as a tar:
touch foo
docker build . -t localhost:297-rhel
docker save localhost:297-rhel > rhel-foo.tar
Listing the contents of the tar with tar -tf rhel-foo.tar prints:
blobs/
blobs/sha256/
blobs/sha256/015b865d4c23630e28c279fb5a36a539aa908bb48826ecf1e3f98f72d020e4cf
blobs/sha256/03412c3948d5b150166455b3406ca2d2b29ca0ed993acc2a7b0882161531c2bf
blobs/sha256/1297ea5d87fe99cc585f9390e38c51f5b3d2d349e9d5a08cc59978a8cb08c247
blobs/sha256/146a5b5b63de10bfe6fd4aa999a9ac4a5d903c04834249d08caf07d7b802422a
... snip ...
index.json
manifest.json
oci-layout
repositories
As you can see, there are lots of layers. But we only added one file! What’s going on? It turns out that the FROM line is just bringing in every layer from redhat/ubi9-micro:9.5-source, so only the last layer is ours. But how could I know that without the Dockerfile? All the layers in the OCI tar look the same.
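To see this without reading the blobs by hand, here is a minimal Python sketch that prints the layers in the order the image stacks them. It assumes the tar has a top-level manifest.json in the format docker save writes (the listing above shows one); list_layers is just an illustrative name, not anything from Syft.

```python
import json
import tarfile


def list_layers(tar_path):
    """Return the ordered layer paths recorded in a `docker save` archive."""
    with tarfile.open(tar_path) as archive:
        # manifest.json is a JSON array with one entry per saved image.
        manifest = json.load(archive.extractfile("manifest.json"))
    # Layers are listed bottom-up; nothing here marks where a base image ends.
    return [layer for image in manifest for layer in image["Layers"]]
```

Running list_layers("rhel-foo.tar") gives a flat list of blob paths. The last entry is the COPY layer from the example, but nothing in the manifest says so.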
So the answer to your question is: this hasn’t been built in Syft yet, because there’s no good way to know which layers the user considers part of their “base image” and which ones they don’t. The FROM statement can add any number of layers, the user can add any number of layers, and the OCI image itself doesn’t record which layer came from where.
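One possible workaround, if you already know which base image was used and can save it to its own tar, is to compare the two images' layer lists yourself. The sketch below is not something Syft does; it assumes both tars come from docker save and that the image config lists its layers under rootfs.diff_ids, as the OCI image config does. The function names are made up for illustration.

```python
import json
import tarfile


def read_diff_ids(tar_path):
    """Return the uncompressed layer digests (rootfs.diff_ids) of the first image in the tar."""
    with tarfile.open(tar_path) as archive:
        manifest = json.load(archive.extractfile("manifest.json"))
        # The manifest entry points at the image config blob inside the same tar.
        config = json.load(archive.extractfile(manifest[0]["Config"]))
    return config["rootfs"]["diff_ids"]


def count_base_layers(base_diff_ids, image_diff_ids):
    """Return how many leading layers come from the base, or 0 if the base does not match."""
    n = len(base_diff_ids)
    # The image is built on the base only if the base's layers are an exact prefix.
    if n <= len(image_diff_ids) and image_diff_ids[:n] == base_diff_ids:
        return n
    return 0
```

For example, count_base_layers(read_diff_ids("base.tar"), read_diff_ids("rhel-foo.tar")), where base.tar is a hypothetical save of the base image. A result of 0 only means this particular base does not match; a tag like latest may have moved since the image was built, which is part of why the base has to be supplied from outside the tar.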