Associating security metadata with multi-architecture container images

Written by Rob Best



Published on our Cloud Native Blog.

Jetstack have been working with a number of customers, building container assessment pipelines that enable them to understand the security profile of the images in use within their organisation. During these engagements, one thing we’ve had to be consistently mindful of is how we handle multi-architecture images.

Multi-arch images allow you to use one image reference to run the same container on different CPU architectures. This simplifies configuration but obscures the fact that a multi-arch image is simply a collection of distinct images, each of which can have its own set of components, vulnerabilities and licenses.

If you aren’t careful, you can end up associating vulnerability reports and other security-related information with the wrong platform.

This post explores the problem in more detail and suggests ways that you can appropriately assign metadata to multi-arch images.

How multi-arch images work

First, let’s see how multi-architecture images work under the hood.

Each individual container image in a registry is represented by a manifest, which is a JSON document that describes the configuration for running the container and its layers.

For instance, here’s the manifest for cert-manager-controller’s linux/amd64 image.

$ crane manifest quay.io/jetstack/cert-manager-controller:v1.9.1 --platform=linux/amd64
{
   "schemaVersion": 2,
   "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
   "config": {
      "mediaType": "application/vnd.docker.container.image.v1+json",
      "size": 2490,
      "digest": "sha256:8eaca4249b016e1e355957d357a39a0a8a837e1837054e8762fe7d1cd13051af"
   },
   "layers": [
      {
         "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
         "size": 804101,
         "digest": "sha256:b9f88661235d25835ef747dab426861d51c4e9923b92623d422d7ac58eb123e9"
      },
      {
         "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
         "size": 16000883,
         "digest": "sha256:5d106a2629a861d93b8b5111e6735594deead00d048e2215bd80cc4784ae0d9e"
      },
      {
         "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
         "size": 631,
         "digest": "sha256:3037f5ce7599ba6fb57a17ff4aefca8cbbeb2a2ecc8363b24fdedc1fca74ccb7"
      },
      {
         "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
         "size": 269,
         "digest": "sha256:02efe101fd3a9234df2c84bfa72a7b1b71babdbc9ab16f6df7fada1487433f20"
      }
   ]
}

A multi-arch image, on the other hand, is represented by a ‘manifest list’, a different type of JSON document that lists individual manifests and the platforms they support.

$ crane manifest quay.io/jetstack/cert-manager-controller:v1.9.1
{
  "schemaVersion": 2,
  "mediaType": "application/vnd.docker.distribution.manifest.list.v2+json",
  "manifests": [
    {
      "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
      "size": 1153,
      "digest": "sha256:81a5e25e2ecf63b96d6a0be28348d08a3055ea75793373109036977c24e34cf0",
      "platform": {
        "architecture": "amd64",
        "os": "linux"
      }
    },
    {
      "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
      "size": 1153,
      "digest": "sha256:573f7fb6fe3f32195d919b023bbf56634c22ecce5b606d5af3104ef41565b9bb",
      "platform": {
        "architecture": "arm",
        "os": "linux",
        "variant": "v7"
      }
    },
    {
      "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
      "size": 1153,
      "digest": "sha256:63feade2625bd65ce615f6459b5cddecd0d251c826746bf0ed1a63d0e869eec3",
      "platform": {
        "architecture": "arm64",
        "os": "linux",
        "variant": "v8"
      }
    },
    {
      "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
      "size": 1153,
      "digest": "sha256:cbca740b7747b942967f6e994627959cc2056166444d355c7af79e707819a76f",
      "platform": {
        "architecture": "ppc64le",
        "os": "linux"
      }
    },
    {
      "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
      "size": 1153,
      "digest": "sha256:e078a804d57c92e80fd7635b7eea0f80cffec1d207d7a3fe834830de953d288d",
      "platform": {
        "architecture": "s390x",
        "os": "linux"
      }
    }
  ]
}

Generally, friendly tags like quay.io/jetstack/cert-manager-controller:v1.9.1 are associated with a manifest list. When your Docker client pulls this tag, it uses the manifest list to select the right manifest for your current platform and then uses that manifest to work out which layers to pull.

This makes it possible to use the exact same reference across platforms, while still receiving the image specific to the platform you’re currently running on.
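To see this selection in action, you can resolve the same tag once per platform and compare the digests. The following is a minimal sketch, assuming crane is installed; platform_digests is a hypothetical helper name, not part of any tool.

```shell
# Print "<platform> <digest>" for each platform in a comma-separated list.
# platform_digests is a hypothetical helper; it assumes crane is on PATH.
platform_digests() {
  ref="$1"
  platforms="$2"

  # Split the comma-separated list into individual platforms
  for platform in $(printf '%s' "${platforms}" | tr ',' ' '); do
    # crane resolves the tag through the manifest list to the
    # platform-specific manifest and prints that manifest's digest
    digest=$(crane digest "${ref}" --platform="${platform}") || return 1
    printf '%s %s\n' "${platform}" "${digest}"
  done
}
```

Calling platform_digests quay.io/jetstack/cert-manager-controller:v1.9.1 linux/amd64,linux/arm64 should print one line per platform, each with a different digest corresponding to an entry in the manifest list above.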

Manifests in a manifest list are not the same

It’s important to reiterate that manifests in a manifest list are completely distinct from each other. Usually a manifest list contains platform-specific variants of the same piece of software, but nothing guarantees that.

It’s perfectly possible to construct a multi-arch image, for instance, where one manifest gives you a version of prom/prometheus and another gives you elasticsearch.

Even if the manifests in a manifest list are related, each platform-specific image can have its own set of dependencies, vulnerabilities and licenses.

You need to be aware of this when implementing supply chain security tooling. If you aren’t careful, your assessment of an image for one platform can be incorrectly applied to another.
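One way to guard against this in a pipeline is to check whether a reference points at a manifest list before attaching platform-specific metadata to it. The sketch below assumes crane is installed. The function names are hypothetical, and the media type check is a simple text match on the raw manifest rather than proper JSON parsing, so treat it as a heuristic.

```shell
# Succeeds if the reference points at a Docker manifest list or OCI image index.
# is_manifest_list is a hypothetical helper; it assumes crane is on PATH.
is_manifest_list() {
  # Heuristic: search the raw manifest for one of the two multi-arch media types
  crane manifest "$1" | grep -q -E \
    'application/vnd\.(docker\.distribution\.manifest\.list\.v2|oci\.image\.index\.v1)\+json'
}

# Fails with a message if the reference is a manifest list, so that
# platform-specific reports are only ever attached to a single manifest.
require_platform_manifest() {
  if is_manifest_list "$1"; then
    echo "$1 is a manifest list; resolve it to a platform-specific digest first" >&2
    return 1
  fi
}
```

A pipeline step could call require_platform_manifest on the image reference immediately before attaching a scan result, and stop if it fails.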

Example

Here’s a concrete example.

In general, when security scanning tools target multi-arch images they will do one of a few things:

  • Resolve the image to the platform the scanner is running on
  • Target the image provided to them by the Docker daemon
  • Default to the most common platform, linux/amd64
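For the first case, it helps to know which platform string the scanner's host corresponds to. The following sketch maps the values reported by uname to the os/architecture form used in manifest lists. The helper name to_oci_platform is hypothetical, and the mapping covers only the architectures that appear in this post.

```shell
# Map `uname -s` / `uname -m` style values to an OCI platform string.
# to_oci_platform is a hypothetical helper; the mapping is not exhaustive.
to_oci_platform() {
  # OCI platform strings use a lower-case OS name
  os=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')

  # Translate the kernel's machine name to the architecture (and variant)
  case "$2" in
    x86_64|amd64)  arch="amd64" ;;
    aarch64|arm64) arch="arm64" ;;
    armv7l)        arch="arm/v7" ;;
    ppc64le)       arch="ppc64le" ;;
    s390x)         arch="s390x" ;;
    *) echo "unrecognised machine type: $2" >&2; return 1 ;;
  esac

  printf '%s/%s\n' "${os}" "${arch}"
}
```

On the scanner host, to_oci_platform "$(uname -s)" "$(uname -m)" gives a rough indication of the platform a scan is likely to cover when no platform is specified.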

You need to account for this when you’re putting pipelines together. It’s easy to make the mistake of scanning an image for one platform and then using the results to green-light deployment to another.

For instance, take this example script, which we will assume runs on a linux/amd64 host.

# Build a multi-arch image
docker buildx build \
  --push \
  --platform=linux/amd64,linux/arm64 \
  --tag repository.example.com/myimage:tag \
  .

# Use trivy to scan it for vulnerabilities and produce a report in the format of
# an in-toto attestation
trivy i -f cosign-vuln -o vuln.json repository.example.com/myimage:tag

# Attach the vulnerability attestation to the image with cosign
COSIGN_EXPERIMENTAL=1 cosign attest --type vuln --predicate vuln.json repository.example.com/myimage:tag

Here we’re building a multi-arch image for the linux/amd64 and linux/arm64 platforms. However, when trivy scans the image, it resolves it to the platform it’s running on (linux/amd64), so the vulnerability report it produces strictly applies only to that platform.

When we attest the image, we attach the attestation to the manifest list. This is a problem because now we have a vulnerability report for one image associated, implicitly, with all the images in the list.

When someone on linux/arm64 runs cosign download attestation against repository.example.com/myimage:tag, they’re going to get the report for linux/amd64 back. This could be missing vulnerabilities that affect the linux/arm64 image, or include false positives that don’t apply to it.

Fixing the example

You could improve the example by scanning each platform individually.

REPOSITORY="repository.example.com/myimage"
TAG="tag"
PLATFORMS="linux/amd64,linux/arm64"

# Build a multi-arch image
docker buildx build \
  --push \
  --platform="${PLATFORMS}" \
  --tag "${REPOSITORY}:${TAG}" \
  .

# Iterate through each platform
for platform in ${PLATFORMS//,/ }; do
  # Resolve the digest of the platform-specific manifest
  digest=$(crane digest "${REPOSITORY}:${TAG}" --platform="${platform}")

  # Platforms contain slashes (e.g. linux/amd64), so replace them with
  # dashes to get a flat file name for the report
  report="${platform//\//-}-vuln.json"

  # Use trivy to scan the platform-specific image for vulnerabilities and
  # produce a report in the format of an in-toto attestation
  trivy i -f cosign-vuln -o "${report}" "${REPOSITORY}@${digest}"

  # Attach the vulnerability attestation to the manifest with cosign
  COSIGN_EXPERIMENTAL=1 cosign attest \
    --type vuln \
    --predicate "${report}" \
    "${REPOSITORY}@${digest}"
done

Now we have a vulnerability attestation attached to each individual manifest that applies specifically to that manifest.

You can download the attestation by resolving the image to your target platform and then running cosign download attestation.

DIGEST=$(crane digest repository.example.com/myimage:tag --platform=linux/arm64)

COSIGN_EXPERIMENTAL=1 cosign download attestation "repository.example.com/myimage@${DIGEST}"

This is less user friendly than it could be. Ideally cosign would support the option to drill down to your intended platform, without requiring crane.
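In the meantime, the two steps can be wrapped in a small helper. This is a minimal sketch, assuming both crane and cosign are installed. The function names image_repository and download_platform_attestation are hypothetical and not part of either tool.

```shell
# Strip any tag or digest from an image reference, leaving the repository.
image_repository() {
  repo="${1%%@*}"              # drop "@sha256:..." if present
  case "${repo##*/}" in
    *:*) repo="${repo%:*}" ;;  # drop ":tag", but keep any registry port
  esac
  printf '%s\n' "${repo}"
}

# Download the attestation attached to the manifest for a single platform.
download_platform_attestation() {
  ref="$1"       # e.g. repository.example.com/myimage:tag
  platform="$2"  # e.g. linux/arm64

  # Resolve the reference to the digest of the platform-specific manifest
  digest=$(crane digest "${ref}" --platform="${platform}") || return 1

  # Fetch the attestation from that manifest rather than the manifest list
  COSIGN_EXPERIMENTAL=1 cosign download attestation \
    "$(image_repository "${ref}")@${digest}"
}
```

For example, download_platform_attestation repository.example.com/myimage:tag linux/arm64 would fetch the report attached to the linux/arm64 manifest.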

Other kinds of metadata

Vulnerability reports are generally platform-specific. However, some metadata will apply equally to all images in a manifest list.

One example would be a SLSA provenance attestation. Provided the manifest list and the images were all generated from the same source by the same build command and builder, it would be acceptable to attach the provenance to the manifest list.
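Attaching provenance at the manifest list level might look like the following sketch. It assumes crane and cosign are installed, that the reference ends in a tag, and that a provenance predicate file has already been produced by your build system. The function name attest_index_provenance is hypothetical.

```shell
# Attach one provenance attestation to the manifest list itself.
# attest_index_provenance is a hypothetical helper name.
attest_index_provenance() {
  ref="$1"        # assumed to end in ":tag", e.g. repository.example.com/myimage:tag
  predicate="$2"  # path to a SLSA provenance predicate file

  # Without --platform, crane returns the digest of whatever the tag points
  # at, which for a multi-arch image is the manifest list
  index_digest=$(crane digest "${ref}") || return 1

  # Attest against the manifest list digest rather than the mutable tag
  COSIGN_EXPERIMENTAL=1 cosign attest \
    --type slsaprovenance \
    --predicate "${predicate}" \
    "${ref%:*}@${index_digest}"
}
```

Pinning the attestation to the manifest list digest means it keeps describing the same set of images even if the tag is later moved.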

This doesn’t, however, apply to Software Bills of Materials (SBOMs), which can vary between platform-specific versions of an image. The image for ko provides a good example of how to account for that.

If you retrieve the SBOM from the manifest list, the packages field only lists the child manifests.

$ cosign download sbom ghcr.io/google/ko | jq -r '.packages[] | {name: .name, versionInfo: .versionInfo}'
WARNING: Downloading SBOMs this way does not ensure its authenticity. If you want to ensure a tamper-proof SBOM, download it using 'cosign download attestation <image uri>' or verify its signature.
Found SBOM of media type: spdx+json
{
  "name": "sha256:d69330694894faf2662eedc96a1cc64e7c646d78998041d54233523b90803c42",
  "versionInfo": null
}
{
  "name": "sha256:79a14f9ae373dca89c3a5111a9a327ea19c575b010a7f74180ad813db0054b36",
  "versionInfo": "linux/amd64"
}
{
  "name": "sha256:f7c5d3efe24c70b97d7a526c2a33d98a57736168a0f4141d65589b945017c8ba",
  "versionInfo": "linux/arm/v5"
}
...

But the SBOM for a specific manifest lists the specific dependencies for that platform.

$ cosign download sbom ghcr.io/google/ko --platform=linux/amd64 | jq -r '.packages[] | {name: .name, versionInfo: .versionInfo}'
WARNING: Downloading SBOMs this way does not ensure its authenticity. If you want to ensure a tamper-proof SBOM, download it using 'cosign download attestation <image uri>' or verify its signature.
Found SBOM of media type: spdx+json
{
  "name": "sha256:79a14f9ae373dca89c3a5111a9a327ea19c575b010a7f74180ad813db0054b36",
  "versionInfo": null
}
{
  "name": "index.docker.io/library/golang@sha256:0afb9ba662931ec66efabb2a5edc1af559964dc8a3ce653f7936e1bf91b6b606",
  "versionInfo": "index.docker.io/library/golang:1.18"
}
{
  "name": "github.com/google/ko",
  "versionInfo": null
}
{
  "name": "cloud.google.com/go/compute",
  "versionInfo": "v1.7.0"
}
{
  "name": "github.com/Azure/azure-sdk-for-go",
  "versionInfo": "v66.0.0+incompatible"
}
{
  "name": "github.com/Azure/go-autorest/autorest",
  "versionInfo": "v0.11.28"
}
...

This is good because each SBOM is an accurate reflection of the object it’s attached to. There’s less chance of mistakenly associating a dependency on one platform with another.

The image for ko is built by ko itself, so if you use ko to build your Go containers you should get SBOMs structured in the same way.

Takeaways

Multi-arch images are implemented explicitly so that users don’t need to think too hard about the platforms they’re running on. This is user friendly but can get you into trouble in the context of security.

When building multi-arch images, ensure that any metadata associated with a manifest list applies equally to all the manifests it describes. Otherwise, metadata should probably be associated individually with platform-specific manifests.

When consuming multi-arch images, ensure that you’re assessing the correct images for the platforms you’re deploying to.

Get in touch

If you need help improving the overall security posture of your organisation’s container image pipelines, Jetstack can help. We have deep expertise and experience working with clients across all areas of software supply chain security.

Contact us directly to discuss how we can help.
