Slim down product images #816

Open
@dervoeti

Description

We likely have room to slim down our product images. Doing so would reduce build time, image size, and attack surface. For example, the Hive Dockerfile has a comment about Hadoop:

# TODO: Do we really need all of Hadoop in here?

Now that we build from source, it might be worth digging into the build processes to:
a) Limit which components we build. It doesn't make sense to build anything that is never copied into the final image.
b) Re-validate whether all the components that are copied into the final image are really needed in production. With Hive, for example, we switched the build to only build the metastore, which significantly reduced the attack surface. Some products consist of multiple components and plugins, not all of which may be needed to run the platform.
c) While we're at it, try to generate an SBOM for each component that is copied into the final image (next to the component itself). For most components this should already be the case; see #814.
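
As a starting point for (a), here is a minimal sketch of how one might list artifacts that are built but never end up in the final image. It assumes the build output and the unpacked final image are both available as local directories, and it matches artifacts by file name only. The function name `unused_artifacts`, the `*.jar` pattern, and the directory layout are hypothetical and not taken from our build tooling.

```python
from pathlib import Path


def unused_artifacts(build_dir: Path, image_root: Path, pattern: str = "*.jar") -> list[str]:
    """Return names of artifacts found under build_dir but nowhere under image_root.

    Assumption: comparing file names is good enough for a first pass.
    Renamed or repackaged artifacts would show up as false positives.
    """
    # Names of all matching files that made it into the final image.
    shipped = {p.name for p in image_root.rglob(pattern) if p.is_file()}
    # Names of all matching files produced by the build.
    built = {p.name for p in build_dir.rglob(pattern) if p.is_file()}
    # Anything built but not shipped is a candidate for dropping from the build.
    return sorted(built - shipped)
```

Calling `unused_artifacts(Path("build-output"), Path("image-rootfs"))` with two such directories would return a sorted list of candidate file names. The result is only a hint for the documentation asked for below: each candidate still needs a manual check before the corresponding module is removed from the build.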

We want to focus on the products that are most affected by vulnerabilities right now:

  • Trino
  • Hive
  • HBase

Acceptance criteria:

  • Document what could be removed and the impacts of the removal
  • Document what can't be removed and why it can't be removed
