@stealthybox stealthybox commented Apr 22, 2025

This allows users to pass emitStringData: true to the secretGenerator, alongside the type option.
When enabled, valid UTF-8 strings are output as plain text in stringData, while non-UTF-8 strings and any binary data fall back to the default behavior of being base64-encoded in data.

This is very similar to the default kyaml behavior of loading values into a ConfigMap's data and binaryData fields.
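
As a rough illustration of the split described above, here is a minimal Go sketch. It is not the code from this PR: `splitSecretValues` is a hypothetical helper, and it assumes the check is a plain UTF-8 validity test, with valid strings going to `stringData` and everything else being base64-encoded into `data`.

```go
package main

import (
	"encoding/base64"
	"fmt"
	"unicode/utf8"
)

// splitSecretValues is a hypothetical helper (not kustomize's implementation).
// Valid UTF-8 values are kept as plain text for stringData; any other value
// falls back to being base64-encoded for data.
func splitSecretValues(in map[string]string) (stringData, data map[string]string) {
	stringData = map[string]string{}
	data = map[string]string{}
	for k, v := range in {
		if utf8.ValidString(v) {
			stringData[k] = v
		} else {
			data[k] = base64.StdEncoding.EncodeToString([]byte(v))
		}
	}
	return stringData, data
}

func main() {
	sd, d := splitSecretValues(map[string]string{
		"username": "admin",        // valid UTF-8: stays readable
		"blob":     "\xff\xfe\x00", // not valid UTF-8: base64 fallback
	})
	fmt.Println(sd) // map[username:admin]
	fmt.Println(d)  // map[blob://4A]
}
```

Under this assumption, a single generated Secret can carry both fields at once, which is what makes the merge cases discussed later in the thread relevant.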

This feature provides a general UX improvement for people using kustomize interactively, and also allows Flux users to template into generated secrets.
(Flux users already can and do template into ConfigMaps, so we want to give people more secure mechanisms in kustomize-controller)

  • test: add secretGenerator testcases
  • feat: support emitStringData in secretGenerator
  • test: test emitStringData in secretGenerator

resolves #5142 #1444 #1261 #793

@k8s-ci-robot
Contributor

This PR has multiple commits, and the default merge method is: merge.
You can request commits to be squashed using the label: tide/merge-method-squash

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes label (indicates the PR's author has signed the CNCF CLA) Apr 22, 2025
@k8s-ci-robot k8s-ci-robot added the size/XXL label (denotes a PR that changes 1000+ lines, ignoring generated files) Apr 22, 2025
@stealthybox stealthybox force-pushed the stringdata-secret-gen branch from 14da6ae to 905d18d April 22, 2025 06:24
@stealthybox stealthybox changed the title from "Support outputting stringData from secretGenerator" to "feat: Support outputting stringData from secretGenerator" Apr 23, 2025
@koba1t
Member

koba1t commented Apr 30, 2025

/assign

@matheuscscp matheuscscp left a comment

LGTM 🚀

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: matheuscscp, stealthybox
Once this PR has been reviewed and has the lgtm label, please ask for approval from koba1t. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@stealthybox stealthybox force-pushed the stringdata-secret-gen branch from 905d18d to 9e14bdb June 9, 2025 19:09
@stealthybox
Author

I've just done a fresh rebase.
@koba1t, is there any way I can help out in reviewing this patch?
Thanks :)

@koba1t
Member

koba1t commented Jun 10, 2025

@stealthybox
Sorry for the late review.

  • The field name stringData: true is associated with the field name of a secret resource, but is it possible to change it to a more descriptive field name, such as noEncode: true?
  • As a test scenario, could you verify that when adding a secondary secretGenerator with stringData: false to a secretGenerator configured with stringData: true using the behavior: merge option, the resulting data is properly base64 encoded?

@matheuscscp

How about outputStringData?

@koba1t
Member

koba1t commented Jun 11, 2025

@matheuscscp
What I'm trying to say is that field names should be clear enough that users can immediately understand how they'll behave when seeing them in this PR.
The name stringData likely comes from the existence of a stringData field in secret resources, but it doesn't actually explain the behavior in kustomize.

For example, in the case of stringData, it's not clear whether this refers to string input or what exactly is being stringified, or in the case of outputStringData, it's not clear whether this controls the output destination or something else.

@stealthybox stealthybox force-pushed the stringdata-secret-gen branch from 9e14bdb to 2e90352 July 24, 2025 17:56
@stealthybox stealthybox force-pushed the stringdata-secret-gen branch 2 times, most recently from bab4e87 to 67fc288 July 24, 2025 18:03
@stealthybox
Author

stealthybox commented Jul 24, 2025

could you verify that when adding a secondary secretGenerator with stringData: false to a secretGenerator configured with stringData: true using the behavior: merge option, the resulting data is properly base64 encoded?

@koba1t, thanks for pointing this out.
I've added tests for the following cases in the commit "test: field overrides for configMapGenerator + secretGenerator" (67fc288):

    1. merge encoded data fields onto a generated secret with stringData
    2. override encoded data fields with stringData for secretGenerator
      producing a Secret that successfully uses the new value in the Kubernetes API
    3. override stringData fields with encoded data for secretGenerator
      producing a Secret that silently fails to use the new value in the Kubernetes API
    4. override encoded binaryData fields with data for configMapGenerator
      producing an invalid ConfigMap
    5. override data fields with encoded binaryData for configMapGenerator
      producing an invalid ConfigMap

The four cases where a field is overridden using a different encoding output duplicate fields.
This is existing, undesirable configMapGenerator behavior.
With this PR, the same buggy behavior carries over to secretGenerator.
I can fix both in a follow-up PR, as discussed privately in our Slack DM.

Here is an example of what that fix could look like:
stealthybox@e95e854
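
To illustrate why the two secretGenerator override cases above behave differently, here is a simplified Go sketch. It is not the linked fix and not Kubernetes source code: `mergeStringData` is a hypothetical model of the write-time behavior quoted later in this review, where all stringData keys are merged into data and overwrite existing values.

```go
package main

import "fmt"

// mergeStringData is a hypothetical, simplified model of what happens when a
// Secret is written: every stringData entry is copied into data, overwriting
// any value already stored under the same key.
func mergeStringData(data map[string][]byte, stringData map[string]string) map[string][]byte {
	merged := make(map[string][]byte, len(data)+len(stringData))
	for k, v := range data {
		merged[k] = v
	}
	for k, v := range stringData {
		merged[k] = []byte(v) // stringData always wins on a duplicate key
	}
	return merged
}

func main() {
	// A generated Secret that ends up with the same key in both fields:
	// the override landed in data, the older value stayed in stringData.
	stored := mergeStringData(
		map[string][]byte{"password": []byte("new-from-data")},
		map[string]string{"password": "old-from-stringData"},
	)
	fmt.Println(string(stored["password"])) // old-from-stringData
}
```

In this model, an override that lands in stringData takes effect, while an override that lands in data is silently shadowed by the older stringData value, which matches the two outcomes listed above.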

@stealthybox
Author

stealthybox commented Jul 24, 2025

Regarding the field name, I understand the concern of stringData as a bool not fully expressing the behavior.
I would hope the doc-string helps clarify this beyond just the name.

In the most recent issue, an end-user requests a stringData bool in the generatorOptions (#5142 (comment)).
@seh and @annasong20 confirm that the boolean field stringData should be added to SecretArgs and exposed directly on secretGenerator (#5142 (comment)).

A separate end-user suggests the boolean field should be called useStringData. #5142 (comment)

The title of unrelated issue (#1261) is "Support StringData option in SecretArgs".

Considering end-user, contributor, and maintainer responses, stringData seems to be the agreed upon field name that is most natural. I think the word "stringData" must at least be contained within the field name.

noEncode implies that there isn't a fallback encoding behavior when there is.
I don't think noEncode works.

I'd also challenge the claim that stringData doesn't make sense.
Most of the fields on generator options across the project are passthrough fields that end up in the output:

  • labels
  • annotations
  • immutable
  • type (for Secrets only -- defaults to "Opaque" when not supplied)

The new field isn't a passthrough value field -- it changes generator behavior. disableNameSuffixHash is a similar behavioral field that describes the generated output implicitly.
That's why secretGenerator[].stringData = true in my head automatically talks about the output of the generator.

in the case of stringData, it's not clear whether this refers to string input or what exactly is being stringified

This is fair. There are a lot of fields when you consider input fields like env, envs, files, and literals mixing with output fields like type and annotations and behavioral fields like behavior and disableNameSuffixHash.
Disambiguation would need to come from the doc-string and examples.
I'm happy to add examples to:

in the case of outputStringData, it's not clear whether this controls the output destination or something else.

I disagree that outputStringData is unclear. There is only one output of a generator.

I think these are decent options for field names:

  • stringData
  • outputStringData
  • generateStringData
  • setStringData
  • preferStringDataOutput
  • disableBase64ForStringData

The default behavior of a configMapGenerator already includes heuristic auto-encoding behavior for binaryData, and that doesn't show up in any field names. It's just the way a kustomize generator works.

Thanks for considering these points.
I don't want to be overly pedantic, and I know the behavior is complex.
Naming this field is tricky.
I think examples and the doc-strings on the field name should help people understand what it does exactly.
I don't think we should give this field any name that inaccurately implies that it performs or prevents behaviors that it does not. Because the behavior is complex, the name either needs to be somewhat ambiguous, or very detailed and lengthy.

Do you like stringData or any of the other names I suggested?

@matheuscscp

Yeah, this field has to explain itself; it has to have stringData in its name somehow. Anything else would be garbage for explaining what it does.

@stealthybox
Author

Unrelated to field names, here's one way to fix the existing broken configMapGenerator behavior:
stealthybox@e95e854
I would send this in a followup PR.

@seh
Contributor

seh commented Jul 24, 2025

I think "stringData" would be a deceiving name for this new field. My suggestion today is "emitStringData".

@stealthybox stealthybox force-pushed the stringdata-secret-gen branch from 67fc288 to c0b3af1 July 24, 2025 20:23
@stealthybox
Author

stealthybox commented Jul 24, 2025

I've proactively renamed the field to emitStringData in the types, docs, and test names.
Tests are passing.
Now that the field name is disambiguated from the stringData fields in the test output and docstrings, it's easy to rename it to whatever we decide 👍.

@stealthybox stealthybox changed the title from "feat: Support outputting stringData from secretGenerator" to "feat: emitStringData boolean for secretGenerator" Jul 24, 2025
@stealthybox
Author

@koba1t and I considered in our Slack DMs whether a string value field could work:

  field: "data" # "stringData" | "binaryData"

This could be a generic way to solve this problem for both ConfigMap and Secret generators, but it also creates more error states and is harder to grep for in existing code + discover in auto-complete.

I prefer the emitStringData boolean field.
If we want support for explicit encoding in ConfigMap generator, I propose we do so by adding emitBinaryData as a boolean in another PR.

@stealthybox
Author

Hi friends, if there's no further contest on the field name, I'd love to get this reviewed for a merge.
I'm happy to write some docs examples.

I want to draft a change for the merge behavior bugfix in stealthybox@e95e854 .

After these changes make their way into kustomize, kubectl needs to update kustomize before we can pull this feature into Flux.
I'd like to be able to do that before KubeCon NA and the Flux 2.7 release.

cc @koba1t @seh

Contributor

@seh seh left a comment

I have not reviewed the secrets_test.go file yet.

Comment on lines +31 to +35
// A Secret's `stringData` field is similar to ConfigMap's `data` field.
// `stringData` allows specifying non-binary, UTF-8 secret data in string form.
// It is provided as a write-only input field for convenience.
// All keys and values are merged into the data field on write, overwriting any
// existing values. The stringData field is never output when reading from the API.

Nice. Thank you for this.

	}
} else {
	if err = rn.LoadMapIntoSecretData(m); err != nil {
		return nil, fmt.Errorf("Failed to load map into Secret data: %w", err)

Likewise with the capitalization.

return nil, err
if args.EmitStringData {
if err = rn.LoadMapIntoSecretStringData(m); err != nil {
return nil, fmt.Errorf("Failed to load map into Secret stringData: %w", err)

I'm not sure whether kustomize eschews the general advice for Go programs to not capitalize the first word in an error message, assuming that this message might wind up serving as a suffix for more prefixes to be added onto it higher up in the call stack. If this capitalization isn't us playing along with the rest of the code base, consider this alternative:

Suggested change
- return nil, fmt.Errorf("Failed to load map into Secret stringData: %w", err)
+ return nil, fmt.Errorf(`loading map into Secret "stringData" field: %w`, err)

The message adopts the advice from Preslav Rachev's essay Go's Error Handling Is a Form of Storytelling.

`,
},
},
"construct secret from a binary file and fallback to data from stringData": {

Suggested change
- "construct secret from a binary file and fallback to data from stringData": {
+ "construct secret from a binary file and fall back to data from stringData": {

}

func (rn *RNode) LoadMapIntoSecretStringData(m map[string]string) error {
	for _, k := range SortedMapKeys(m) {

What is the significance of walking the keys in lexicographical order? Is that to avoid any nondeterminism? Since the key names must be unique, they can't conflict with one another in the "m" map, but perhaps we're wary of what the SetField function will do against the RNode's existing fields. I'm still not sure if that's a problem, though.

If you walked the map by its keys and values together, you could avoid the index expression on the next line.
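
For context on the determinism question, here is a minimal Go sketch of the trade-off. Go does not guarantee map iteration order, so ranging over keys and values together is shorter but can visit fields in a different order on each run, while sorting the keys first gives a stable order. The helpers `sortedKeys` and `fieldsInOrder` are hypothetical stand-ins, and whether stable ordering is the actual motivation here is not confirmed in this thread.

```go
package main

import (
	"fmt"
	"sort"
)

// sortedKeys is a hypothetical stand-in for a helper like SortedMapKeys:
// it returns the map's keys in lexicographic order.
func sortedKeys(m map[string]string) []string {
	keys := make([]string, 0, len(m))
	for k := range m {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	return keys
}

// fieldsInOrder renders "key=value" pairs in a stable order, at the cost of
// one index expression per key.
func fieldsInOrder(m map[string]string) []string {
	out := make([]string, 0, len(m))
	for _, k := range sortedKeys(m) {
		out = append(out, k+"="+m[k])
	}
	return out
}

func main() {
	m := map[string]string{"year": "2025", "crisis": "true", "fruit": "apple"}
	fmt.Println(fieldsInOrder(m)) // [crisis=true fruit=apple year=2025]
}
```

If the fields are appended to an output document in visit order, the sorted walk keeps generated manifests byte-for-byte reproducible across runs.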

return nil
}

// valid UTF-8 strings can be stringData, but fallback to Base64 for non UTF-8

Note that the documentation capitalizes "Base64" inconsistently, often writing it as all-lowercase.

Suggested change
- // valid UTF-8 strings can be stringData, but fallback to Base64 for non UTF-8
+ // makeSecretStringValueRNode creates a scalar node for one of two potential fields.
+ // If the supplied node's value is a valid UTF-8 string, it uses the "stringData" field, but
+ // otherwise falls back to placing the Base64-encoded value in the "data" field for
+ // non-UTF-8 strings.

if err != nil {
	return nil
}
result := map[string]string{}

We'd like to know how many fields are present ahead of time. We should not call (*RNode).Fields, because (*RNode).VisitFields calls it immediately, and Fields allocates.

Should we estimate a likely count here, so that we could write the following?

Suggested change
- result := map[string]string{}
+ result := make(map[string]string, 10) // Estimate field count

Perhaps the cost of starting with a too-small map and needing it to grow later isn't a real problem.

I see that this is mimicking the longstanding (*RNode).GetDataMap method, so if it's not a problem there, it's probably not a problem here either.


func (rn *RNode) SetStringDataMap(m map[string]string) {
	if rn == nil {
		log.Fatal("cannot set stringData map on nil Rnode")

If we change this here, we should change it in (*RNode).SetDataMap and (*RNode).SetBinaryDataMap consistently.

Suggested change
- log.Fatal("cannot set stringData map on nil Rnode")
+ log.Fatal("cannot set stringData map on nil RNode")


* **emitStringData** (bool), optional

emitStringData if true generates a v1/Secret with plain-text stringData fields

I recommend that we adapt the proposed struct field documentation here. I see that you made them consistent originally, so I'm just reminding you to update this paragraph if you decide to update the struct field documentation.

- year=2025
- crisis=true
`)
th.WriteF("service.yaml", `

What is the significance of including the Service in this kustomization?

Author

I think just that the generator properly appends to other resources in the output.
This test is a functional copy of TestGeneratorIntVsStringNoMerge from configmaps_test.go but for secrets. All of the copied test-cases are added in their own commit before the new ones from this patch are added.

`)
}

// Generate Secrets similar to TestGeneratorBasics with emitStringData enabled and

Did you intend to finish this sentence? It ends with a dangling "and".

Comment on lines +206 to +207
// The resulting Secret will have a duplicate key in both data and stringData
// The stringData override will work, because the kube API considers stringData authoritative and write-only

Suggested change
- // The resulting Secret will have a duplicate key in both data and stringData
- // The stringData override will work, because the kube API considers stringData authoritative and write-only
+ // The resulting Secret will have a duplicate key in both its "data" and "stringData" fields.
+ // The "stringData" field's override will work, because the Kubernetes API considers the "stringData" field to be authoritative.

Comment on lines +251 to +252
// The resulting Secret will have a duplicate key in both data and stringData
// The data override will fail, because the kube API considers the older value in stringData authoritative and write-only

Suggested change
- // The resulting Secret will have a duplicate key in both data and stringData
- // The data override will fail, because the kube API considers the older value in stringData authoritative and write-only
+ // The resulting Secret will have a duplicate key in both its "data" and "stringData" fields.
+ // The "data" field's override will fail, because the Kubernetes API considers the sibling value in the "stringData" field to be authoritative.

Comment on lines +428 to +430
// TODO: This should be an error instead. However, we can't strict unmarshal until we have a yaml
// lib that support case-insensitive keys and anchors.
// See https://github.com/kubernetes-sigs/kustomize/issues/5061

Suggested change
- // TODO: This should be an error instead. However, we can't strict unmarshal until we have a yaml
- // lib that support case-insensitive keys and anchors.
- // See https://github.com/kubernetes-sigs/kustomize/issues/5061
+ // TODO: This should be an error instead. However, we can't unmarshal strictly until
+ // we have a YAML library that supports case-insensitive keys and anchors.
+ // See https://github.com/kubernetes-sigs/kustomize/issues/5061.

Successfully merging this pull request may close these issues.

secretGenerator to generate Secret with stringData manifest
5 participants