apis: add ServiceExport condition type/reasons #112

MrFreezeex · 2025-06-27T15:10:53Z

Proposal for ServiceExport condition type/reasons

TODO:

get consensus among at least sig-multicluster folks participating in the discussion
adapt conformance test

Suggested by @mikemorris in kubernetes/enhancements#5438 (comment). This initial proposal is mainly based on Gateway API code, adapted for our needs.

k8s-ci-robot · 2025-06-27T15:10:59Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: MrFreezeex
Once this PR has been reviewed and has the lgtm label, please assign skitt for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

OWNERS

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

MrFreezeex · 2025-06-27T15:12:14Z

/cc @mikemorris @tpantelis

Similarly, we can talk about it in the next SIG meeting, but feel free to review this initial proposal prior to that, thanks!


tpantelis · 2025-06-27T16:38:03Z

pkg/apis/v1alpha1/serviceexport.go

 	ServiceExportValid = "Valid"
 	// ServiceExportConflict means that there is a conflict between two
 	// exports for the same Service. When "True", the condition message
 	// should contain enough information to diagnose the conflict:
 	// field(s) under contention, which cluster won, and why.
 	// Users should not expect detailed per-cluster information in the
 	// conflict message.
+	//
+	// Deprecated: use ServiceExportConditionConflict instead


I think the existing constant names are fine but if the consensus is to rename them then just change the existing constants instead of deprecating. I think deprecating is overkill in this case. It's not a big deal to modify a constant name after bumping the mcs-api version. Also, if one has linting that flags the use of deprecated fields then it would have to be changed anyway.

The existing names are fine theoretically but I would prefer to have some naming consistency among the new const that we define hence deprecating the old one basically

OK but I would prefer not to deprecate for the reasons I outlined. At some point we would want to remove them so users would have to adjust anyway.

To me the removal of those should probably be in a new version of the apis (v1alpha2 or v1beta1) to try to be mindful of people importing the package and using those. If you have linting that flags deprecated I would tend to say that it actually is achieving what's expected: aka you should probably change your conditions usage 🤔.

To me the removal of those should probably be in a new version of the apis (v1alpha2 or v1beta1)

Perhaps but that's the CRD version. These are just Go constants and any changes would be observed when you bump the Go dependency which could be done independent of a CRD version bump. It's not a big deal either way - just trying to avoid the hassle of handling deprecation down the road if we don't need to.

Yea IIRC we've changed/removed constants like these in Gateway API in semver releases (e.g. a future v0.3.0), not tied to or requiring new CRD versions.

Well for instance there's a bunch of deprecation here related to conditions: https://github.com/kubernetes-sigs/gateway-api/blob/e1310bbd66c54b757d28ee65cf363825f6189ca5/apis/v1/gateway_types.go

My reasoning is that when a project moves to a newer version of the CRD, it will import a different Go package, so it will already need to actively change some code. At that point we would not have to carry over deprecations such as this one.

For a (small) go lib version bump though I don't think we should make users of this library change their code by dropping those const IMO.

mikemorris

Minor suggestions, overall looks like a great improvement!


mikemorris · 2025-06-27T21:33:36Z

/lgtm

tpantelis · 2025-06-27T21:50:48Z

@MrFreezeex @mikemorris

This PR also renames the condition type names themselves, eg Valid -> Accepted. I'm fine with the new names but that is really an API change that has ramifications beyond a simple Go constant. If we expect implementations to use the new ones, they will likely need to handle migration to clean up the old conditions. It seems to me, based on comments/discussions elsewhere, that this should probably be run thru the KEP first and be part of the larger v1beta1 graduation scope, in which case we should just keep the scope of this PR to defining Reason names.

Another consideration wrt changing the type names is the conformance tests. They would still use the now deprecated constants and as soon as we change them to use the new ones, that will break implementations that haven't migrated to the new type names. Of course, we could modify the tests to look for both the old and new names for compatibility, at least for a while.

MrFreezeex · 2025-06-27T22:23:19Z

This PR also renames the condition type names themselves, eg Valid -> Accepted. I'm fine with the new names but that is really an API change that has ramifications beyond a simple Go constant. If we expect implementations to use the new ones, they will likely need to handle migration to clean up the old conditions.

Indeed thanks for highlighting that!

It seems to me, based on comments/discussions elsewhere, that this should probably be run thru the KEP first and be part of the larger v1beta1 graduation scope, in which case we should just keep the scope of this PR to defining Reason names.

Sure, I can update my "associated" KEP PR to also update the few places that refer to the old conditions. The approvers/reviewers of the KEP and the mcs-api repo are pretty much the same, but it seems reasonable to look at this with more scrutiny than, for instance, adding more conformance tests.

I don't think that we require a version bump here though as we only deprecate the old fields.

Another consideration wrt changing the type names is the conformance tests. They would still use the now deprecated constants and as soon as we change them to use the new ones, that will break implementations that haven't migrated to the new type names. Of course, we could modify the tests to look for both the old and new names for compatibility, at least for a while.

Yep good point that seems reasonable that as long (or at least for some time) as we support v1alpha1 in the conformance tests both the old and new conditions would be considered as conformant.
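A minimal sketch of what "look for both the old and new names" could mean in a conformance check. The `Condition` struct is a local stand-in for `metav1.Condition`, and `findCondition` is a hypothetical helper, not something that exists in mcs-api:

```go
package main

import "fmt"

// Condition is a local stand-in for metav1.Condition so the sketch has no
// external dependencies.
type Condition struct {
	Type   string
	Status string // "True", "False" or "Unknown"
	Reason string
}

// findCondition (hypothetical helper) returns the first condition whose Type
// matches one of the accepted names. Names are tried in the order given, so
// the preferred (new) name should come first.
func findCondition(conds []Condition, accepted ...string) (Condition, bool) {
	for _, name := range accepted {
		for _, c := range conds {
			if c.Type == name {
				return c, true
			}
		}
	}
	return Condition{}, false
}

func main() {
	// An implementation that has not migrated yet still reports "Valid".
	legacy := []Condition{{Type: "Valid", Status: "True"}}

	// Prefer the new type name, fall back to the old one.
	c, ok := findCondition(legacy, "Accepted", "Valid")
	fmt.Println(c.Type, c.Status, ok)
}
```

With this shape, dropping the compatibility window later would only mean removing the old name from the call site.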

tpantelis · 2025-06-27T22:36:19Z

Yep good point that seems reasonable that as long (or at least for some time) as we support v1alpha1 ...

yeah which is why I think it makes sense to change the condition type names in v1alpha2 along with the other proposed (potentially breaking) ServiceImport changes.

MrFreezeex · 2025-06-27T22:40:47Z

Yep good point that seems reasonable that as long (or at least for some time) as we support v1alpha1 ...

yeah which is why I think it makes sense to change the condition type names in v1alpha2 along with the other proposed (potentially breaking) ServiceImport changes.

The thing is that those future ServiceImport breaking changes are actually not breaking if we actually do this "mirror fields" approach. That being said you are right that it should be treated similarly indeed, so we should decide in what API version those ServiceImport "breaking changes" would go first.


tpantelis · 2025-06-27T23:07:39Z

pkg/apis/v1alpha1/serviceexport.go

+
+	// ServiceExportReasonLabelsConflict is used with the "Conflicted"
+	// condition when the exported service have a conflict related to labels.
+	ServiceExportReasonLabelsConflict = "LabelsConflict"


I think this should be treated similar as PortConflict.

Suggested change

-	ServiceExportReasonLabelsConflict = "LabelsConflict"
+	ServiceExportReasonLabelConflict = "LabelConflict"

Port conflicts are treated individually (per port), whereas annotations and labels require an exact match of the whole set, which explains the plural here.
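The distinction drawn in this reply can be sketched as follows. This is only an illustration under assumptions: the `Port` type and both helper functions are hypothetical, and the exact port-conflict rule used by the KEP is not spelled out in this thread. `maps.Equal` requires Go 1.21 or later.

```go
package main

import (
	"fmt"
	"maps"
)

// Port is a minimal, hypothetical stand-in for a service port.
type Port struct {
	Name   string
	Number int32
}

// labelsConflict treats the label set as a whole: any difference between the
// two maps counts as a single "LabelsConflict".
func labelsConflict(a, b map[string]string) bool {
	return !maps.Equal(a, b)
}

// conflictingPorts checks ports one by one and returns the names of ports
// that exist on both sides with different numbers (assumed rule).
func conflictingPorts(a, b []Port) []string {
	numbers := make(map[string]int32, len(a))
	for _, p := range a {
		numbers[p.Name] = p.Number
	}
	var names []string
	for _, p := range b {
		if n, ok := numbers[p.Name]; ok && n != p.Number {
			names = append(names, p.Name)
		}
	}
	return names
}

func main() {
	a := map[string]string{"app": "web", "tier": "front"}
	b := map[string]string{"app": "web"}
	fmt.Println(labelsConflict(a, b))

	fmt.Println(conflictingPorts(
		[]Port{{"http", 80}, {"metrics", 9090}},
		[]Port{{"http", 8080}, {"metrics", 9090}},
	))
}
```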

tpantelis · 2025-06-27T23:07:51Z

pkg/apis/v1alpha1/serviceexport.go

+
+	// ServiceExportReasonAnnotationsConflict is used with the "Conflicted"
+	// condition when the exported service have a conflict related to annotations.
+	ServiceExportReasonAnnotationsConflict = "AnnotationsConflict"


Suggested change

-	ServiceExportReasonAnnotationsConflict = "AnnotationsConflict"
+	ServiceExportReasonAnnotationConflict = "AnnotationConflict"

tpantelis · 2025-06-27T23:22:37Z

pkg/apis/v1alpha1/serviceexport.go

+	// ServiceExportConditionConflicted indicates that the controller was unable
+	// to resolve conflict for a ServiceExport. This condition must be at
+	// least raised on the conflicting ServiceExport and is recommended to
+	// be raised on all on all the constituent `ServiceExport`s if feasible.


It indicates there was some conflicting aspect between 2 or more exported services but it doesn't mean it wasn't resolved. Also I'm not sure the second statement re: where the condition should be raised is really relevant here.

Suggested change

-	// ServiceExportConditionConflicted indicates that the controller was unable
-	// to resolve conflict for a ServiceExport. This condition must be at
-	// least raised on the conflicting ServiceExport and is recommended to
-	// be raised on all on all the constituent `ServiceExport`s if feasible.
+	// ServiceExportConditionConflicted indicates that some property of an exported service has conflicting
+	// values across the constituent ServiceExports.

Also I'm not sure the second statement re: where the condition should be raised is really relevant here.

I would say yes because we kind of duplicated the recommendation from the KEP in other places too, so to me it's in the similar vein as the rest


tpantelis · 2025-06-30T13:56:28Z

/lgtm

mikemorris · 2025-06-30T16:35:43Z

/lgtm

mikemorris · 2025-06-30T16:37:55Z

pkg/apis/v1alpha1/serviceexport.go

+	// ServiceExportReasonNoService is used with the "Accepted" condition when
+	// the associated Service does not exist.
+	ServiceExportReasonNoService ServiceExportConditionReason = "NoService"


Is there potentially any RBAC reason (the controller can't read Service resources for some reason) that could cause this?

Suggested change

-	// ServiceExportReasonNoService is used with the "Accepted" condition when
-	// the associated Service does not exist.
-	ServiceExportReasonNoService ServiceExportConditionReason = "NoService"
+	// ServiceExportReasonNoService is used with the "Accepted" condition when
+	// the Service to be exported could not be found.
+	ServiceExportReasonServiceNotFound ServiceExportConditionReason = "ServiceNotFound"

That would be a different failure reason. "NoService" is if the service is exported before the service object is created or if the ServiceExport is deleted before/without deleting the service.

RBAC issues are extremely unlikely, since we are most likely talking about a local controller talking to its local cluster. That would be more of a "core error" that would prevent the controller from starting, or at least cause it to log a bunch of errors.

I don't have a strong opinion on ServiceNotFound vs NoService personally though both seems to be similar-ish to me

zhiying-lin · 2025-07-01T08:58:52Z

pkg/apis/v1alpha1/serviceexport.go

@@ -66,13 +66,17 @@ const (
 	// service export has been recognized as valid by an mcs-controller.
 	// This will be false if the service is found to be unexportable
 	// (ExternalName, not found).
+	//
+	// Deprecated: use ServiceExportConditionAccepted instead


do we really want to deprecate the "Valid" condition? i did not see big difference between "Valid" and "Accepted", but deprecation means a lot of migration work involved, but not many benefits :(

Do you actually read back the conditions to do something? If not I would expect very little migration effort needed here actually...

The conditions defined here predate Conditions being a Kubernetes type, so it's a complete overhaul. We are also aligning ourselves with Gateway API types, which use "Accepted" rather than "Valid".

It definitely will involve a lot of provider code changes and it will break all the tests.

Since we're still in v1alpha1, it should be our last chance to make it right.

zhiying-lin · 2025-07-01T09:01:08Z

pkg/apis/v1alpha1/serviceexport.go

+	// Controllers may raise this condition with other reasons,
+	// but should prefer to use the reasons listed above to improve
+	// interoperability.
+	ServiceExportConditionConflicted ServiceExportConditionType = "Conflicted"


ditto, not obvious benefits to change to "conflicted" compared with "Conflict"

For parity with Gateway API and with the SIG-Architecture API conventions for status conditions, which explicitly recommend past-tense verbs. I believe those conventions did not exist yet when these early conditions were initially added.

zhiying-lin · 2025-07-01T09:14:34Z

pkg/apis/v1alpha1/serviceexport.go

+)
+
+const (
+	// ServiceExportConditionExported is true when the service is exported to some


Before in the KEP, we treat the "Valid" true and "Conflict" false as "exported", which means it's visible outside of the local cluster.

What "Pending" and "Failed" mean may be related to "Import"?
Does it mean that it can be exported to every imported cluster? which it's a dynamic state and very hard to track.

Exported doesn't mean imported everywhere, it just means that you pushed the info somewhere IMO.

Before in the KEP, we treat the "Valid" true and "Conflict" false as "exported"

Also if you have "Conflict" true it should most likely be still exported in most implementations/situations

Does it mean that it can be exported to every imported cluster? which it's a dynamic state and very hard to track.

Agreed that this semantics/state is likely impractical to express through the Exported condition - whether each ServiceImport is successfully created should likely be messaged through a Ready condition on each ServiceImport as proposed in #113

What "Pending" and "Failed" mean

From the perspective of a centralized implementation, my understanding is that Pending could be both an initial state (mcs controller has not yet observed CRD) and potential in-progress state (CRD has been observed by an mcs-controller, which is doing some work before it considers the service exported and thus available to be imported by other clusters), without conveying whether corresponding ServiceImports have actually been created yet or are ready for use. I'm not entirely clear on if decentralized implementations (Cilium, Submariner) will have a use for this reason.

I'm not entirely clear on if decentralized implementations (Cilium, Submariner) will have a use for this reason.

Cilium: nope, as we don't really export anything technically. Submariner: I am not sure. I gathered from @tpantelis's comment that it would, but I'll let him confirm.

i'm not quite sure whether this info "Exported doesn't mean imported everywhere, it just means that you pushed the info somewhere IMO." is useful to the person who want to expose the local cluster. and technically it's not easy to track, either.

Should we make the condition as a non-core condition? if you're not expecting every implementation to do that.

Sure! I added some text making this clear and explicit.

It doesn't actually define a concept of core/non-core conditions, but I don't expect many other condition types (if any?) that would be defined here but left unused by some implementations at the moment, so I am not sure we really have to categorize them. We can revisit that later if there are many more of those, though.

Submariner would use this condition and, in fact, already does except we currently name it Ready.

Actually maybe we could also consider using "Ready" rather than "Exported" here?

It would probably be very similar to the current thing but have a very generic "Ready" reason which can be true in addition to the exported one. That way Cilium (or any other implementations that does not export anything and rather pull) could use that with the "Ready" reason too and things that have some generic logic about conditions (for instance: https://github.com/kubernetes-sigs/cli-utils/tree/master/pkg/kstatus) could know about the resource condition with the Ready type for any implementations. What do you all think about this idea?

I’m fine with either one.

zhiying-lin · 2025-07-01T09:18:02Z

pkg/apis/v1alpha1/serviceexport.go

+	// * "PortConflict"
+	// * "TypeConflict"
+	// * "SessionAffinityConflict"
+	// * "SessionAffinityConfigConflict"
+	// * "AnnotationsConflict"
+	// * "LabelsConflict"


hmm, it's not easy to extend, later we may support other field conflicts.

How about just defining the "ConflictFound" and putting the details in the message?

I would prefer to have specific reasons IMO, you are free to use your own reasons in case your use case is not supported as stated in this file.

Having standardized reasons is very helpful for communicating specific known failure scenarios in a predictable way (whereas the message field contents will differ by implementation) and is very helpful in conformance tests too.

We could consider adding an additional general-purpose Conflicted reason for failure scenarios not covered by these reasons, but this field (reason is a string type) is extensible by implementations which wish to provide their own implementation-specific reasons as a supplement to the standard reasons.

We should expect (and maybe this is worth adding explicitly in docs here and/or KEP) that additional reasons may be defined by implementations or be added to the set of standard reasons in the future if needed.

We could consider adding an additional general-purpose Conflicted reason for failure scenarios not covered by these reasons, but this field (reason is a string type) is extensible by implementations which wish to provide their own implementation-specific reasons as a supplement to the standard reasons.

Adding a generic reason doesn't seem great. I would expect implementations to either use the provided ones or make up their own. I don't see why you would use a generic one in this context...

We should expect (and maybe this is worth adding explicitly in docs here and/or KEP) that additional reasons may be defined by implementations or be added to the set of standard reasons in the future if needed.

It's actually already written each time we define some reasons thanks to GW-API "template" :D

So will we validate the reason in the conformance tests if we expect the consumer may define their own?

The conformance tests generally present a targeted conflict with one property conflicting at a time, so it should be relatively fine to check the reason in the conformance tests. If there are tests with multiple conflicts, we would probably not test the exact reason, I would say. Also, if you have particular situations that are specific to your implementation, they would probably not be in the conformance tests to begin with.
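A rough sketch of how the approach described in this thread could look: reasons are a plain string type (so implementations can add their own), and a test only asserts the exact reason when it injected a single conflict. `ServiceExportConditionReason` and `ServiceExportReasonLabelsConflict` appear in the PR; the other constant names are assumed by analogy with the reason strings listed above, and `checkConflictReason` and the vendor reason are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

// ServiceExportConditionReason is a string type, so implementations can
// define their own reasons next to the standard ones.
type ServiceExportConditionReason string

const (
	// Constant names for Port/Type are assumed from the listed reason strings.
	ServiceExportReasonPortConflict   ServiceExportConditionReason = "PortConflict"
	ServiceExportReasonTypeConflict   ServiceExportConditionReason = "TypeConflict"
	ServiceExportReasonLabelsConflict ServiceExportConditionReason = "LabelsConflict"
)

// ReasonExampleVendorConflict is a made-up implementation-specific reason.
const ReasonExampleVendorConflict ServiceExportConditionReason = "ExampleVendorConflict"

// checkConflictReason (hypothetical) validates the reason reported on a
// "Conflicted" condition. The exact reason is only enforced when the test
// injected exactly one conflicting property.
func checkConflictReason(got ServiceExportConditionReason, injected []ServiceExportConditionReason) error {
	if got == "" {
		return errors.New("conflicted condition has an empty reason")
	}
	if len(injected) == 1 && got != injected[0] {
		return fmt.Errorf("got reason %q, want %q", got, injected[0])
	}
	return nil
}

func main() {
	single := []ServiceExportConditionReason{ServiceExportReasonPortConflict}
	fmt.Println(checkConflictReason(ServiceExportReasonPortConflict, single))
	fmt.Println(checkConflictReason(ServiceExportReasonTypeConflict, single))

	// With several injected conflicts, any non-empty reason is accepted.
	multi := []ServiceExportConditionReason{ServiceExportReasonPortConflict, ServiceExportReasonLabelsConflict}
	fmt.Println(checkConflictReason(ReasonExampleVendorConflict, multi))
}
```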


Signed-off-by: Arthur Outhenin-Chalandre <[email protected]>

tpantelis · 2025-07-02T10:59:59Z

/lgtm

lauralorenz · 2025-07-22T17:08:01Z

Triage notes:

several lgtms and looks like mostly is about the new things, which would not be considered a breaking change
proposed in comments: new name Exported -> Ready for a certain condition may not be worth it because that's more breaking
similarly, as proposed in PR: deprecating "Valid" and "Conflict" for these other well-formed names would be considered breaking (see below)

	// Deprecated: use ServiceExportConditionAccepted instead
	ServiceExportValid = "Valid"

	// Deprecated: use ServiceExportConditionConflicted instead
	ServiceExportConflict = "Conflict"

comment that renaming is not worth it - discussed the technical needs (condition is set in one place, have a SetUpdateCondition method), discussed Gateway API experience (who did do this). will not hard block on it
can we reduce the scope to not renaming things and only make it additive (to add the reasons)?
if we don't make the deprecations now, we add work for implementations that they would have to be undone later IF we choose to rename later (though there is an argument here to never rename)
THE PLAN! arthur will update the PR to keep the existing strings, but still add new constants for them with more explicit names, and put deprecation flags on the old ones. the important part is that the old string values won't change compared to what they have already for the two reasons that exist
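One way the plan above might look in code, as a sketch only. `ServiceExportValid`, `ServiceExportConflict` and `ServiceExportConditionConflict` are names that appear in this PR; `ServiceExportConditionValid` is a hypothetical name chosen here for illustration, and the final naming was not settled in the thread. The point is that the string values "Valid" and "Conflict" stay unchanged.

```go
package main

import "fmt"

// ServiceExportConditionType is the typed string used for condition types.
type ServiceExportConditionType string

const (
	// ServiceExportConditionValid keeps the existing "Valid" string under a
	// more explicit constant name (hypothetical name).
	ServiceExportConditionValid ServiceExportConditionType = "Valid"

	// ServiceExportConditionConflict keeps the existing "Conflict" string.
	ServiceExportConditionConflict ServiceExportConditionType = "Conflict"

	// ServiceExportValid is the old constant name.
	//
	// Deprecated: use ServiceExportConditionValid instead
	ServiceExportValid = ServiceExportConditionValid

	// ServiceExportConflict is the old constant name.
	//
	// Deprecated: use ServiceExportConditionConflict instead
	ServiceExportConflict = ServiceExportConditionConflict
)

func main() {
	// Old and new names resolve to the same wire value, so conditions
	// already stored on ServiceExport objects need no migration.
	fmt.Println(ServiceExportValid == ServiceExportConditionValid)
	fmt.Println(string(ServiceExportConflict))
}
```

Because only the Go identifiers change, existing importers keep compiling (with a deprecation warning from linters), and conformance tests can keep matching on the same strings.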

k8s-ci-robot requested a review from JeremyOT June 27, 2025 15:10

k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Jun 27, 2025

k8s-ci-robot requested a review from lauralorenz June 27, 2025 15:10

k8s-ci-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Jun 27, 2025

k8s-ci-robot requested review from mikemorris and tpantelis June 27, 2025 15:12

MrFreezeex force-pushed the svcexport-conditions-type branch 4 times, most recently from 1c1a161 to 3a1fc93 Compare June 27, 2025 15:45

MrFreezeex mentioned this pull request Jun 27, 2025

KEP 1645: fix ServiceExport conditions kubernetes/enhancements#5438

Open


tpantelis reviewed Jun 27, 2025


mikemorris reviewed Jun 27, 2025


MrFreezeex force-pushed the svcexport-conditions-type branch 13 times, most recently from 97e6de5 to df7f8bd Compare June 27, 2025 21:11

k8s-ci-robot assigned mikemorris Jun 27, 2025

k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jun 27, 2025

tpantelis reviewed Jun 27, 2025


MrFreezeex force-pushed the svcexport-conditions-type branch from df7f8bd to bddd8b9 Compare June 30, 2025 08:53

k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jun 30, 2025

MrFreezeex force-pushed the svcexport-conditions-type branch from bddd8b9 to dcc236e Compare June 30, 2025 08:56

tpantelis reviewed Jun 30, 2025


MrFreezeex force-pushed the svcexport-conditions-type branch from dcc236e to 9410124 Compare June 30, 2025 11:42

k8s-ci-robot assigned tpantelis Jun 30, 2025

k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jun 30, 2025

mikemorris reviewed Jun 30, 2025


zhiying-lin reviewed Jul 1, 2025


MrFreezeex mentioned this pull request Jul 1, 2025

Clustermesh APIserver: Add CFP to filter ciliumendpoints, identities and endpointslices exported to etcd cilium/design-cfps#74

Open

apis: add ServiceExport condition type/reasons

0ad94d6

Signed-off-by: Arthur Outhenin-Chalandre <[email protected]>

MrFreezeex force-pushed the svcexport-conditions-type branch from 9410124 to 0ad94d6 Compare July 2, 2025 09:43

k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jul 2, 2025

k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jul 2, 2025

