-
Notifications
You must be signed in to change notification settings - Fork 45
apis: add ServiceExport condition type/reasons #112
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
apis: add ServiceExport condition type/reasons #112
Conversation
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: MrFreezeex The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
Simiarly we can talk about it in the next sig meeting but feel free to review this initial proposal prior to that, thanks! |
1c1a161
to
3a1fc93
Compare
pkg/apis/v1alpha1/serviceexport.go
Outdated
ServiceExportValid = "Valid" | ||
// ServiceExportConflict means that there is a conflict between two | ||
// exports for the same Service. When "True", the condition message | ||
// should contain enough information to diagnose the conflict: | ||
// field(s) under contention, which cluster won, and why. | ||
// Users should not expect detailed per-cluster information in the | ||
// conflict message. | ||
// | ||
// Deprecated: use ServiceExportConditionConflict instead |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the existing constant names are fine but if the consensus is to rename them then just change the existing constants instead of deprecating. I think deprecating is overkill in this case. It's not a big deal to modify a constant name after bumping the mcs-api version. Also, if one has linting that flags the use of deprecated fields then it would have to be changed anyway.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The existing names are fine theoretically but I would prefer to have some naming consistency among the new const that we define hence deprecating the old one basically
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK but I would prefer not to deprecate for the reasons I outlined. At some point we would want to remove them so users would have to adjust anyway.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To me the removal of those should probably be in a new version of the apis (v1alpha2 or v1beta1) to try to be mindful of people importing the package and using those. If you have linting that flags deprecated I would tend to say that it actually is achieving what's expected: aka you should probably change your conditions usage 🤔.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To me the removal of those should probably be in a new version of the apis (v1alpha2 or v1beta1)
Perhaps but that's the CRD version. These are just Go constants and any changes would be observed when you bump the Go dependency which could be done independent of a CRD version bump. It's not a big deal either way - just trying to avoid the hassle of handling deprecation down the road if we don't need to.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yea IIRC we've changed/removed constants like these in Gateway API in semver releases (e.g. a future v0.3.0), not tied to or requiring new CRD versions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well for instance there's a bunch of deprecation here related to conditions: https://github.com/kubernetes-sigs/gateway-api/blob/e1310bbd66c54b757d28ee65cf363825f6189ca5/apis/v1/gateway_types.go
My reasoning is that when project would use a newer version of the CRD they will import some other go package meaning that they will already need to actively change some code anyway and thus we would be able to not carry over certain deprecation such as this one.
For a (small) go lib version bump though I don't think we should make users of this library change their code by dropping those const IMO.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor suggestions, overall looks like a great improvement!
97e6de5
to
df7f8bd
Compare
/lgtm |
This PR also renames the condition type names themselves, eg Another consideration wrt changing the type names is the conformance tests. They would still use the now deprecated constants and as soon as we change them to use the new ones, that will break implementations that haven't migrated to the new type names. Of course, we could modify the tests to look for both the old and new names for compatibility, at least for a while. |
Indeed thanks for highlighting that!
Sure I can update my "associated" KEP PR to also update the few places that refer to the old conditions. Although approvers/reviewers from the KEP and mcs-api repo are pretty much the same but it looks reasonable to look at this with more scrutiny than for instance adding more conformance tests. I don't think that we require a version bump here though as we only deprecate the old fields.
Yep good point that seems reasonable that as long (or at least for some time) as we support v1alpha1 in the conformance tests both the old and new conditions would be considered as conformant. |
yeah which is why I think it makes sense to change the condition type names in v1alpha2 along with the other proposed (potentially breaking) |
The thing is that those future |
pkg/apis/v1alpha1/serviceexport.go
Outdated
|
||
// ServiceExportReasonLabelsConflict is used with the "Conflicted" | ||
// condition when the exported service have a conflict related to labels. | ||
ServiceExportReasonLabelsConflict = "LabelsConflict" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this should be treated similar as PortConflict
.
ServiceExportReasonLabelsConflict = "LabelsConflict" | |
ServiceExportReasonLabelConflict = "LabelConflict" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Port conflict is treated individually whereas annotations and labels is an exact match of every labels which explains the plural here
pkg/apis/v1alpha1/serviceexport.go
Outdated
|
||
// ServiceExportReasonAnnotationsConflict is used with the "Conflicted" | ||
// condition when the exported service have a conflict related to annotations. | ||
ServiceExportReasonAnnotationsConflict = "AnnotationsConflict" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ServiceExportReasonAnnotationsConflict = "AnnotationsConflict" | |
ServiceExportReasonAnnotationConflict = "AnnotationConflict" |
pkg/apis/v1alpha1/serviceexport.go
Outdated
// ServiceExportConditionConflicted indicates that the controller was unable | ||
// to resolve conflict for a ServiceExport. This condition must be at | ||
// least raised on the conflicting ServiceExport and is recommended to | ||
// be raised on all on all the constituent `ServiceExport`s if feasible. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It indicates there was some conflicting aspect between 2 or more exported services but it doesn't mean it wasn't resolved. Also I'm not sure the second statement re: where the condition should be raised is really relevant here.
// ServiceExportConditionConflicted indicates that the controller was unable | |
// to resolve conflict for a ServiceExport. This condition must be at | |
// least raised on the conflicting ServiceExport and is recommended to | |
// be raised on all on all the constituent `ServiceExport`s if feasible. | |
// ServiceExportConditionConflicted indicates that some property of an exported service has conflicting | |
// values across the constituent ServiceExports. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also I'm not sure the second statement re: where the condition should be raised is really relevant here.
I would say yes because we kind of duplicated the recommendation from the KEP in other places too, so to me it's in the similar vein as the rest
df7f8bd
to
bddd8b9
Compare
bddd8b9
to
dcc236e
Compare
dcc236e
to
9410124
Compare
/lgtm |
/lgtm |
// ServiceExportReasonNoService is used with the "Accepted" condition when | ||
// the associated Service does not exist. | ||
ServiceExportReasonNoService ServiceExportConditionReason = "NoService" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there potentially any RBAC reason that (controller can't read Service resources for some reason) that could cause this?
// ServiceExportReasonNoService is used with the "Accepted" condition when | |
// the associated Service does not exist. | |
ServiceExportReasonNoService ServiceExportConditionReason = "NoService" | |
// ServiceExportReasonNoService is used with the "Accepted" condition when | |
// the Service to be exported could not be found. | |
ServiceExportReasonServiceNotFound ServiceExportConditionReason = "ServiceNotFound" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That would be a different failure reason. "NoService" is if the service is exported before the service object is created or if the ServiceExport is deleted before/without deleting the service.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
RBAC issues are extremely unlikely since we are most likely talking about a local controller talking to its local cluster and probably more a "core error" that would prevent the controller to be started or at least the controller logging a bunch of errors.
I don't have a strong opinion on ServiceNotFound vs NoService personally though both seems to be similar-ish to me
@@ -66,13 +66,17 @@ const ( | |||
// service export has been recognized as valid by an mcs-controller. | |||
// This will be false if the service is found to be unexportable | |||
// (ExternalName, not found). | |||
// | |||
// Deprecated: use ServiceExportConditionAccepted instead |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we really want to deprecate the "Valid" condition? i did not see big difference between "Valid" and "Accepted", but deprecation means a lot of migration work involved, but not many benefits :(
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you actually read back the conditions to do something? If not I would expect very little migration effort needed here actually...
The conditions defined here predates Conditions being a Kubernetes type so it's a complete overhaul. We are also aligning ourselves with GatewayAPI types which use "Accepted" rather than "Valid"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It definitely will involve a lot of provider code changes and it will break all the tests.
Since we're still in v1alpha1, it should be our last chance to make it right.
// Controllers may raise this condition with other reasons, | ||
// but should prefer to use the reasons listed above to improve | ||
// interoperability. | ||
ServiceExportConditionConflicted ServiceExportConditionType = "Conflicted" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto, not obvious benefits to change to "conflicted" compared with "Conflict"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For parity with Gateway API and the SIG-Architecture API conventions for status conditions, which I believe did not exist yet when these early conditions were initially added, which explicitly recommend past-tense verbs.
) | ||
|
||
const ( | ||
// ServiceExportConditionExported is true when the service is exported to some |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Before in the KEP, we treat the "Valid" true and "Conflict" false as "exported", which means it's visible outside of the local cluster.
What "Pending" and "Failed" mean may be related to "Import"?
Does it mean that it can be exported to every imported cluster? which it's a dynamic state and very hard to track.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Exported doesn't mean imported everywhere, it just means that you pushed the info somewhere IMO.
Before in the KEP, we treat the "Valid" true and "Conflict" false as "exported"
Also if you have "Conflict" true it should most likely be still exported in most implementations/situations
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does it mean that it can be exported to every imported cluster? which it's a dynamic state and very hard to track.
Agreed that this semantics/state is likely impractical to express through the Exported
condition - whether each ServiceImport is successfully created should likely be messaged through a Ready
condition on each ServiceImport as proposed in #113
What "Pending" and "Failed" mean
From the perspective of a centralized implementation, my understanding is that Pending
could be both an initial state (mcs controller has not yet observed CRD) and potential in-progress state (CRD has been observed by an mcs-controller, which is doing some work before it considers the service exported and thus available to be imported by other clusters), without conveying whether corresponding ServiceImports have actually been created yet or are ready for use. I'm not entirely clear on if decentralized implementations (Cilium, Submariner) will have a use for this reason.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not entirely clear on if decentralized implementations (Cilium, Submariner) will have a use for this reason.
Cilium nop as we don't really export anything technically. Submariner I am not sure, I think I grasped from @tpantelis comment that yes but I let him confirm.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i'm not quite sure whether this info "Exported doesn't mean imported everywhere, it just means that you pushed the info somewhere IMO." is useful to the person who want to expose the local cluster. and technically it's not easy to track, either.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we make the condition as a non-core condition? if you're not expecting every implementation to do that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure! I added some text making this clear and explicit.
It's not actually defining concept of core/non core etc but I don't expect to have much other condition type (if any?) that would be defined here but not used by some implementations at the moment so I am not sure we really have to categorize them. We can revisit that later if there would much more of those though
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Submariner would use this condition and, in fact, already does except we currently name it Ready
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually maybe we could also consider using "Ready" rather than "Exported" here?
It would probably be very similar to the current thing but have a very generic "Ready" reason which can be true in addition to the exported one. That way Cilium (or any other implementations that does not export anything and rather pull) could use that with the "Ready" reason too and things that have some generic logic about conditions (for instance: https://github.com/kubernetes-sigs/cli-utils/tree/master/pkg/kstatus) could know about the resource condition with the Ready type for any implementations. What do you all think about this idea?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I’m fine with either one.
// * "PortConflict" | ||
// * "TypeConflict" | ||
// * "SessionAffinityConflict" | ||
// * "SessionAffinityConfigConflict" | ||
// * "AnnotationsConflict" | ||
// * "LabelsConflict" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm, it's not easy to extend, later we may support other field conflicts.
How about just defining the "ConflictFound" and putting the details in the message?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would prefer to have specific reasons IMO, you are free to use your own reasons in case your use case is not supported as stated in this file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Having standardized reasons is very helpful for communicating specific known failure scenarios in a predictable way (whereas the message
field contents will differ by implementation) and is very helpful in conformance tests too.
We could consider adding an additional general-purpose Conflicted
reason for failure scenarios not covered by these reasons, but this field reason
is a string type) is extensible by implementations which wish to provide their own implementation-specific reasons as a supplment to the standard reasons.
We should expect (and maybe this is worth adding explicitly in docs here and/or KEP) that additional reasons may be defined by implementations or be added to the set of standard reasons in the future if needed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could consider adding an additional general-purpose Conflicted reason for failure scenarios not covered by these reasons, but this field reason is a string type) is extensible by implementations which wish to provide their own implementation-specific reasons as a supplment to the standard reasons.
Adding a generic reason doesn't seems great. I would expect implementation to either use the provided one or make up their own. I don't see why you would use a generic one in this context...
We should expect (and maybe this is worth adding explicitly in docs here and/or KEP) that additional reasons may be defined by implementations or be added to the set of standard reasons in the future if needed.
It's actually already written each time we define some reasons thanks to GW-API "template" :D
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So will we validate the reason in the conformance tests if we expect the consumer may define their own?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The conformance test generally present targeted conflict with one property conflicting at time so it should be relatively fine to make this into the conformance test. If there are test with multiple conflict we would probably not test the exact reason I would say. Also if you have particular situation that are specific to your implementation it would probably not be in the conformance test to begin with.
Signed-off-by: Arthur Outhenin-Chalandre <[email protected]>
9410124
to
0ad94d6
Compare
/lgtm |
Triage notes:
|
Proposal for ServiceExport conditon type/reasons
TODO:
suggested by @mikemorris here kubernetes/enhancements#5438 (comment) also this initial proposal is mainly based on GatewayAPI code adapted for our needs.