-
Notifications
You must be signed in to change notification settings - Fork 579
[RayCluster] Toggle usage of deterministic/non-deterministic head pod name with feature flag #3873
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[RayCluster] Toggle usage of deterministic/non-deterministic head pod name with feature flag #3873
Conversation
Signed-off-by: machichima <[email protected]>
Signed-off-by: machichima <[email protected]>
Signed-off-by: machichima <[email protected]>
Signed-off-by: machichima <[email protected]>
Signed-off-by: machichima <[email protected]>
Signed-off-by: machichima <[email protected]>
Signed-off-by: machichima <[email protected]>
Signed-off-by: machichima <[email protected]>
Signed-off-by: machichima <[email protected]>
@rueian PTAL, thank you! |
I added the test in |
enableDeterministicHeadPodName := false | ||
if s := os.Getenv(ENABLE_DETERMINISTIC_HEAD_POD_NAME); strings.ToLower(s) == "true" { | ||
enableDeterministicHeadPodName = true | ||
} | ||
return enableDeterministicHeadPodName |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
enableDeterministicHeadPodName := false | |
if s := os.Getenv(ENABLE_DETERMINISTIC_HEAD_POD_NAME); strings.ToLower(s) == "true" { | |
enableDeterministicHeadPodName = true | |
} | |
return enableDeterministicHeadPodName | |
return strings.ToLower(os.Getenv(ENABLE_DETERMINISTIC_HEAD_POD_NAME)) == "true" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed! Thanks!
We can have an e2e test for the autoscaler restart with GCS FT in another PR, since it has nothing to do with the feature flag. |
Signed-off-by: machichima <[email protected]>
Signed-off-by: machichima <[email protected]>
@rueian I've resolved the conflicts. PTAL, thank you! |
# If set to true, we will use deterministic name for head pod. Otherwise, the non-deterministic name is used. | ||
# - name: ENABLE_DETERMINISTIC_HEAD_POD_NAME | ||
# value: "false" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By default we should enable this feature to fix this issue:
#3013
WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh sorry, I missed this PR:
#3872
… name with feature flag (ray-project#3873) * feat: use gen name for ray version > 2.48 Signed-off-by: machichima <[email protected]> * feat: head pod name to non deterministic Signed-off-by: machichima <[email protected]> * test: compare head pod name between old and new Signed-off-by: machichima <[email protected]> * feat: add feature flag for deterministic head pod name Signed-off-by: machichima <[email protected]> * feat: use deterministic if feature flag set to true Signed-off-by: machichima <[email protected]> * test: head pod name with/without feature flag == true Signed-off-by: machichima <[email protected]> * test: sep test for head/worker pod name Signed-off-by: machichima <[email protected]> * docs: add feature flag desc to helm chart Signed-off-by: machichima <[email protected]> * fix: feature flag name Signed-off-by: machichima <[email protected]> * refactor: simplify check if the feature flag is set Signed-off-by: machichima <[email protected]> --------- Signed-off-by: machichima <[email protected]>
… name with feature flag (ray-project#3873) * feat: use gen name for ray version > 2.48 Signed-off-by: machichima <[email protected]> * feat: head pod name to non deterministic Signed-off-by: machichima <[email protected]> * test: compare head pod name between old and new Signed-off-by: machichima <[email protected]> * feat: add feature flag for deterministic head pod name Signed-off-by: machichima <[email protected]> * feat: use deterministic if feature flag set to true Signed-off-by: machichima <[email protected]> * test: head pod name with/without feature flag == true Signed-off-by: machichima <[email protected]> * test: sep test for head/worker pod name Signed-off-by: machichima <[email protected]> * docs: add feature flag desc to helm chart Signed-off-by: machichima <[email protected]> * fix: feature flag name Signed-off-by: machichima <[email protected]> * refactor: simplify check if the feature flag is set Signed-off-by: machichima <[email protected]> --------- Signed-off-by: machichima <[email protected]>
Why are these changes needed?
KubeRay 1.4.0 uses a deterministic name for the head pod. This, unfortunately, breaks the Autoscaler v2 for Ray < 2.48 after a head node restart under GCS FT configuration.
Based on #3868 (comment), we shouldn't change the behavior in the operator based on the Ray version. Therefor, we decided to add a feature flag that can toggle between using deterministic or non-deterministic head pod name (default to use non-deterministic head pod name)
ENABLE_DETERMINISTIC_HEAD_POD_NAME
to set whether we want to use deterministic name for head podRelated issue number
Closes #3868
Checks