
Add cleanup of 'Failed' jobs #115


Open
wants to merge 1 commit into
base: main


saintstack
Contributor

  • k8s/agent-scaler/agent-scaler.sh
    Add --ignore-not-found=true to the delete. This quiets the complaint
    raised when two copies of the script run side by side and one deletes
    a job first (this happens when testing changes to this script). Minor item.

Also, clean up 'Failed' jobs; otherwise they just hang around.

Here are some example Failed jobs:


joshua-agent-250604164239-10             Failed     0/1           3h3m       3h3m
joshua-agent-250604164415-38             Failed     0/1           3h2m       3h2m
joshua-agent-250604164522-70             Failed     0/1           3h1m       3h1m
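
To make the selection concrete, here is a minimal sketch of picking such jobs out of a listing with the columns shown above (name, status, completions, duration, age). It assumes `AGENT_NAME` is `joshua-agent`, which is only inferred from the job names, and `failed_agent_jobs` is a hypothetical helper that is not part of agent-scaler.sh.

```shell
#!/usr/bin/env bash
# Assumption: the agent name prefix, inferred from the sample job names.
AGENT_NAME="${AGENT_NAME:-joshua-agent}"

# Hypothetical helper: reads listing rows on stdin and prints the names of
# rows whose second column is "Failed" and whose name looks like
# <AGENT_NAME>-<digits> or <AGENT_NAME>-<digits>-<digits>.
failed_agent_jobs() {
  awk -v re="^${AGENT_NAME}-[0-9]+(-[0-9]+)?$" \
    '$2 == "Failed" && $1 ~ re { print $1 }'
}
```

For example, `kubectl get jobs -n "${namespace}" --no-headers | failed_agent_jobs` would print one matching job name per line, provided the listing includes a status column as in the rows above.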


# Clean up jobs that have explicitly Failed.
# Filter by AGENT_NAME and by the job status condition "Failed"="True".
for job in $(kubectl get jobs -n "${namespace}" -o jsonpath='{range .items[?(@.status.conditions[*].type=="Failed" && @.status.conditions[*].status=="True")]}{.metadata.name}{"\n"}{end}' 2>/dev/null | { grep -E "^${AGENT_NAME}-[0-9]+(-[0-9]+)?$" || true; }); do
Member


This will get a bit tricky in bash, but I think we should delay the deletion of those failed jobs by 1 day to give some time for debugging (I'm not sure if we store the logs of the failed jobs somewhere).
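
One possible way to do the delay in bash is sketched below; it is an illustration, not the change in this PR. It assumes GNU `date` (for `date -d`), treats the `lastTransitionTime` of the job's `Failed` condition as the failure time, and uses hypothetical helper names (`failed_jobs_older_than`, `cleanup_old_failed_jobs`). The one-day threshold is 86400 seconds.

```shell
#!/usr/bin/env bash
# Sketch only: helper names are hypothetical; requires GNU date for `date -d`.

# Reads "name<TAB>failed-status<TAB>lastTransitionTime" lines on stdin and
# prints the names of jobs whose Failed condition is older than max_age
# seconds. The optional second argument overrides "now" (epoch seconds).
failed_jobs_older_than() {
  local max_age="$1" now="${2:-$(date -u +%s)}"
  local name status ts failed_at
  while IFS=$'\t' read -r name status ts; do
    # Skip jobs with no Failed=True condition or no timestamp.
    [[ "${status}" == "True" && -n "${ts}" ]] || continue
    # Skip timestamps that date cannot parse.
    failed_at=$(date -u -d "${ts}" +%s 2>/dev/null) || continue
    if (( now - failed_at > max_age )); then
      printf '%s\n' "${name}"
    fi
  done
}

# Example wiring (defined here, not invoked): delete this agent's jobs that
# failed more than one day ago.
cleanup_old_failed_jobs() {
  local namespace="$1" agent_name="$2" job
  kubectl get jobs -n "${namespace}" -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.conditions[?(@.type=="Failed")].status}{"\t"}{.status.conditions[?(@.type=="Failed")].lastTransitionTime}{"\n"}{end}' \
    | failed_jobs_older_than 86400 \
    | { grep -E "^${agent_name}-[0-9]+(-[0-9]+)?$" || true; } \
    | while read -r job; do
        kubectl delete job "${job}" -n "${namespace}" --ignore-not-found=true
      done
}
```

Keeping the age check in a function that reads plain text makes it easy to try without a cluster, for instance by piping a few hand-written lines into `failed_jobs_older_than 86400`.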


2 participants