Skip to content

Create fetch_job_status.py #833

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft
wants to merge 42 commits into
base: main
Choose a base branch
from
Draft

Conversation

araravik-psd
Copy link

@araravik-psd araravik-psd commented Jun 13, 2025

Creating a fetch_job_status.py python script to obtain Github actions job status output in the Workflow.

Adding a python helper script to be used as part of the Github actions worfklow to fetch job status for different jobs run as part of the workflow.

Sample Workflow changes to be added

- name: Determine job status for workflow run if: '!cancelled()' env: run_id: ${{ github.RUN_ID }} attempt: ${{ github.ATTEMPT }} run: python build_tools/github_actions/fetch_job_status.py

Creating a fetch_job_status.py python script to obtain Github actions job status output in the Workflow.
Copy link
Contributor

@geomin12 geomin12 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Couple of questions in comments:

  1. Will this be included in any workflow? This feels somewhat incomplete as it isn't tied in anywhere


def run():
# TODO: Remove hardcoding for ATTEMPT once github variables fetch is fixed
github_release_url = f"https://api.github.com/repos/RoCm/TheRock/actions/runs/{RUN_ID}/attempts/1/jobs"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will this be used after the job is completed and getting the github status? or will this be a cronjob?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will be used as part of the workflow and be added in after the job is completed to get the status of different steps in each job

This is the example of the workflow step

- name: Determine job status for workflow run if: '!cancelled()' env: run_id: ${{ github.RUN_ID}} attempt: ${{ github.ATTEMPT }} run: python build_tools/github_actions/fetch_job_status.py

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it! Should we include that change now? And make sure this works properly?


def run():
# TODO: Remove hardcoding for ATTEMPT once github variables fetch is fixed
github_release_url = f"https://api.github.com/repos/RoCm/TheRock/actions/runs/{RUN_ID}/attempts/1/jobs"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

update variable github_release_url to be more relevant :)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sure will update the variable to "githib_run_url".

)

job_data = json.loads(response.read().decode("utf-8"))
if job_data["jobs"].keys() >= 0:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should this be checking len(job_data["jobs"].keys())?

I am fairly certain job_data["jobs"].keys() will result into a list of strings, and comparing it to >= won't be valid and will error in type issues

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi Geo, this is a typo, wanted to to use len and missed it out at the end. Let me make the change and resubmit.

if len(job_data['jobs']).keys()>0


job_data = json.loads(response.read().decode("utf-8"))
if job_data["jobs"].keys() >= 0:
# Determine is number of jobs run in the workflow is atleast 1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we clarify this comment? reading it is a bit difficult to understand

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am checking if the workflow's Github API status output shows atleast 1 job as run and getting populated in API the output.

# Determine if number of jobs run in the workflow is atleast 1

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, can you correct that? I see "is", it should be "if", and can you specify what kind of jobs?

job_data = json.loads(response.read().decode("utf-8"))
if job_data["jobs"].keys() >= 0:
# Determine is number of jobs run in the workflow is atleast 1
set_github_output({"append": json.dumps(job_data)})
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

set_github_output sets an output variable for other jobs to use. will this data used in other places?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am planning to use the set_github_output variable in the next step of the workflow file where I write it to a Json file in S3, which will then be used by the Redshift database service to build a dashboard with the same.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you write down that plan somewhere (e.g. a GitHub issue)? I'd like to see what pieces you plan on building to make reviewing them easier.

If you are working within the context of a running workflow, there are ways to get job status without making an API request, like this: https://github.com/iree-org/iree/blob/main/.github/workflows/workflow_summary.yml (see iree-org/iree#19444 for context there)

@geomin12
Copy link
Contributor

also pls fix pre-commit :)

then pre-commit run --all-files

job_data = json.loads(response.read().decode("utf-8"))
if job_data["jobs"].keys() >= 0:
# Determine is number of jobs run in the workflow is atleast 1
set_github_output({"append": json.dumps(job_data)})
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also noticed this on the call, but append is vague. can we call it a better variable name?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Couple of questions in comments:

  1. Will this be included in any workflow? This feels somewhat incomplete as it isn't tied in anywhere

For the file path, this script is simlar: https://github.com/ROCm/TheRock/blob/main/build_tools/hack/delete_workflow.sh

Copying one of my comments discussing that:

That sounds similar to some other utility scripts I've used like https://github.com/iree-org/iree/blob/main/build_tools/scripts/git/prune_old_tags.sh .

Could go in https://github.com/ROCm/TheRock/tree/main/build_tools/github_actions , or a similar subfolder under build_tools/ . It be -1 on a top level hack/ folder, but top level scripts/ would be okay, though there would be some overlap with build_tools/

Once we have more similar files we can decide where best to put them. Hard to commit with just a few.

- ATTEMPT
"""

from configure_ci import set_github_output
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should pull set_github_output into a utils file instead of sharing across files this way.

Comment on lines +32 to +39
if response.status == 403:
raise Exception(
f"Error when retrieving GitHub response. This is most likely a rate limiting issue, so please try again"
)
elif response.status != 200:
raise Exception(
f"Error when retrieving GitHub response assets for {RUN_ID} tag with status code {response.status}. Exiting..."
)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any more information in the response that could be logged here? Could also raise more specific Exception subclasses.

job_data = json.loads(response.read().decode("utf-8"))
if job_data["jobs"].keys() >= 0:
# Determine is number of jobs run in the workflow is atleast 1
set_github_output({"append": json.dumps(job_data)})
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you write down that plan somewhere (e.g. a GitHub issue)? I'd like to see what pieces you plan on building to make reviewing them easier.

If you are working within the context of a running workflow, there are ways to get job status without making an API request, like this: https://github.com/iree-org/iree/blob/main/.github/workflows/workflow_summary.yml (see iree-org/iree#19444 for context there)


def run():
# TODO: Remove hardcoding for ATTEMPT once github variables fetch is fixed
github_workflow_jobs_url = f"https://api.github.com/repos/RoCm/TheRock/actions/runs/{RUN_ID}/attempts/1/jobs"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: caps

Suggested change
github_workflow_jobs_url = f"https://api.github.com/repos/RoCm/TheRock/actions/runs/{RUN_ID}/attempts/1/jobs"
github_workflow_jobs_url = f"https://api.github.com/repos/ROCm/TheRock/actions/runs/{RUN_ID}/attempts/1/jobs"

Comment on lines +15 to +16
RUN_ID = os.getenv("RUN_ID")
ATTEMPT = os.getenv("ATTEMPT")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Relating to my other comment about seeing the whole plan before reviewing individual scripts, if this script runs outside of the workflow in question, I would expect arguments to be passed through the CLI with argparse. Since these are environment variables that GitHub provides, I'm wondering if you even need to make an API request to get the job results.

@ScottTodd
Copy link
Member

ScottTodd commented Jun 16, 2025

Lots of commit activity here that looks unrelated to the code review. Can you convert to a draft PR and/or switch your debugging to another branch and use workflow_dispatch instead of pull_request triggers to test workflows?

@ScottTodd
Copy link
Member

Lots of commit activity here that looks unrelated to the code review. Can you convert to a draft PR and/or switch your debugging to another branch and use workflow_dispatch instead of pull_request triggers to test workflows?

Bump. @araravik-psd every time you push a commit, it sends a github notification, posting on Discord and sending emails to repository watchers. If these commits are not relevant for the code review, please batch them or push them to a different branch.

@araravik-psd araravik-psd marked this pull request as draft June 18, 2025 18:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: TODO
Development

Successfully merging this pull request may close these issues.

3 participants