
DM-50550: Add exception diagnostics table to quantum provenance #495


Open: wants to merge 3 commits into main

Conversation

@enourbakhsh (Contributor) commented May 8, 2025

Checklist

  • ran Jenkins
  • added a release note for user-visible changes to doc/changes


codecov bot commented May 8, 2025

Codecov Report

Attention: Patch coverage is 9.37500% with 87 lines in your changes missing coverage. Please review.

Project coverage is 82.21%. Comparing base (924f491) to head (adb8ea2).

✅ All tests successful. No failed tests found.

Files with missing lines                            | Patch % | Lines
python/lsst/pipe/base/quantum_provenance_graph.py   | 9.37%   | 87 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #495      +/-   ##
==========================================
- Coverage   82.68%   82.21%   -0.47%     
==========================================
  Files         111      111              
  Lines       14334    14428      +94     
  Branches     1834     1862      +28     
==========================================
+ Hits        11852    11862      +10     
- Misses       2032     2117      +85     
+ Partials      450      449       -1     


@enourbakhsh enourbakhsh force-pushed the tickets/DM-50550 branch 16 times, most recently from a513b9c to 237da91 Compare May 11, 2025 22:40
# lookup later. Querying per data ID in the loop is painfully slow.
if butler:
    exposure_record_lookup = {
        d.dataId["exposure"]: d
        for d in butler.query_dimension_records("exposure", explain=False)
    }
@enourbakhsh (Contributor, Author) commented May 12, 2025:

I wish I could limit `butler.query_dimension_records` to just the data IDs that have exceptions in `task_summary.exceptions`, instead of pulling everything. But `butler.query_dimension_records` only accepts a single `dataId` if one is passed, and I don't think `where` can filter on these values. Still, it works and is fast enough.

@enourbakhsh (Contributor, Author) commented:

Maybe @timj knows?

A Member commented:

This works fine for me if you already know the exposure IDs:

>>> butler.query_dimension_records("exposure", where="exposure in (:exp)", bind={"exp": (903334, 903336)}, instrument="HSC")

A Member commented:

Use

with butler.query() as query:
    query = query.join_data_coordinates(<list of exposure data IDs>)
    ...

A Member commented:

Actually, change .required to .mapping in this line:

self.caveats.setdefault(code, []).append(dict(info["data_id"].required))

and then the data IDs you're starting with should have all of those implied values. They're already in the data IDs in the QG, but we were dropping them on the floor.
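The distinction can be sketched with plain dicts (illustrative values only; real data IDs are `DataCoordinate` objects from daf_butler, where `.required` holds only the identifying values and `.mapping` also includes implied ones):

```python
# Hypothetical data ID contents, for illustration only.
required = {"instrument": "HSC", "exposure": 903334, "detector": 42}
implied = {"band": "i", "physical_filter": "HSC-I", "day_obs": 20130617}

# .mapping-style view: required plus implied values.
mapping = {**required, **implied}

# Switching from .required to .mapping keeps the implied values
# instead of dropping them on the floor.
assert set(mapping) - set(required) == {"band", "physical_filter", "day_obs"}
```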

@enourbakhsh (Contributor, Author) commented:

Will give it a try!

@enourbakhsh (Contributor, Author) commented:

Done!

@enourbakhsh enourbakhsh force-pushed the tickets/DM-50550 branch 2 times, most recently from 39d0914 to 0b5f032 Compare May 13, 2025 20:38
# output table, if available. Note that 'band', 'day_obs', and
# 'physical_filter' already exist in `exception.data_id` below.
needed_visit_records = ["exposure_time", "target_name"]
needed_exposure_records = ["exposure_time", "target_name"]
@TallJimbo (Member) commented May 14, 2025:

These sorts of strings should be passed in by the caller, not included in the code; I'm imagining something like an argument that's a list of extra quantities to put in the table.

You might then be able to use:

https://github.com/lsst/daf_butler/blob/07d9adb98d9bf763df84e9f7db83d4cfbb6c53d7/python/lsst/daf/butler/queries/_identifiers.py#L80

If you can initialize that with some dimensions (i.e. from the data IDs that have a particular exception), you can give it strings like "exposure_time" or "target_name" and it will figure out which dimension they belong to (of the ones it has been given, since those kinds of columns aren't always unique).

A Member commented:

We modified the code so that here it calls an entry point that returns a dict of dimensions with list values of the metadata fields to be included.
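A hypothetical sketch of what such an entry point could return (the function name and field values are illustrative, not the real implementation): a dict keyed by dimension name, with list values naming the metadata fields to include in the table.

```python
# Illustrative only: an entry point returning, per dimension, the
# metadata fields to add as columns in the diagnostics table.
def extra_diagnostic_fields() -> dict[str, list[str]]:
    return {
        "visit": ["exposure_time", "target_name"],
        "exposure": ["exposure_time", "target_name"],
    }

fields = extra_diagnostic_fields()
```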

A Member commented:

@enourbakhsh I don't see the new code we worked on here. Did you forget to push it? It was a big change.

@enourbakhsh (Contributor, Author) commented:

@timj by the time I had it polished, @cmsaunders was already in the middle of the first review round. I didn’t want to interfere, so I’m planning to push those changes in the second round.

@@ -655,7 +660,7 @@ def _add_quantum_info(
             self.recovered_quanta.append(dict(info["data_id"].required))
         if final_quantum_run is not None and final_quantum_run.caveats:
             code = final_quantum_run.caveats.concise()
-            self.caveats.setdefault(code, []).append(dict(info["data_id"].required))
+            self.caveats.setdefault(code, []).append(dict(info["data_id"].mapping))


Why did you need to change this?

butler : `lsst.daf.butler.Butler`, optional
The butler used to create this summary. This is only used to get
exposure dimension records for the exception diagnostics.
return_exception_diagnostics_table : `bool`, optional


I don't think there's any advantage to having this option. If show_exception_diagnostics is true, then you can return this, and it can just be ignored if it's not needed. Then you can avoid some of the if/elses and the ValueError below.
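The suggested simplification can be sketched like this (all names and payloads are made up for illustration; the point is that callers who don't need the table can just ignore the second return value, so no extra flag or `ValueError` branch is needed):

```python
# Illustrative sketch: return the table whenever diagnostics are
# requested, instead of gating it behind a separate boolean.
def summarize(show_exception_diagnostics: bool = False):
    summary = {"n_tasks": 3}  # placeholder for the real summary payload
    table = [] if show_exception_diagnostics else None
    return summary, table

summary, table = summarize()  # table is None and can simply be ignored
summary, table = summarize(show_exception_diagnostics=True)
```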

-------
table : `astropy.table.Table`
Table with one row per data ID and columns for exception types (by
task), and optionally, exposure dimension records and exception


It would be more clear here to write "columns for exception type and, optionally, dimension records and exception message."

# Define a hashable and stable tuple of data ID values.
key = tuple(sorted(data_id.mapping.items()))
assert len(rows[key]) == 0, f"Multiple exceptions for one data ID: {key}"
assert rows[key]["Exception"] == "", f"Duplicate entry for data ID {key} in {task_label}"


I don't understand how the previous line could pass and then this line fail. If len(rows[key]) ==0, doesn't that mean that nothing has been set for rows[key]["Exception"]?

row.update(dict(key))
# Add dimension records next, if requested and available.
if add_dimension_records:
    if visit_data_ids:


Don't you need to check whether the data_id for this row is in visit_data_ids (or in exposure_data_ids at line 1289)?

@cmsaunders commented:

This all looks reasonable, but I am concerned about how it runs on qgraphs that are not per-detector. I ran it on a qgraph with tract-level tasks and didn't get the Task-Exception-Count table, or the new exception diagnostics table.

4 participants