Skip to content

[cont] Clear "C" TClonesArray: Remove object from TProcessID-table ; Allow removing without destructing #18439

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft
wants to merge 15 commits into
base: master
Choose a base branch
from

Conversation

ferdymercury
Copy link
Collaborator

@ferdymercury ferdymercury commented Apr 17, 2025

This Pull request:

Changes or fixes:

Explores suggestion by @olifre in https://its.cern.ch/jira/browse/ROOT-7249 to see if it impacts CI tests
And also allow removing without destructing via new ClearSlot function https://its.cern.ch/jira/browse/ROOT-7473

Checklist:

  • tested changes locally
  • updated the docs (if necessary)

Copy link

github-actions bot commented Apr 17, 2025

Test Results

    17 files      17 suites   3d 16h 10m 13s ⏱️
 2 745 tests  2 744 ✅ 0 💤  1 ❌
45 052 runs  45 035 ✅ 0 💤 17 ❌

For more details on these failures, see this check.

Results for commit 1348edd.

♻️ This comment has been updated with latest results.

@pcanal
Copy link
Member

pcanal commented Apr 17, 2025

This seems to investigate both https://its.cern.ch/jira/browse/ROOT-7249 and https://its.cern.ch/jira/browse/ROOT-7473. If so please update the description and the tickets accordingly. Thanks :)

@pcanal pcanal self-assigned this Apr 17, 2025
Copy link
Member

@pcanal pcanal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will (of course :)) needs new tests checking that the new code does what we hope it does. Thanks.

@pcanal pcanal moved this to In progress in Fixed in 6.38.00 Apr 17, 2025
@ferdymercury
Copy link
Collaborator Author

This will (of course :)) needs new tests checking that the new code does what we hope it does. Thanks.

It's already breaking some tests, so not sure if the JIRA issue should be close as won't fix. Or be an opt-in via a bool argument.

@olifre
Copy link
Contributor

olifre commented Apr 17, 2025

It's already breaking some tests, so not sure if the JIRA issue should be close as won't fix. Or be an opt-in via a bool argument.

I did not do a very deep investigation of the test failures, but at first glance, it might be related to a missing implementation for the second part of the suggestion in https://its.cern.ch/jira/browse/ROOT-7249 , i.e. clearing TRef / TRefArray references to objects which have been removed (given that the Event class in the test case keeps TRefArray to the tracks which it exploded on in one of the tests).
Sadly, I did not come up with a clean implementation apart from triggering all TRef / TRefArray to re-check their reference targets at a point in time defined by the user (as it is a heavy operation) and adjust the UinqueID if they are gone.

@olifre
Copy link
Contributor

olifre commented Apr 17, 2025

And of course, thanks for coming back to these issue reports 👍 .

@ferdymercury ferdymercury changed the title [cont] Clear "C" TClonesArray: Remove object from TProcessID-table [cont] Clear "C" TClonesArray: Remove object from TProcessID-table ; Allow removing without destructing Apr 17, 2025
ferdymercury and others added 2 commits April 17, 2025 22:25
Co-authored-by: Philippe Canal <[email protected]>
… to be left to the user's discretion.


In addition we need to record that the content was updated.

Co-authored-by: Philippe Canal <[email protected]>
@ferdymercury
Copy link
Collaborator Author

Test failires are gone, thanks pcanal! Not sure either though how to cleanup the TRefArrays mentioned by olifre.
@olifre do you have maybe a minimal reproducer that we could transform into a Ctest for this? Thanks a lot for your patience.

@olifre
Copy link
Contributor

olifre commented Apr 18, 2025

@olifre do you have maybe a minimal reproducer that we could transform into a Ctest for this? Thanks a lot for your patience.

I collected together a testcase from 10 years ago, it's quite minimal, but still spread across different files it requires:

  • A data object class (class verison > 0 ) with a TRefArray inside which can be streamed out.
  • A small program streaming two TClonesArray out with such objects, referencing each other in some of the "events", and purging some of the referenced objects before writeout. Also, object ID needs to be reset each "event".
  • A small program reading in the file again, and checking the references against the expectation.

Sadly, my old code is not ready for copy and paste into a CTest, but here it is:
streamwithrefs.tar.gz
Using this with:

make && LD_LIBRARY_PATH=$(pwd) ./streamwithrefs && LD_LIBRARY_PATH=$(pwd) ./readwithrefs

will write out a small file test.root with TestObj objects inside, referencing each other, while the referenced object was purged in each odd event.
readwithrefs will read that back in, check the TRefArray size, and if that's larger than 0 and no referenced objects were read in, tries to access the referenced object, following a reference to an object from a wrong event.

I hope that helps — in the end, an assert when Wrong object referenced is triggered could become the ctest, but I'm not really experienced with CTests so I would not be sure how to express this multi-step test (one process to write and another to read, and also having a compiled data objects with dictionary) in a CTest.

@ferdymercury
Copy link
Collaborator Author

That's already a huge help thanks a lot

@ferdymercury ferdymercury requested a review from pcanal April 21, 2025 10:38
@ferdymercury ferdymercury marked this pull request as ready for review April 21, 2025 10:38
if (pruneSecondChildren && (e % 2 != 0)) {
children.Clear("C");
}

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@olifre could you explain at this line how your "second workaround" was working? I guess that, before calling dataTree.Fill(), you were walking in a for-loop the motherPart.GetChild() checking whether the stored process ID still existed in the static TProcess table ?

Copy link
Contributor

@olifre olifre Apr 21, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ferdymercury Actually, since this was in an experiment-specific framework and I did not trust all users to do it right, I chose the following approach:

  • Have two classes inheriting from TRef and TRefArray.
  • Via an interface class (to prevent code duplication), these two classes had a hook when any reference was set, injecting themselves into a global static list of instances referencing objects.
  • Before writeout, this static list of referencing instances is walked and each instance is asked to check the references, i.e. in the case of the class inheriting form TRefArray:
void CBTRefArray::fMaybePruned() {
        for (Int_t i = 0; i < fSize; i++) {
                if (fUIDs[i]) {
                        /* Re-check reference-target, might have become NULL due to pruning -
                         * in that case, store that in our UniqueID.
                         */
                        if (At(i) == nullptr) {
                                fUIDs[i] = 0;
                        }
                }
        }
}

(and the same without loop for TRef).

So in the end, it was the loop you described, but actually the TRef classes themselves were checking their reference targets and in case these had become nullptr they adapted their internally stored UID before writeout.
I'm not sure this can realistically be put into the Streamers, as the reference target may have been written out and deleted from memory already, which is why I used a "user hook before writeout" to trigger this check.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

A bit related issue: https://its.cern.ch/jira/browse/ROOT-6077

@ferdymercury ferdymercury added this to the 6.38.00 milestone Apr 22, 2025
@ferdymercury ferdymercury marked this pull request as draft April 24, 2025 07:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants