Add splitting functionality to curation and SortingAnalyzer #3817


Merged (26 commits) on Jul 21, 2025

Conversation

alejoe91 (Member):

Depends on #3760

Implements splitting functionality in the curation module and the SortingAnalyzer.

@alejoe91 added the curation label (Related to curation module) on Mar 28, 2025
Comment on lines 597 to 599
old_template = arr[self.sorting_analyzer.sorting.ids_to_indices([split_unit_id])[0], ...]
new_indices = np.array([new_unit_ids.index(unit_id) for unit_id in new_splits])
new_array[new_indices, ...] = np.tile(old_template, (len(new_splits), 1, 1))
Member Author:

@samuelgarcia this needs to be discussed. What should we do if the waveforms extension is not there? The current behavior is copying, but we might want to force a recompute here.

Member Author:

Maybe add a warning here!
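
A minimal sketch of such a warning (the placement and message wording are assumptions, not the merged implementation; `self` is taken to be the templates extension being split):

import warnings

if not self.sorting_analyzer.has_extension("waveforms"):
    # warn that templates are copied from the parent unit instead of recomputed
    warnings.warn(
        "Splitting without the 'waveforms' extension: the parent unit's template "
        "is copied to each split unit. Recompute the extension for exact values."
    )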

@samuelgarcia (Member):

This is amazing, comrade.

I would first do a PR focused only on the format, and then merge this one. What do you think?

@alejoe91 (Member Author):

I did, comrade! #3760

@alejoe91 added this to the 0.103.0 milestone on Jun 11, 2025
Comment on lines +71 to +74
full_spike_indices = []
for label in np.unique(self.labels):
label_indices = np.where(self.labels == label)[0]
full_spike_indices.append(label_indices)
Member Author:

Maybe make a helper function that returns a dict of indices given a label array, as sketched below.
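
A possible shape for that helper (the name is hypothetical):

import numpy as np

def get_indices_by_label(labels):
    # map each unique label to the positions where it occurs in the array
    return {label: np.flatnonzero(labels == label) for label in np.unique(labels)}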

# we propagate random spikes so they are not recomputed
extension_dict_ = extension_dict_split.copy()
extension_dict_.pop("random_spikes")
analyzer_hard.extensions["random_spikes"] = analyzer_split.extensions["random_spikes"]
Member Author:

Do a copy here and re-reference it to the analyzer.
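
Something like this, with a shallow copy re-pointed at the new analyzer (assuming `sorting_analyzer` is the extension's back-reference attribute):

import copy

ext_copy = copy.copy(analyzer_split.extensions["random_spikes"])
ext_copy.sorting_analyzer = analyzer_hard  # re-reference the copy to the new analyzer
analyzer_hard.extensions["random_spikes"] = ext_copy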



### SPLITTING ZONE ###
def apply_splits_to_sorting(sorting, unit_splits, new_unit_ids=None, return_extra=False, new_id_strategy="append"):
Member:

Maybe a small docstring, no?
You are becoming lazier than me.

Member Author:

Added.
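
For reference, a docstring along these lines (a sketch: the parameter descriptions are inferred from the signature and the discussion, not quoted from the merged code):

def apply_splits_to_sorting(sorting, unit_splits, new_unit_ids=None, return_extra=False, new_id_strategy="append"):
    """
    Apply splits to a sorting and return a new sorting with the split units.

    Parameters
    ----------
    sorting : BaseSorting
        The sorting object to split.
    unit_splits : dict
        Mapping from unit id to the spike indices of each split. The indices
        must be complete, i.e. cover every spike of the unit.
    new_unit_ids : list | None, default: None
        Ids of the new units. If None, they are generated following `new_id_strategy`.
    return_extra : bool, default: False
        If True, also return the list of new unit ids.
    new_id_strategy : str, default: "append"
        Strategy used to generate new unit ids when `new_unit_ids` is None.
    """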

new_unit_ids.append([str(m + i) for i in range(num_splits)])
else:
# we cannot automatically find new names
new_unit_ids.append([f"split{i}" for i in range(num_splits)])
Member:

Maybe we could also include the old unit id... not sure. Something like the sketch below.
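
For example (a hypothetical variant; `unit_to_split` names the parent unit id):

# unit "7" split in two would become "7-split0", "7-split1"
new_unit_ids.append([f"{unit_to_split}-split{i}" for i in range(num_splits)])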

else:
# dtype int
new_unit_ids.append(list(max(old_unit_ids) + 1 + np.arange(num_splits, dtype=dtype)))
old_unit_ids = np.concatenate([old_unit_ids, new_unit_ids[-1]])
Member:

Using this variable name for the growing set of new unit ids is a bit confusing on first read.

Member:

And the [-1] is not obvious; a more explicit spelling is sketched below.
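
Roughly the same logic with explicit names (variable names are suggestions only):

# ids already taken, including those created by previous splits
ids_for_this_split = list(max(taken_unit_ids) + 1 + np.arange(num_splits, dtype=dtype))
new_unit_ids.append(ids_for_this_split)
# grow the pool of taken ids so the next split starts after these
taken_unit_ids = np.concatenate([taken_unit_ids, ids_for_this_split])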

split_2 = split_2[~np.isin(split_2, split_indices)]
new_split_indices = [split_indices, split_2]
else:
new_split_indices = split_indices
Member:

Do we need to check completeness here?

Member Author:

Hi comrade. I just made some changes and added a completeness check in apply_splits_to_sorting and SortingAnalyzer.split_units.

This forces the user to provide complete spike indices, which is done automatically through the curation module and the apply_curation function.
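
The check could look roughly like this (a sketch, assuming `split_indices` is the list of per-split index arrays and `num_spikes` is the unit's total spike count):

all_indices = np.concatenate(split_indices)
if not np.array_equal(np.sort(all_indices), np.arange(num_spikes)):
    # also catches duplicated or out-of-range indices, not just missing ones
    raise ValueError(
        f"Split indices for unit {unit_id} are incomplete: every spike must "
        "be assigned to exactly one split."
    )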

keep_mask=keep_mask,
verbose=verbose,
**job_kwargs,
)
elif merging_mode == "hard":
recompute_dict[extension_name] = extension.params
else:
# split
try:
Member:

A try is not a good idea here for this new mechanism; we must catch any new bug!

Member Author:

Removed the try, and added a splitting mode.
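
So the split branch now runs bare, roughly like this (an excerpt-style sketch; the handler name is a placeholder, not the real API):

else:
    # splitting mode: no try/except wrapper, so any new bug in the
    # split mechanism surfaces immediately instead of being swallowed
    split_extension(extension)  # hypothetical handler, for illustration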

assert analyzer_curated.sorting.get_property("pyramidal", ids=[unit_id])[0]


def test_apply_curation_with_split_multi_segment():
Member:

I would make only multi-segment tests everywhere, to avoid separate mono/multi-segment tests.
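
E.g. building every test fixture with more than one segment (a sketch using spikeinterface's generate_sorting; the exact parameters are illustrative):

from spikeinterface.core import generate_sorting

# two segments; single-segment behavior is then covered as a special case
sorting = generate_sorting(num_units=5, durations=[10.0, 5.0], seed=0)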

{
"unit_id": 2,
"mode": "labels",
"labels": split_labels.tolist(),
Member:

Should we also test the other representation, by indices?
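
i.e. also exercising something like this (the key names for the indices form are assumed by analogy with the labels form; the index arrays are hypothetical):

{
    "unit_id": 2,
    "mode": "indices",
    "indices": [first_half_indices.tolist(), second_half_indices.tolist()],
}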

@samuelgarcia merged commit 7c91082 into SpikeInterface:main on Jul 21, 2025.
15 checks passed
@samuelgarcia (Member):

yeah!
