Automated Curation - auto_label_units and export_to_phy #4195

@ASAP-Cord

I have trained a model for automated curation, created a sorting analyzer for a session I want to curate, and ran the auto_label_units function.

With export_to_phy=True, auto_label_units fails during the export step and raises the following error:

import spikeinterface.curation as sc  # assumed alias, inferred from the `sc.` prefix

labels = sc.auto_label_units(
    sorting_analyzer=sorting_analyzer,
    model_folder=model_path,
    export_to_phy=True,
    trusted=['numpy.dtype']
)

print(labels.head())
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
~\AppData\Local\Temp\ipykernel_24256\272560651.py in ?()
----> 1 labels = sc.auto_label_units(
      2     sorting_analyzer=sorting_analyzer,
      3     model_folder=model_path,
      4     export_to_phy=True,

~\Miniforge\envs\cura\Lib\site-packages\spikeinterface\curation\model_based_curation.py in ?(sorting_analyzer, model_folder, model_name, repo_id, label_conversion, trust_model, trusted, export_to_phy, enforce_metric_params)
    273         raise ValueError("The model must be an instance of sklearn.pipeline.Pipeline")
    274 
    275     model_based_classification = ModelBasedClassification(sorting_analyzer, model)
    276 
--> 277     classified_units = model_based_classification.predict_labels(
    278         label_conversion=label_conversion,
    279         export_to_phy=export_to_phy,
    280         model_info=model_info,

~\Miniforge\envs\cura\Lib\site-packages\spikeinterface\curation\model_based_curation.py in ?(self, label_conversion, input_data, export_to_phy, model_info, enforce_metric_params)
    122         self.sorting_analyzer.sorting.set_property("classifier_label", predictions)
    123         self.sorting_analyzer.sorting.set_property("classifier_probability", probabilities)
    124 
    125         if export_to_phy:
--> 126             self._export_to_phy(classified_units)
    127 
    128         return classified_units

~\Miniforge\envs\cura\Lib\site-packages\spikeinterface\curation\model_based_curation.py in ?(self, classified_units)
    193 
    194         import pandas as pd
    195 
    196         # Create a new DataFrame with unit_id, prediction, and probability columns from dict {unit_id: (prediction, probability)}
--> 197         classified_df = pd.DataFrame.from_dict(classified_units, orient="index", columns=["prediction", "probability"])
    198 
    199         # Export to Phy format
    200         try:

~\Miniforge\envs\cura\Lib\site-packages\pandas\core\frame.py in ?(cls, data, orient, dtype, columns)
   1905         orient = orient.lower()  # type: ignore[assignment]
   1906         if orient == "index":
   1907             if len(data) > 0:
   1908                 # TODO speed up Series case
-> 1909                 if isinstance(next(iter(data.values())), (Series, dict)):
   1910                     data = _from_nested_dict(data)
   1911                 else:
   1912                     index = list(data.keys())

TypeError: 'numpy.ndarray' object is not callable
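
The last frame fails on `data.values()`, which is what happens when `DataFrame.from_dict` receives a DataFrame instead of a dict: on a DataFrame, `.values` is an array attribute, not a method. This suggests (as an inference from the traceback, not something confirmed in the spikeinterface source) that `classified_units` is already a DataFrame by the time `_export_to_phy` is called. A minimal pandas-only sketch of that situation, with made-up labels standing in for the real predictions:

```python
import pandas as pd

# Hypothetical stand-in for the object handed to _export_to_phy;
# the unit ids and values here are invented for illustration.
classified_units = pd.DataFrame(
    {"prediction": ["good", "noise"], "probability": [0.93, 0.71]},
    index=[0, 1],
)

# Passing a DataFrame to from_dict(orient="index") calls data.values(),
# but DataFrame.values is an ndarray attribute, so the call fails.
try:
    pd.DataFrame.from_dict(
        classified_units, orient="index", columns=["prediction", "probability"]
    )
    error_message = None
except TypeError as err:
    error_message = str(err)

print(error_message)

# The same call succeeds on a plain dict {unit_id: (prediction, probability)},
# which is the shape the comment in the traceback describes.
as_dict = {
    unit_id: (row["prediction"], row["probability"])
    for unit_id, row in classified_units.iterrows()
}
classified_df = pd.DataFrame.from_dict(
    as_dict, orient="index", columns=["prediction", "probability"]
)
```

If this is indeed the cause, the failure would not depend on the trained model or on the session being curated.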

I have been trying to work around this issue by running auto_label_units with export_to_phy=False (without the export it runs) and manually attaching the resulting labels to the sorting_analyzer before exporting it to Phy:

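from pathlib import Path

import spikeinterface.full as si  # assumed alias, inferred from the `si.` prefix
import spikeinterface.curation as sc  # assumed alias, inferred from the `sc.` prefix

labels = sc.auto_label_units(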
    sorting_analyzer=sorting_analyzer,
    model_folder=model_path,
    export_to_phy=False,
    trusted=['numpy.dtype']
)

labels.to_csv(Path(path_phy) / "auto_label.csv", index_label="unit_id")

unit_ids = sorting_analyzer.sorting.unit_ids

# Attach the predicted label and its probability to each unit, one unit at a time
for unit_id, (_, row) in zip(unit_ids, labels.iterrows()):
    sorting_analyzer.sorting.set_property('auto_label', [row['prediction']], ids=[unit_id])
    sorting_analyzer.sorting.set_property('auto_label_confidence', [row['probability']], ids=[unit_id])

si.export_to_phy(
    sorting_analyzer=sorting_analyzer,
    output_folder=output_phy_path,
    remove_if_exists=True, copy_binary=False
)

Unfortunately, this does not seem to embed the labels in the sorting analyzer: the export contains the sorting analyzer without the labels (the entries in cluster_group.tsv remain unsorted).
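
One possible interim approach is to write the labels straight into the exported folder as extra Phy metadata files. The sketch below rests on several assumptions that are not verified here: that Phy picks up `cluster_<name>.tsv` files containing a `cluster_id` column plus one value column, that clusters in the exported folder are numbered 0..n-1 in the same order as the rows of `labels`, and that `labels` has the `prediction` and `probability` columns used above. The helper `write_phy_label_files`, the sample data, and the temporary folder are hypothetical and only serve the illustration.

```python
from pathlib import Path
import tempfile

import pandas as pd

# Hypothetical stand-in for the DataFrame returned by auto_label_units
labels = pd.DataFrame(
    {"prediction": ["good", "noise", "good"], "probability": [0.93, 0.71, 0.88]},
    index=pd.Index([3, 7, 12], name="unit_id"),
)


def write_phy_label_files(labels, phy_folder):
    """Write one tab-separated cluster_<name>.tsv file per label column."""
    phy_folder = Path(phy_folder)
    # Assumption: exported clusters are numbered 0..n-1 in the row order of `labels`
    cluster_ids = list(range(len(labels)))
    written = []
    for column, name in [
        ("prediction", "auto_label"),
        ("probability", "auto_label_confidence"),
    ]:
        table = pd.DataFrame(
            {"cluster_id": cluster_ids, name: labels[column].to_numpy()}
        )
        path = phy_folder / f"cluster_{name}.tsv"
        table.to_csv(path, sep="\t", index=False)
        written.append(path)
    return written


# A temporary folder stands in for the real Phy output folder
phy_folder = Path(tempfile.mkdtemp())
written_files = write_phy_label_files(labels, phy_folder)
```

In practice the folder would be the one passed as `output_folder` to `si.export_to_phy`, and the files would be written after the export so that `remove_if_exists=True` does not delete them.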

I would appreciate any help with addressing the initial error or finding a different approach to exporting the curated/labeled sorting analyzer to Phy.
