CAP.return_cap_labels
- CAP.return_cap_labels(subject_timeseries, runs=None, continuous_runs=False, shift_labels=False)
Return CAP Labels for Each Subject.
Uses the group-specific k-means models in self.kmeans to assign each frame (TR) to a CAP for each subject in self.subject_table. The process involves the following steps:
1. Retrieve the timeseries for a specific subject’s run from subject_timeseries.
2. Determine the subject’s group assignment using self.subject_table and, if standardize was set to True in self.get_caps(), scale the timeseries data using the means and standard deviations derived from the group-specific concatenated dataframes (self.means and self.stdev).

   Note
   This scaling ensures the subject’s data matches the distribution of the input data used for group-specific clustering, which is needed for accurate predictions when using group-specific k-means models.

3. Use the group-specific k-means model (self.kmeans) and the predict() function from scikit-learn’s KMeans to assign each frame (TR) to a CAP.
4. If shift_labels is True, apply a one-unit shift so that the minimum label starts at “1” instead of “0”.
5. Repeat steps 1-4 for the subject’s remaining runs (all runs if runs is None, otherwise only the specified runs).
6. If continuous_runs is True, stack each NumPy array horizontally to create a single array containing the predicted labels for the subject.
7. Repeat steps 1-6 for the remaining subjects.
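The steps above can be sketched for a single subject as follows. This is a minimal illustration and not the library's implementation: the helper `predict_cap_labels`, the random stand-in data, the run IDs, and the choice of `ddof=1` for the standard deviation are all assumptions made for the example. Only `KMeans.fit` and `KMeans.predict` from scikit-learn are real API calls.

```python
import numpy as np
from sklearn.cluster import KMeans


def predict_cap_labels(timeseries, kmeans, mean=None, stdev=None, shift_labels=False):
    """Hypothetical helper: assign each frame (row) of a TRs x ROIs array to a cluster."""
    data = timeseries
    if mean is not None and stdev is not None:
        # Scale with the statistics of the data the model was fitted on (step 2)
        data = (timeseries - mean) / stdev
    labels = kmeans.predict(data)  # step 3
    # Optional shift so the smallest possible label is 1 instead of 0 (step 4)
    return labels + 1 if shift_labels else labels


rng = np.random.default_rng(0)

# Stand-in for one group's concatenated timeseries (random data, 5 ROIs)
group_data = rng.normal(size=(200, 5))
mean = group_data.mean(axis=0)
stdev = group_data.std(axis=0, ddof=1)  # ddof=1 is an assumption of this sketch
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit((group_data - mean) / stdev)

# One subject with two runs (run IDs are illustrative)
subject_runs = {
    "run-1": rng.normal(size=(10, 5)),
    "run-2": rng.normal(size=(12, 5)),
}

# Steps 1-5: predict labels run by run
labels_per_run = {
    run_id: predict_cap_labels(ts, kmeans, mean, stdev, shift_labels=True)
    for run_id, ts in subject_runs.items()
}

# Step 6: stack the per-run label vectors into one continuous vector
continuous = np.hstack(list(labels_per_run.values()))
```

Scaling with the group statistics rather than the subject's own keeps the subject's frames in the same space as the cluster centroids, which is the point made in the note under step 2.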
- Parameters:
subject_timeseries (SubjectTimeseries or str) – A dictionary mapping subject IDs to their run IDs and their associated timeseries (TRs x ROIs) as a NumPy array. Can also be a path to a serialized file containing this same structure. Refer to documentation for SubjectTimeseries in the “See Also” section for an example structure.
runs (int, str, list[int], list[str], or None, default=None) – The run IDs to return CAP labels for (e.g. runs=[0, 1] or runs=["01", "02"]). If None, CAP labels will be returned for all detected run IDs even if only specific runs were used during self.get_caps().
continuous_runs (bool, default=False) – If True, all runs will be treated as a single, uninterrupted run.
    # CAP assignment of frames for run_1 and run_2
    run_1 = [0, 1, 1]
    run_2 = [2, 3, 3]
    # Computation of each CAP metric will be conducted on the combined vector
    continuous_runs = [0, 1, 1, 2, 3, 3]
Note
This parameter can be used together with runs to filter the runs to combine. The run ID for each subject in the dictionary will be converted to “run-continuous” to denote that runs were combined.
If only a single run is available for a subject, the original run ID (as opposed to “run-continuous”) will be used.
shift_labels (bool, default=False) – If True, shifts each label up by one unit so that the minimum CAP label starts at “1” as opposed to “0” (scikit-learn’s default).
    predicted_labels = [0, 2, 5]
    # Apply a plus-one shift
    predicted_labels = [1, 3, 6]
See also
neurocaps.typing.SubjectTimeseries: Type definition for the subject timeseries dictionary structure. (See: SubjectTimeseries Documentation)
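The interaction between runs and continuous_runs described above can be illustrated with a small NumPy sketch. The helper `combine_runs` is hypothetical, and for simplicity it filters on full run keys such as "run-1" (an assumed key format); only the "run-continuous" key and the single-run behavior are taken from the parameter descriptions.

```python
import numpy as np


def combine_runs(run_labels, runs=None):
    """Hypothetical helper: filter runs, then stack them into one label vector."""
    if runs is not None:
        # Keep only the requested runs, preserving their original order
        run_labels = {k: v for k, v in run_labels.items() if k in runs}
    if len(run_labels) == 1:
        # A single available run keeps its original run ID
        return dict(run_labels)
    return {"run-continuous": np.hstack(list(run_labels.values()))}


labels = {
    "run-1": np.array([0, 1, 1]),
    "run-2": np.array([2, 3, 3]),
    "run-3": np.array([1, 1, 0]),
}

combined = combine_runs(labels, runs=["run-1", "run-2"])
single = combine_runs(labels, runs=["run-3"])

print(combined)  # one "run-continuous" entry holding the six stacked labels
print(single)    # the original "run-3" entry, unchanged
```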
- Returns:
dict[str, dict[str, np.ndarray]] – Dictionary mapping each subject to their run IDs and a 1D numpy array containing the predicted CAP for each frame (TR).
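As a rough illustration of how the returned structure might be consumed, the sketch below counts how many frames were assigned to each CAP per run. The dictionary is hand-built to mirror the documented return type; the subject ID, run IDs, and the helper `count_frames_per_cap` are made up for the example and are not part of the library.

```python
import numpy as np

# Hand-built stand-in for the returned dictionary (subject -> run -> 1D label array)
cap_labels = {
    "subject-01": {
        "run-1": np.array([0, 1, 1, 2]),
        "run-2": np.array([2, 2, 0, 1]),
    },
}


def count_frames_per_cap(cap_labels):
    """Hypothetical helper: map each subject and run to {CAP label: frame count}."""
    counts = {}
    for subject_id, runs in cap_labels.items():
        counts[subject_id] = {}
        for run_id, labels in runs.items():
            values, n_frames = np.unique(labels, return_counts=True)
            counts[subject_id][run_id] = dict(zip(values.tolist(), n_frames.tolist()))
    return counts


frame_counts = count_frames_per_cap(cap_labels)
print(frame_counts["subject-01"]["run-1"])  # {0: 1, 1: 2, 2: 1}
```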