neurocaps.analysis.CAP.calculate_metrics

CAP.calculate_metrics(subject_timeseries, tr=None, runs=None, continuous_runs=False, metrics=['temporal_fraction', 'persistence', 'counts', 'transition_frequency'], return_df=True, output_dir=None, prefix_filename=None, progress_bar=False)

Compute Participant-wise CAP Metrics.

Uses the k-means model (or the group-specific k-means models, if groups were specified during initialization of the CAP class) to assign each subject's TRs to a CAP, then creates a single pandas.DataFrame per CAP metric for all participants (with the exception of "transition_probability", which creates a single dataframe per group). The metrics follow Liu et al., 2018 and Yang et al., 2021 and include:

  • "temporal_fraction": The proportion of total volumes spent in a single CAP over all volumes in a run.

    predicted_subject_timeseries = [1, 2, 1, 1, 1, 3]
    target = 1
    temporal_fraction = 4 / 6
    
  • "persistence": The average time spent in a single CAP before transitioning to another CAP (average consecutive/uninterrupted time).

    predicted_subject_timeseries = [1, 2, 1, 1, 1, 3]
    target = 1
    
    # Sequences for 1 are [1] and [1, 1, 1]; There are 2 contiguous sequences
    persistence = (1 + 3) / 2
    
    # Turns average frames into average time = 4
    tr = 2
    if tr: persistence = ((1 + 3) / 2) * 2
    
  • "counts": The total number of initiations of a specific CAP across an entire run. An initiation is defined as the first occurrence of a CAP. If the same CAP is maintained in a contiguous segment (indicating stability), it is still counted as a single initiation.

    predicted_subject_timeseries = [1, 2, 1, 1, 1, 3]
    target = 1
    
    # Initiations of CAP-1 occur at indices 0 and 2
    counts = 2
    
  • "transition_frequency": The total number of transitions to different CAPs across the entire run.

    predicted_subject_timeseries = [1, 2, 1, 1, 1, 3]
    
    # Transitions between unique CAPs occur at indices 0 -> 1, 1 -> 2, and 4 -> 5
    transition_frequency = 3
    
  • "transition_probability": The probability of transitioning from one CAP to another CAP (or the same CAP). This is calculated as (Number of transitions from A to B)/ (Total transitions from A). Note that the transition probability from CAP-A -> CAP-B is not the same as CAP-B -> CAP-A.

        # Note last two numbers in the predicted timeseries are switched for this example
        predicted_subject_timeseries = [1, 2, 1, 1, 3, 1]
    
        # If three CAPs were identified in the analysis
        combinations = [(1, 1), (1, 2), (1, 3), (2, 1), (2, 2), (2, 3), (3, 1), (3, 2), (3, 3)]
    
        # Represents transition from CAP-1 -> CAP-2
        target = (1, 2)
    
        # There are 4 ones in the timeseries but only three transitions from 1; 1 -> 2, 1 -> 1, 1 -> 3
        n_transitions_from_1 = 3
    
        # There is only one 1 -> 2 transition
        transition_probability = 1 / 3
    
    **Note**: In the supplementary material for Yang et al., the mathematical relationship between
    temporal fraction, counts, and persistence is ``temporal fraction = (persistence * counts)/total volumes``.
    If persistence has been converted into time units (seconds), then
    ``temporal fraction = (persistence * counts) / (total volumes * TR)``.
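The worked examples above can be reproduced with a short standard-library sketch. This is not the library's implementation: the helper names `cap_metrics` and `transition_frequency` are hypothetical, and returning 0 for the persistence of a CAP that never occurs is an assumption made only for this illustration.

```python
from itertools import groupby


def cap_metrics(labels, target, tr=None):
    """Per-CAP metrics for one run, given the CAP label assigned to each TR."""
    # Length of every uninterrupted segment spent in `target`
    segments = [len(list(run)) for cap, run in groupby(labels) if cap == target]
    counts = len(segments)  # one initiation per contiguous segment
    temporal_fraction = sum(segments) / len(labels)
    # Assumption: persistence is 0 when the CAP never occurs
    persistence = sum(segments) / counts if counts else 0.0
    if tr:
        persistence *= tr  # convert average frames into average seconds
    return {
        "temporal_fraction": temporal_fraction,
        "persistence": persistence,
        "counts": counts,
    }


def transition_frequency(labels):
    """Number of times consecutive TRs are assigned to different CAPs."""
    return sum(a != b for a, b in zip(labels, labels[1:]))


predicted_subject_timeseries = [1, 2, 1, 1, 1, 3]
metrics = cap_metrics(predicted_subject_timeseries, target=1)
print(metrics["counts"])       # 2
print(metrics["persistence"])  # 2.0
print(transition_frequency(predicted_subject_timeseries))  # 3

# Relationship from the note: temporal fraction = (persistence * counts) / total volumes
total_volumes = len(predicted_subject_timeseries)
reconstructed = metrics["persistence"] * metrics["counts"] / total_volumes
print(abs(reconstructed - metrics["temporal_fraction"]) < 1e-12)  # True
```

Segmenting the labels with `itertools.groupby` yields both the number of initiations and the segment lengths in one pass, which is why counts, persistence, and temporal fraction all derive from the same list.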
    
Parameters:
  • subject_timeseries (dict[str, dict[str, np.ndarray]] or os.PathLike) --

    A dictionary mapping subject IDs to their run IDs and their associated timeseries (TRs x ROIs) as a NumPy array. Can also be a path to a pickle file containing this same structure. The expected structure is as follows:

    subject_timeseries = {
        "101": {
            "run-0": np.array([...]),  # Shape: TRs x ROIs
            "run-1": np.array([...]),  # Shape: TRs x ROIs
            "run-2": np.array([...]),  # Shape: TRs x ROIs
        },
        "102": {
            "run-0": np.array([...]),  # Shape: TRs x ROIs
            "run-1": np.array([...]),  # Shape: TRs x ROIs
        },
    }
    

  • tr (float or None, default=None) -- The repetition time (TR) in seconds. If provided, persistence will be calculated as the average uninterrupted time, in seconds, spent in each CAP. If not provided, persistence will be calculated as the average uninterrupted volumes (TRs), in TR units, spent in each state.

  • runs (int, str, list[int], list[str], or None, default=None) -- The run numbers to calculate CAP metrics for (e.g. runs=[0, 1] or runs=["01", "02"]). If None, CAP metrics will be calculated for each run.

  • continuous_runs (bool, default=False) --

    If True, all runs will be treated as a single, uninterrupted run.

    # CAP assignment of frames for run_1 and run_2
    run_1 = [0, 1, 1]
    run_2 = [2, 3, 3]
    
    # Computation of each CAP metric will be conducted on the combined vector
    continuous_runs = [0, 1, 1, 2, 3, 3]
    

  • metrics ({"temporal_fraction", "persistence", "counts", "transition_frequency", "transition_probability"} or list["temporal_fraction", "persistence", "counts", "transition_frequency", "transition_probability"], default=["temporal_fraction", "persistence", "counts", "transition_frequency"]) -- The metrics to calculate. Available options include "temporal_fraction", "persistence", "counts", "transition_frequency", and "transition_probability".

  • return_df (bool, default=True) -- If True, returns a dictionary mapping each metric name to its pandas.DataFrame.

  • output_dir (os.PathLike or None, default=None) -- Directory to save pandas.DataFrame as csv files. The directory will be created if it does not exist. Dataframes will not be saved if None.

  • prefix_filename (str or None, default=None) -- A prefix to prepend to the saved file names for each pandas.DataFrame, if output_dir is provided.

  • progress_bar (bool, default=False) --

    If True, displays a progress bar.

    Added in version 0.21.5.
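To illustrate what continuous_runs changes, the sketch below compares persistence (in TR units) computed per run with persistence computed on the concatenated label vector. The run labels, the dictionary order used for concatenation, and the helper `mean_segment_length` are hypothetical and chosen only for this illustration; they do not reflect the library's internals.

```python
from itertools import groupby


def mean_segment_length(labels, target):
    """Average length of uninterrupted segments of `target` (persistence in TRs)."""
    segments = [len(list(run)) for cap, run in groupby(labels) if cap == target]
    return sum(segments) / len(segments) if segments else 0.0


# Hypothetical CAP assignments where CAP 1 spans the boundary between runs
runs = {"run-1": [0, 1, 1], "run-2": [1, 1, 2]}

# continuous_runs=False: each run is evaluated on its own
per_run = {run: mean_segment_length(labels, 1) for run, labels in runs.items()}
print(per_run)  # {'run-1': 2.0, 'run-2': 2.0}

# continuous_runs=True: metrics are computed on the combined vector
combined = [cap for labels in runs.values() for cap in labels]
print(mean_segment_length(combined, 1))  # 4.0
```

Because the two segments of CAP 1 merge across the run boundary, the combined vector reports one segment of length 4 instead of two segments of length 2, so counts and persistence can both differ between the two settings.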

Returns:

dict[str, pd.DataFrame] or dict[str, dict[str, pd.DataFrame]] -- Dictionary containing one pandas.DataFrame for each requested metric. In the case of "transition_probability", each group has a separate dataframe, which is returned in the form of dict[str, dict[str, pd.DataFrame]].

Note

Scaling: If standardizing was requested in self.get_caps(), then the columns/ROIs of the subject_timeseries provided to this method will be scaled using the group-specific mean and sample standard deviation derived from the concatenated data.
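As a rough illustration of this scaling step, the sketch below standardizes a new run column by column, using the mean and sample standard deviation (n - 1 denominator) of previously concatenated data. The data values and the helpers `column_stats` and `scale` are hypothetical, and the sketch does not handle columns with zero variance; how the library treats that case is not stated here.

```python
from statistics import mean, stdev


def column_stats(rows):
    """Column-wise mean and sample standard deviation of a TRs x ROIs list."""
    columns = list(zip(*rows))
    return [mean(col) for col in columns], [stdev(col) for col in columns]


def scale(rows, means, stds):
    """Standardize each column with statistics estimated elsewhere."""
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]


# Hypothetical concatenated data for one group (3 TRs x 2 ROIs)
concatenated = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
means, stds = column_stats(concatenated)

# A new run is scaled with the group's statistics, not its own
new_run = [[2.0, 40.0]]
scaled = scale(new_run, means, stds)
print(scaled)
```

The point of the design is that the timeseries passed to this method are placed in the same feature space the k-means model was fitted in before TRs are assigned to CAPs.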

Group-Specific CAPs: When the groups parameter is used during initialization of the CAP class, self.get_caps() computes a separate k-means model for each group. This means that each group has its own specific k-means model that is used for CAP metric calculations. The inclusion of all groups within the same dataframe (for "temporal_fraction", "persistence", "counts", and "transition_frequency") is primarily to reduce the number of dataframes generated. Hence, each CAP (e.g., "CAP-1") is specific to its respective group. For instance, "CAP-1" under Group A is distinct from "CAP-1" under Group B.

As an example, if there are two groups, Group A and Group B, each with their own CAPs:

  • A has 2 CAPs: "CAP-1" and "CAP-2"

  • B has 3 CAPs: "CAP-1", "CAP-2", and "CAP-3"

The resulting "temporal_fraction" dataframe is structured as follows. "persistence" and "counts" have a similar structure, whereas "transition_frequency" contains only the "Subject_ID", "Group", and "Run" columns plus a "Transition_Frequency" column:

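With Groups

==========  =====  =====  =====  =====  =====
Subject_ID  Group  Run    CAP-1  CAP-2  CAP-3
==========  =====  =====  =====  =====  =====
101         A      run-1  0.40   0.60   NaN
102         B      run-1  0.30   0.50   0.20
...         ...    ...    ...    ...    ...
==========  =====  =====  =====  =====  =====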

The "NaN" indicates that "CAP-3" is not applicable for Group A. "NaN" will only be observed when two or more groups are specified and have different numbers of CAPs. As mentioned previously, "CAP-1", "CAP-2", and "CAP-3" for Group A are distinct from those for Group B because separate k-means models are used.
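The origin of the "NaN" entries can be sketched without any dependencies: when rows from groups with different numbers of CAPs share one set of columns, CAPs that do not exist for a group have no value to report. The records below reuse the illustrative values from the table; the variable names are hypothetical and the sketch is not how the library builds its dataframes.

```python
import math

# CAPs identified by each group's own k-means model
caps_per_group = {"A": ["CAP-1", "CAP-2"], "B": ["CAP-1", "CAP-2", "CAP-3"]}

# Union of CAP columns across groups (lexicographic sort is fine for fewer than 10 CAPs)
all_caps = sorted({cap for caps in caps_per_group.values() for cap in caps})

records = [
    {"Subject_ID": "101", "Group": "A", "Run": "run-1", "CAP-1": 0.40, "CAP-2": 0.60},
    {"Subject_ID": "102", "Group": "B", "Run": "run-1", "CAP-1": 0.30, "CAP-2": 0.50, "CAP-3": 0.20},
]

# Fill CAP columns that do not exist for a record's group with NaN
rows = [{**rec, **{cap: rec.get(cap, math.nan) for cap in all_caps}} for rec in records]

for row in rows:
    print(row)
```

Since the CAP columns are not comparable across groups, filtering rows by the "Group" column before any statistical comparison avoids mixing unrelated CAPs.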

If no groups were specified during initialization of the CAP class, the resulting "temporal_fraction" dataframe has the following structure (assuming four CAPs were identified by the k-means model fitted on all participants):

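Without Groups

==========  ============  =====  =====  =====  =====  =====
Subject_ID  Group         Run    CAP-1  CAP-2  CAP-3  CAP-4
==========  ============  =====  =====  =====  =====  =====
101         All Subjects  run-1  0.20   0      0      0.80
102         All Subjects  run-1  0.50   0.25   0.25   0
...         ...           ...    ...    ...    ...    ...
==========  ============  =====  =====  =====  =====  =====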

Transition Probability: For "transition_probability", each group has a separate dataframe containing the CAP transitions for that group.

  • Group A Transition Probability: Stored in df_dict["transition_probability"]["A"]

  • Group B Transition Probability: Stored in df_dict["transition_probability"]["B"]

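The resulting "transition_probability" dataframe for Group A:

==========  =====  =====  ====  ====  ===  ===  ===
Subject_ID  Group  Run    1.1   1.2   1.3  2.1  ...
==========  =====  =====  ====  ====  ===  ===  ===
101         A      run-1  0.40  0.60  0    0.2  ...
...         ...    ...    ...   ...   ...  ...  ...
==========  =====  =====  ====  ====  ===  ===  ===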

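The resulting "transition_probability" dataframe for Group B:

==========  =====  =====  ====  ====  ====  ===
Subject_ID  Group  Run    1.1   1.2   2.1   ...
==========  =====  =====  ====  ====  ====  ===
102         B      run-1  0.70  0.30  0.10  ...
...         ...    ...    ...   ...   ...   ...
==========  =====  =====  ====  ====  ====  ===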

Here the columns indicate {from}.{to}. For instance, column 1.2 indicates the probability of transitioning from CAP-1 to CAP-2.
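A minimal sketch of how one row of {from}.{to} columns could be derived from a run's CAP labels, following the definition (Number of transitions from A to B) / (Total transitions from A). The function name `transition_probabilities` is hypothetical, and assigning 0 when a CAP has no outgoing transitions is an assumption of this sketch rather than documented library behavior.

```python
from collections import Counter
from itertools import product


def transition_probabilities(labels, n_caps):
    """Map '{from}.{to}' to the probability of moving from one CAP to another."""
    pair_counts = Counter(zip(labels, labels[1:]))  # (from, to) occurrences
    from_counts = Counter(labels[:-1])              # the last TR has no successor
    probabilities = {}
    for src, dst in product(range(1, n_caps + 1), repeat=2):
        total = from_counts[src]
        # Assumption: probability is 0 when there are no transitions from `src`
        probabilities[f"{src}.{dst}"] = pair_counts[(src, dst)] / total if total else 0.0
    return probabilities


# Same sequence as the "transition_probability" example (three CAPs identified)
predicted_subject_timeseries = [1, 2, 1, 1, 3, 1]
probs = transition_probabilities(predicted_subject_timeseries, n_caps=3)

print(list(probs)[:4])  # ['1.1', '1.2', '1.3', '2.1']
print(probs["2.1"])     # 1.0
```

Each set of columns sharing the same "from" CAP sums to 1 whenever that CAP has at least one outgoing transition, which is a convenient sanity check on the output.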

If no groups are specified, then the dataframe is stored in df_dict["transition_probability"]["All Subjects"].

For the "Group" column, whitespace in group names is no longer replaced with underscores in versions >=0.17.11.

References

Liu, X., Zhang, N., Chang, C., & Duyn, J. H. (2018). Co-activation patterns in resting-state fMRI signals. NeuroImage, 180, 485–494. https://doi.org/10.1016/j.neuroimage.2018.01.041

Yang, H., Zhang, H., Di, X., Wang, S., Meng, C., Tian, L., & Biswal, B. (2021). Reproducible coactivation patterns of functional brain networks reveal the aberrant dynamic state transition in schizophrenia. NeuroImage, 237, 118193. https://doi.org/10.1016/j.neuroimage.2021.118193