neurocaps.analysis.CAP#

class CAP(parcel_approach=None, groups=None)[source]#

Co-Activation Patterns (CAPs).

Performs k-means clustering for CAP identification, computes various temporal dynamics metrics (including counts, temporal fraction, persistence, transition frequency, and transition probability), provides multiple visualizations (such as heatmaps, outer products, correlation matrices, and cosine similarity radar plots that shows the network correspondence to both positive and negative CAP activations), and enables conversion of CAPs to NIfTI statistical maps.

Parameters:
  • parcel_approach (dict, or os.PathLike, default=None) --

    Specifies the parcellation approach for segmenting NifTI images into distinct regions-of-interests (ROIs). Supported parcellation aproaches includes "Schaefer", "AAL", and "Custom" (user-defined).

    The parcel_approach requires a nested dictionary with:

    • First Level Key: The parcellation name (e.g., "Schaefer", "AAL", "Custom")

    • Second Level Keys (sub-keys):

      • "maps": Mapping of regions to coordinates.

      • "nodes": Node definitions for each region.

      • "regions": List of anatomical region names (see "Custom Parcellations" in Notes section for detailed structure requirements).

    There are 3 valid input options for this parameter:

    • Use the initialization parameters from TimeseriesExtractor (e.g., n_rois, yeo_networks, resolution_mm for "Schaefer", or version for "AAL") to generate the above structure.

    • Provide a nested dictionary with the required structure (first level key of the parcellation name with sub-keys - "maps", "nodes", and "regions").

    • Load configuration from a pickle file containing the valid nested dictionary.

    Note: This parameter is required for several visualization functions in this class but can be set later using the parcel_approach property.

  • groups (dict[str, list[str]] or None, default=None) --

    A mapping of group names to unique subject IDs. If specified, then separate analyses are performed on groups (i.e., group-specific k-means models, group-specific visualization, etc). If None, then the analyses will be performed on all subjects and the default group name is "All Subjects". The structure should be as follows:

    {
        "GroupName1": ["1", "2", "3"],
        "GroupName2": ["4", "5", "6"],
    }
    

Properties#

parcel_approach: dict

A dictionary containing information about the parcellation. Can also be used as a setter, which accepts a dictionary or a dictionary saved as a pickle file. The structure is as follows:

# Structure of Schaefer
{
    "Schaefer":
    {
        "maps": "path/to/parcellation.nii.gz",
        "nodes": ["LH_Vis1", "LH_SomSot1", "RH_Vis1", "RH_SomSot1"],
        "regions": ["Vis", "SomSot"]
    }
}

# Structure of AAL
{
    "AAL":
    {
        "maps": "path/to/parcellation.nii.gz",
        "nodes": ["Precentral_L", "Precentral_R", "Frontal_Sup_L", "Frontal_Sup_R"],
        "regions": ["Precentral", "Frontal"]
    }
}

Refer to the example for "Custom" in the Note section below for the expected structure.

groups: dict[str, list[str]] or None:

A mapping of groups names to unique subject IDs.

subject_table: dict[str, str] or None

A dictionary generated when self.get_caps() is used. Operates as a lookup table that pairs each subject ID with the associated group. Also can be used as a setter. The structure is as follows.

{
    "Subject-ID": "GroupName",
    "Subject-ID": "GroupName",
}
n_clusters: int, list[int], or None

An integer or list of integers (if cluster_selection_method is not None) that was used for sklearn.cluster.KMeans in self.get_caps().

cluster_selection_method: str or None:

The cluster selection method used in self.get_caps() to identify the optimal number of clusters.

n_cores: int or None

Number of cores specified in self.get_caps() to use for multiprocessing with Joblib.

runs: int, list[Union[int, str]], or None

The run IDs specified in self.get_caps().

standardize: bool or None

A boolean denoting whether the features of the concatenated timeseries data are standardized if standardization was requested in self.get_caps().

means: dict[str, np.array] or None

A dictionary mapping groups to their associated NumPy array containing the means of each feature (ROI) if standardize is True in self.get_caps(). The structure is as follows:

{
    "GroupName": np.array([...]), # Shape: 1 x ROIs
}
stdev: dict[str, np.array] or None

A dictionary mapping groups to their associated NumPy array containing the sample standard deviation of each feature (ROI) if standardize is True in self.get_caps(). The structure is as follows:

{
    "GroupName": np.array([...]), # Shape: 1 x ROIs
}
concatenated_timeseries: dict[str, np.array] or None

A dictionary mapping each group to their associated concatenated NumPy array [(participants x TRs) x ROIs] when self.get_caps() is used. Note, if there are memory issues, del self.concatenated_timeseries can be used to delete property and have it only return None. The structure is as follows:

{
    "GroupName": np.array([...]), # Shape: (participants x TRs) x ROIs
}
kmeans: dict[str, sklearn.cluster.KMeans] or None

A dictionary mapping group-specific sklearn.cluster.KMeans model to each group when self.get_caps() is used. If cluster_selection_method is used, the model stored in this property is the optimal k-means model. The structure is as follows:

{
    "GroupName": sklearn.cluster.KMeans,
}
caps: dict[str, dict[str, np.array]] or None

A dictionary mapping the cluster centroids, extracted from the k-means model, to each group after self.get_caps is used. The structure is as follows:

{
    "GroupName": {
        "CAP-1": np.array([...]), # Shape: 1 x ROIs
        "CAP-2": np.array([...]), # Shape: 1 x ROIs
    }

}
cluster_scores: dict[str, Union[str, dict[str, float]]] or None

A dictionary mapping groups to the assessed cluster sizes and corresponding score of a specific cluster_selection_method available in self.get_caps(). The structure is as follows:

{
    "Cluster_Selection_Method": str,  # e.g., "elbow", "davies_bouldin", "silhouette", or "variance_ratio"
    "Scores": {
        "GroupName": {
            2: float,  # Score for 2 clusters
            3: float,  # Score for 3 clusters
            4: float,  # Score for 4 clusters
        },
    }
}
optimal_n_clusters: dict[str, int] or None

A dictionary mapping groups to their optimal cluster sizes if cluster_selection_method is not None in self.get_caps(). The structure is as follows:

{
    "GroupName": int,
}
variance_explained: dict[str, float] or None

A dictionary mapping groups to a float representing the total variance explained by their respective model stored in self.kmeans. The structure is as follows:

{
    "GroupName": float,
}
region_caps: dict[str, dict[str, np.array]] or None

A dictionary mapping group to their CAPs and corresponding NumPy array (1 x regions) containing the averaged value of each region or network if visual_scope set to "regions" in self.caps2plot(). The position of elements corresponds to "regions" in parcel_approach. The structure is as follows:

{
    "GroupName": {
        "CAP-1": np.array([...]), # Shape: 1 x regions
        "CAP-2": np.array([...]), # Shape: 1 x regions
    }

}
outer_products: dict[str, dict[str, np.array]] or None

A dictionary mapping group to their CAPs and corresponding NumPy array (ROIs x ROIs) containing the outer product if plot_options set to "outer_product" self.caps2plot(). The structure is as follows:

{
    "GroupName": {
        "CAP-1": np.array([...]), # Shape: ROIs x ROIs
        "CAP-2": np.array([...]), # Shape: ROIs x ROIs
    }

}
cosine_similarity: :obj: dict or None

A dictionary mapping each group to their CAPs and their associated "High Amplitude" and "Low Amplitude" list containing the cosine similarities. Each group contains a "Regions" key, consisting of a list of regions or networks. The position of the cosine similarities in the "High Amplitude" and "Low Amplitude" lists, corresponds to the region in the "Regions" list (Cosine similarity value in 0th index belongs to region in 0th index while the value in the 10th index belongs to the region in the 10th index).

{
    "GroupName": {
        "Regions": [...],
        "CAP-1": {
            "High Amplitude": [...], # Shape: 1 x Regions
            "Low Amplitude": [...]   # Shape: 1 x Regions
        }
        "CAP-2": {
            "High Amplitude": [...], # Shape: 1 x Regions
            "Low Amplitude": [...]   # Shape: 1 x Regions
        }
    }
}

Note

Default Group Name: If no groups were specified, the default group name will always be "All Subjects" to denote that the data of all subjects in the analysis were used to derive the CAPs. This is done to retain the same nesting structure for each property regardless if groups are specified.

Custom Parcellations: If using a "Custom" parcellation approach, ensure that the parcellation is lateralized (where each region/network has nodes in the left and right hemisphere). This is due to certain visualization functions assuming that each region consists of left and right hemisphere nodes. Additionally, certain visualization functions in this class also assume that the background label is 0. Therefore, do not add a background label in the "nodes" or "regions" keys.

The recognized sub-keys for the "Custom" parcellation approach includes:

  • "maps": Directory path containing the parcellation in a supported format (e.g., .nii or .nii.gz for NifTI).

  • "nodes": A list or numpy array of all node labels arranged in ascending order based on their numerical IDs from the parcellation. The 0th index should contain the label corresponding to the lowest, non-background numerical ID.

  • "regions": A dictionary defining major brain regions or networks, with each region containing "lh" (left hemisphere) and "rh" (right hemisphere) sub-keys listing node indices.

Refer to the neurocaps Parcellation Documentation for more detailed explanations and example structures for the "nodes" and "regions" sub-keys.

Note: Different sub-keys are required depending on the function used. Refer to the Note section under each function for information regarding the sub-keys required for that specific function.

Methods

calculate_metrics(subject_timeseries[, tr, ...])

Compute Participant-wise CAP Metrics.

caps2corr([output_dir, suffix_title, ...])

Generate Pearson Correlation Matrix for CAPs.

caps2niftis(output_dir[, suffix_filename, ...])

Standalone Method to Convert CAPs to NifTI Statistical Maps.

caps2plot([output_dir, suffix_title, ...])

Generate Heatmaps and Outer Product Plots for CAPs.

caps2radar([output_dir, suffix_title, ...])

Generate Radar Plots for CAPs using Cosine Similarity.

caps2surf([output_dir, suffix_title, ...])

Project CAPs onto Surface Plots.

get_caps(subject_timeseries[, runs, ...])

Perform K-Means Clustering to Identify CAPs.

get_caps(subject_timeseries, runs=None, n_clusters=5, cluster_selection_method=None, random_state=None, init='k-means++', n_init='auto', max_iter=300, tol=0.0001, algorithm='lloyd', standardize=True, n_cores=None, show_figs=False, output_dir=None, progress_bar=False, **kwargs)[source]#

Perform K-Means Clustering to Identify CAPs.

Concatenates the timeseries of each subject into a single NumPy array with dimensions (participants x TRs) x ROI and uses sklearn.cluster.KMeans on the concatenated data. Note, KMeans uses Euclidean distance. Additionally, the Elbow method is determined using KneeLocator from the kneed package and the Davies Bouldin, Silhouette, and Variance Ratio methods are calculated using scikit-learn's davies_bouldin_score, silhouette_score, and calinski_harabasz_score functions, respectively. Note, if groups were given when the CAP class was initialized, separate KMeans models and plots will be generated for all groups.

Parameters:
  • subject_timeseries (dict[str, dict[str, np.ndarray]] or os.PathLike) --

    A dictionary mapping subject IDs to their run IDs and their associated timeseries (TRs x ROIs) as a NumPy array. Can also be a path to a pickle file containing this same structure. The expected structure of is as follows:

    subject_timeseries = {
            "101": {
                "run-0": np.array([...]), # Shape: TRs x ROIs
                "run-1": np.array([...]), # Shape: TRs x ROIs
                "run-2": np.array([...]), # Shape: TRs x ROIs
            },
            "102": {
                "run-0": np.array([...]), # Shape: TRs x ROIs
                "run-1": np.array([...]), # Shape: TRs x ROIs
            }
        }
    

  • runs (int, str, list[int], list[str], or None, default=None) -- The run numbers to perform the CAPs analysis with (e.g. runs=[0, 1] or runs=["01", "02"]). If None, all runs in the subject timeseries will be concatenated into a single dataframe and subjected to k-means clustering.

  • n_clusters (Union[int, list[int]], default=5) -- The number of clusters to use for sklearn.cluster.KMeans. Can be a single integer or a list of integers (if cluster_selection_method is not None).

  • cluster_selection_method ({"elbow", "davies_bouldin", "silhouette", "variance_ratio"} or None, default=None) -- Method to find the optimal number of clusters. Options are "elbow", "davies_bouldin", "silhouette", and "variance_ratio".

  • random_state (int or None, default=None) -- The random state to use for sklearn.cluster.KMeans. Ensures reproducible results.

  • init ({"k-means++", "random"}, Callable, or ArrayLike, default="k-means++") -- Method for choosing initial cluster centroid for sklearn.cluster.KMeans. Options are "k-means++", "random", or callable or array-like of shape (n_clusters, n_features).

  • n_init ({"auto"} or int, default="auto") -- Number of times sklearn.cluster.KMeans is ran with different initial clusters. The model with lowest inertia from these runs will be selected.

  • max_iter (int, default=300) -- Maximum number of iterations for a single run of sklearn.cluster.KMeans.

  • tol (float, default=1e-4,) -- Stopping criterion for sklearn.cluster.KMeans``if the change in inertia is below this value, assuming ``max_iter has not been reached.

  • algorithm ({"lloyd", "elkan"}, default="lloyd") -- The type of algorithm to use for sklearn.cluster.KMeans. Options are "lloyd" and "elkan".

  • standardize (bool, default=True) -- Standardizes the columns (ROIs) of the concatenated timeseries data. Uses sample standard deviation with Bessel's correction (n-1 in denominator).

  • n_cores (int or None, default=None) -- The number of cores to use for multiprocessing, with Joblib, to run multiple sklearn.cluster.KMeans models if cluster_selection_method is not None. The "loky" backend is used.

  • show_figs (bool, default=False) -- Displays the plots for the specified cluster_selection_method for all groups if cluster_selection_method is not None.

  • output_dir (os.PathLike or None, default=None) -- Directory to save plots as png files if cluster_selection_method is not None. The directory will be created if it does not exist. If None, plots will not be saved.

  • progress_bar (bool, default=False) --

    If True and cluster_selection_method is not None, displays a progress bar.

    Added in version 0.21.5.

  • **kwargs --

    Dictionary to adjust certain parameters when cluster_selection_method is not None. Additional parameters include:

    • S: int, default=1 -- Adjusts the sensitivity of finding the elbow. Larger values are more conservative and less sensitive to small fluctuations. Passed to KneeLocator from the kneed package.

    • dpi: int, default=300 -- Dots per inch for the figure.

    • figsize: tuple, default=(8, 6) -- Adjusts the size of the plots.

    • bbox_inches: str or None, default="tight" -- Alters size of the whitespace in the saved image.

    • step: int, default=None -- An integer value that controls the progression of the x-axis in plots for the specified cluster_selection_method. When set, only integer values will be displayed on the x-axis.

Returns:

self

Note

KMeans Algorithm: Refer to scikit-learn's Documentation for additional information about the KMeans algorithm used in this method.

calculate_metrics(subject_timeseries, tr=None, runs=None, continuous_runs=False, metrics=['temporal_fraction', 'persistence', 'counts', 'transition_frequency'], return_df=True, output_dir=None, prefix_filename=None, progress_bar=False)[source]#

Compute Participant-wise CAP Metrics.

Uses the k-means model (or group-specific k-means models if groups specified during initialization of the CAP class) to assign each subject's TRs to a CAP. Also, creates a single pandas.DataFrame per CAP metric for all participants (with the exception of "transition_probability" which creates a single dataframe per group). As described by Liu et al., 2018 and Yang et al., 2021. The metrics include:

  • "temporal_fraction": The proportion of total volumes spent in a single CAP over all volumes in a run.

    predicted_subject_timeseries = [1, 2, 1, 1, 1, 3]
    target = 1
    temporal_fraction = 4 / 6
    
  • "persistence": The average time spent in a single CAP before transitioning to another CAP (average consecutive/uninterrupted time).

    predicted_subject_timeseries = [1, 2, 1, 1, 1, 3]
    target = 1
    
    # Sequences for 1 are [1] and [1, 1, 1]; There are 2 contiguous sequences
    persistence = (1 + 3) / 2
    
    # Turns average frames into average time = 4
    tr = 2
    if tr: persistence = ((1 + 3) / 2) * 2
    
  • "counts": The total number of initiations of a specific CAP across an entire run. An initiation is defined as the first occurrence of a CAP. If the same CAP is maintained in contiguous segment (indicating stability), it is still counted as a single initiation.

    predicted_subject_timeseries = [1, 2, 1, 1, 1, 3]
    target = 1
    
    # Initiations of CAP-1 occur at indices 0 and 2
    counts = 2
    
  • "transition_frequency": The total number of transitions to different CAPs across the entire run.

    predicted_subject_timeseries = [1, 2, 1, 1, 1, 3]
    
    # Transitions between unique CAPs occur at indices 0 -> 1, 1 -> 2, and 4 -> 5
    transition_frequency = 3
    
  • "transition_probability": The probability of transitioning from one CAP to another CAP (or the same CAP). This is calculated as (Number of transitions from A to B)/ (Total transitions from A). Note that the transition probability from CAP-A -> CAP-B is not the same as CAP-B -> CAP-A.

        # Note last two numbers in the predicted timeseries are switched for this example
        predicted_subject_timeseries = [1, 2, 1, 1, 3, 1]
    
        # If three CAPs were identified in the analysis
        combinations = [(1, 1), (1, 2), (1, 3), (2, 1), (2, 2), (2, 3), (3, 1), (3, 2), (3, 3)]
    
        # Represents transition from CAP-1 -> CAP-2
        target = (1, 2)
    
        # There are 4 ones in the timeseries but only three transitions from 1; 1 -> 2, 1 -> 1, 1 -> 3
        n_transitions_from_1 = 3
    
        # There is only one 1 -> 2 transition
        transition_probability = 1 / 3
    
    **Note**: In the supplementary material for Yang et al., the mathematical relationship between
    temporal fraction, counts, and persistence is ``temporal fraction = (persistence * counts)/total volumes``.
    If persistence has been converted into time units (seconds), then
    ``temporal fraction = (persistence * counts) / (total volumes * TR)``.
    
Parameters:
  • subject_timeseries (dict[str, dict[str, np.ndarray]] or os.PathLike) --

    A dictionary mapping subject IDs to their run IDs and their associated timeseries (TRs x ROIs) as a NumPy array. Can also be a path to a pickle file containing this same structure. The expected structure of is as follows:

    subject_timeseries = {
            "101": {
                "run-0": np.array([...]), # Shape: TRs x ROIs
                "run-1": np.array([...]), # Shape: TRs x ROIs
                "run-2": np.array([...]), # Shape: TRs x ROIs
            },
            "102": {
                "run-0": np.array([...]), # Shape: TRs x ROIs
                "run-1": np.array([...]), # Shape: TRs x ROIs
            }
        }
    

  • tr (float or None, default=None) -- The repetition time (TR) in seconds. If provided, persistence will be calculated as the average uninterrupted time, in seconds, spent in each CAP. If not provided, persistence will be calculated as the average uninterrupted volumes (TRs), in TR units, spent in each state.

  • runs (int, str, list[int], list[str], or None, default=None) -- The run numbers to calculate CAP metrics for (e.g. runs=[0, 1] or runs=["01", "02"]). If None, CAP metrics will be calculated for each run.

  • continuous_runs (bool, default=False) --

    If True, all runs will be treated as a single, uninterrupted run.

    # CAP assignment of frames from for run_1 and run_2
    run_1 = [0, 1, 1]
    run_2 = [2, 3, 3]
    
    # Computation of each CAP metric will be conducted on the combined vector
    continuous_runs = [0, 1, 1, 2, 3, 3]
    

  • metrics ({"temporal_fraction", "persistence", "counts", "transition_frequency", "transition_probability"} or list["temporal_fraction", "persistence", "counts", "transition_frequency", "transition_probability"], default=["temporal_fraction", "persistence", "counts", "transition_frequency"]) -- The metrics to calculate. Available options include "temporal_fraction", "persistence", "counts", "transition_frequency", and "transition_probability".

  • return_df (str, default=True) -- If True, returns pandas.DataFrame inside a dictionary`, mapping each dataframe to their metric.

  • output_dir (os.PathLike or None, default=None) -- Directory to save pandas.DataFrame as csv files. The directory will be created if it does not exist. Dataframes will not be saved if None.

  • prefix_filename (str or None, default=None) -- A prefix to append to the saved file names for each pandas.DataFrame, if output_dir is provided.

  • progress_bar (bool, default=False) --

    If True, displays a progress bar.

    Added in version 0.21.5.

Returns:

dict[str, pd.DataFrame] or dict[str, dict[str, pd.DataFrame]] -- Dictionary containing pandas.DataFrame - one for each requested metric. In the case of "transition_probability", each group has a separate dataframe which is returned in the from of dict[str, dict[str, pd.DataFrame]].

Note

Scaling: If standardizing was requested in self.get_caps(), then the columns/ROIs of the subject_timeseries provided to this method will be scaled using the group-specific mean and sample standard deviation derived from the concatenated data.

Group-Specific CAPs: When the groups parameter is used during initialization of the CAP class, self.get_caps() computes separate k-means model for each group. This means that each group has its own specific k-means model that is used for CAP metric calculations. The inclusion of all groups within the same dataframe (for "temporal_fraction", "persistence", "counts", and "transition_frequency") is primarily to reduce the number of dataframes generated. Hence, each CAP (e.g., "CAP-1") is specific to its respective groups. For instance, "CAP-1" under Group A is distinct from "CAP-1" under Group B.

For instance, if their are two groups, Group A and Group B, each with their own CAPs:

  • A has 2 CAPs: "CAP-1" and "CAP-2"

  • B has 3 CAPs: "CAP-1", "CAP-2", and "CAP-3"

The resulting "temporal_fraction" dataframe ("persistence" and "counts" have a similar structure but "transition frequency" will only contain the "Subject_ID", "Group", and "Run" columns in addition to a "Transition_Frequency" column):

With Groups

Subject_ID

Group

Run

CAP-1

CAP-2

CAP-3

101

A

run-1

0.40

0.60

NaN

102

B

run-1

0.30

0.50

0.20

...

...

...

...

...

...

The "NaN" indicates that "CAP-3" is not applicable for Group A. Additionally, "NaN" will only be observed in instances when two or more groups are specified and have different number of CAPs. As mentioned previously, "CAP-1", "CAP-2", and "CAP-3" for Group A is distinct from Group B due to using separate k-means models.

If no groups were specified during initialization of the CAP class, the resulting "temporal_fraction" dataframe (assuming four CAPs were identified in the k-means model using all participants):

Without Groups

Subject_ID

Group

Run

CAP-1

CAP-2

CAP-3

CAP-4

101

All Subjects

run-1

0.20

0

0

0.80

102

All Subjects

run-1

0.50

0.25

0.25

0

...

...

...

...

...

...

...

Transition Probability: For "transition_probability", each group has a separate dataframe to containing the CAP transitions for each group.

  • Group A Transition Probability: Stored in df_dict["transition_probability"]["A"]

  • Group B Transition Probability: Stored in df_dict["transition_probability"]["B"]

The resulting `"transition_probability"` for Group A:

Subject_ID

Group

Run

1.1

1.2

1.3

2.1

...

101

A

run-1

0.40

0.60

0

0.2

...

...

...

...

...

...

...

...

...

The resulting `"transition_probability"` for Group B:

Subject_ID

Group

Run

1.1

1.2

2.1

...

102

B

run-1

0.70

0.30

0.10

...

...

...

...

...

...

...

...

Here the columns indicate {from}.{to}. For instance, column 1.2 indicates the probability of transitioning from CAP-1 to CAP-2.

If no groups are specified, then the dataframe is stored in df_dict["transition_probability"]["All Subjects"].

For the "Group" column, whitespace in group names no longer replaced with underscores in versions >=0.17.11.

References

Liu, X., Zhang, N., Chang, C., & Duyn, J. H. (2018). Co-activation patterns in resting-state fMRI signals. NeuroImage, 180, 485–494. https://doi.org/10.1016/j.neuroimage.2018.01.041

Yang, H., Zhang, H., Di, X., Wang, S., Meng, C., Tian, L., & Biswal, B. (2021). Reproducible coactivation patterns of functional brain networks reveal the aberrant dynamic state transition in schizophrenia. NeuroImage, 237, 118193. https://doi.org/10.1016/j.neuroimage.2021.118193

caps2plot(output_dir=None, suffix_title=None, suffix_filename=None, plot_options='outer_product', visual_scope='regions', show_figs=True, subplots=False, **kwargs)[source]#

Generate Heatmaps and Outer Product Plots for CAPs.

Plot CAPs as heatmaps or outer products at the node or region/network levels. This function produces a seaborn.heatmap for each CAP. Note, if groups were given when the CAP class was initialized, separate plots will be generated for all groups.

Parameters:
  • output_dir (os.PathLike or None, default=None) -- Directory for saving plots as png files. The directory will be created if it does not exist. If None, plots will not be saved.

  • suffix_title (str or None, default=None) -- Appended to the title of each plot.

  • suffix_filename (str or None, default=None) -- Appended to the filename of each saved plot if output_dir is provided.

  • plot_options ({"outer_product", "heatmap"} or list["outer_product", "heatmap"], default="outer_product") -- Type of plots to create. Options are "outer_product" or "heatmap".

  • visual_scope ({"regions", "nodes"} or list["regions", "nodes"], default="regions") -- Determines whether plotting is done at the region level or node level. For "regions", the values of all nodes in the same regions (including both hemispheres) are averaged together then plotted. For "nodes", plots individual node values separately.

  • show_figs (bool, default=True) -- Display figures.

  • subplots (bool, default=True) -- Produce subplots for outer product plots, combining all plots into a single figure.

  • **kwargs --

    Keyword arguments used when saving figures. Valid keywords include:

    • dpi: int, default=300 -- Dots per inch for the figure.

    • figsize: tuple, default=(8, 6) -- Size of the figure in inches.

    • fontsize: int, default=14 -- Font size for the title of individual plots or subplots.

    • hspace: float, default=0.4 -- Height space between subplots.

    • wspace: float, default=0.4 -- Width space between subplots.

    • xticklabels_size: int, default=8 -- Font size for x-axis tick labels.

    • yticklabels_size: int, default=8 -- Font size for y-axis tick labels.

    • shrink: float, default=0.8 -- Fraction by which to shrink the colorbar.

    • cbarlabels_size: int, default=8 -- Font size for the colorbar labels.

    • nrow: int, default=varies (max 5) -- Number of rows for subplots. Default varies but the maximum is 5.

    • ncol: int or None, default=None -- Number of columns for subplots. Default varies but the maximum is 5.

    • suptitle_fontsize: float, default=0.7 -- Font size for the main title when subplot is True.

    • tight_layout: bool, default=True -- Use tight layout for subplots.

    • rect: list[int], default=[0, 0.03, 1, 0.95] -- Rectangle parameter for tight_layout when subplots are True. Fixes whitespace issues.

    • sharey: bool, default=True -- Share y-axis labels for subplots.

    • xlabel_rotation: int, default=0 -- Rotation angle for x-axis labels.

    • ylabel_rotation: int, default=0 -- Rotation angle for y-axis labels.

    • annot: bool, default=False -- Add values to cells.

    • annot_kws: dict, default=None -- Customize the annotations.

    • fmt: str, default=".2g" -- Modify how the annotated vales are presented.

    • linewidths: float, default=0 -- Padding between each cell in the plot.

    • borderwidths: float, default=0 -- Width of the border around the plot.

    • linecolor: str, default="black" -- Color of the line that seperates each cell.

    • edgecolors: str or None, default=None -- Color of the edges.

    • alpha: float or None, default=None -- Controls transparency and ranges from 0 (transparent) to 1 (opaque).

    • bbox_inches: str or None, default="tight" -- Alters size of the whitespace in the saved image.

    • hemisphere_labels: bool, default=False -- When visual_scope="nodes", shows simplified left/right hemisphere division line instead of individual node labels (only for Custom and Schaefer parcellations). For Custom parcellations, assumes labels are ordered such that all left hemisphere nodes come before right hemisphere nodes. Ignores edgecolors; linewidths/linecolor only affect division line.

    • cmap: str, callable default="coolwarm" -- Color map for the plot cells. Options include strings to call seaborn's pre-made palettes, seaborn.diverging_palette function to generate custom palettes, and matplotlib.color.LinearSegmentedColormap to generate custom palettes.

    • vmin: float or None, default=None -- The minimum value to display in colormap.

    • vmax: float or None, default=None -- The maximum value to display in colormap.

Returns:

self

Note

Parcellation Approach: the "nodes" and "regions" sub-keys are required in parcel_approach for this function.

Color Palettes: Refer to seaborn's Color Palettes for valid pre-made palettes.

caps2corr(output_dir=None, suffix_title=None, suffix_filename=None, show_figs=True, save_plots=True, return_df=False, save_df=False, **kwargs)[source]#

Generate Pearson Correlation Matrix for CAPs.

Produces a correlation matrix of all CAPs and visualizes it using seaborn.heatmap. Can also produce a pandas Dataframe of the correlation matrix where each element contains its uncorrected p-value in parenthesis, with a single asterisk if < 0.05, a double asterisk if < 0.01, and a triple asterisk < 0.001. Note, if groups were given when the CAP class was initialized, separate correlation matrices will be generated for all groups.

Parameters:
  • output_dir (os.PathLike or None, default=None) -- Directory to save plots (if save_plots is True) and correlation matrices DataFrames (if save_df is True). The directory will be created if it does not exist. If None, plots and dataFrame will not be saved.

  • suffix_title (str or None, default=None) -- Appended to the title of each plot.

  • suffix_filename (str or None, default=None) -- Appended to the filename of each saved plot if output_dir is provided.

  • show_figs (bool, default=True) -- Display figures.

  • save_plots (bool, default=True) -- If True, plots are saves as png images. For this to be used, output_dir must be specified.

  • return_df (bool, default=False) -- If True, returns a dictionary with a correlation matrix for each group.

  • save_df (bool, default=False,) -- If True, saves the correlation matrix contained in the DataFrames as csv files. For this to be used, output_dir must be specified.

  • **kwargs --

    Keyword arguments used when modifying figures. Valid keywords include:

    • dpi: int, default=300 -- Dots per inch for the figure.

    • figsize: tuple, default=(8, 6) -- Size of the figure in inches.

    • fontsize: int, default=14 -- Font size for the title each plot.

    • xticklabels_size: int, default=8 -- Font size for x-axis tick labels.

    • yticklabels_size: int, default=8 -- Font size for y-axis tick labels.

    • shrink: float, default=0.8 -- Fraction by which to shrink the colorbar.

    • cbarlabels_size: int, default=8 -- Font size for the colorbar labels.

    • xlabel_rotation: int, default=0 -- Rotation angle for x-axis labels.

    • ylabel_rotation: int, default=0 -- Rotation angle for y-axis labels.

    • annot: bool, default=False -- Add values to each cell.

    • annot_kws: dict, default=None -- Customize the annotations.

    • fmt: str, default=".2g" -- Modify how the annotated vales are presented.

    • linewidths: float, default=0 -- Padding between each cell in the plot.

    • borderwidths: float, default=0 -- Width of the border around the plot.

    • linecolor: str, default="black" -- Color of the line that seperates each cell.

    • edgecolors: str or None, default=None -- Color of the edges.

    • alpha: float or None, default=None -- Controls transparency and ranges from 0 (transparent) to 1 (opaque).

    • bbox_inches: str or None, default="tight" -- Alters size of the whitespace in the saved image.

    • cmap: str, callable default="coolwarm" -- Color map for the plot cells. Options include strings to call seaborn's pre-made palettes, seaborn.diverging_palette function to generate custom palettes, and matplotlib.color.LinearSegmentedColormap to generate custom palettes.

    • vmin: float or None, default=None -- The minimum value to display in colormap.

    • vmax: float or None, default=None -- The maximum value to display in colormap.

Returns:

dict[str, pd.DataFrame] -- An instance of a pandas DataFrame for each group.

Note

Color Palettes: Refer to seaborn's Color Palettes for valid pre-made palettes.

caps2niftis(output_dir, suffix_filename=None, fwhm=None, knn_dict=None, progress_bar=False)[source]#

Standalone Method to Convert CAPs to NifTI Statistical Maps.

Projects CAPs onto the parcellation in self.parcel_approach to create NifTI statistical maps by replacing parcellation labels with their corresponding CAP (cluster centroid) values. Creates compressed NifTI (.nii.gz) files. One image is generated per CAP. Note, if groups were given when the CAP class was initialized, separate NifTI images will be generated per CAP for all groups.

Parameters:
  • output_dir (os.PathLike) -- Directory to save nii.gz files. The directory will be created if it does not exist.

  • suffix_filename (str or None, default=None) -- Appended to the name of the saved file.

  • fwhm (float or None, default=None) -- Strength of spatial smoothing to apply (in millimeters) to the statistical map prior to interpolating from MNI152 space to fsLR surface space. Uses Nilearn's image.smooth_img.

  • knn_dict (dict[str, Union[int, bool]], default=None) --

    Use KNN (k-nearest neighbors) interpolation with reference atlas masking to fill in non-background coordinates that are assigned zero. Useful when custom parcellation does not project well from volumetric to surface space. The following sub-keys are recognized:

    • "k": An integer. Determines the number of nearest neighbors to consider. Default is 1.

    • "reference_atlas": A string. Specifies the atlas to use for reference masking ("AAL" or "Schaefer"). Default is "Schaefer".

    • "resolution_mm": An integer (1 or 2). Spatial resolution of the Schaefer parcellation (in millimeters). Default is 1.

    • "remove_labels": A list or array. The label IDs as integers of the regions in the parcellation to not interpolate.

    Note: This method is applied before the fwhm.

  • progress_bar (bool, default=False) --

    If True, displays a progress bar.

    Added in version 0.21.5.

Returns:

self

Note

Assumption: This function assumes that the background label for the parcellation is zero. Additionaly, the following approach is used to map each CAP onto the parcellation.

atlas = nib.load(atlas_file)
atlas_fdata = atlas.get_fdata()
# Create array of zeroes with same dimensions as atlas
atlas_array = np.zeros_like(atlas_fdata)

# Get array containing all labels in parcellation in order
target_array = sorted(np.unique(atlas_fdata))

# Start at 1 to avoid assigment to the background label
for indx, value in enumerate(cap_vector, start=1):
    atlas_array[atlas_fdata == target_array[indx]] = value
caps2surf(output_dir=None, suffix_title=None, suffix_filename=None, show_figs=True, fwhm=None, fslr_density='32k', method='linear', save_stat_maps=False, fslr_giftis_dict=None, knn_dict=None, progress_bar=False, **kwargs)[source]#

Project CAPs onto Surface Plots.

Plot CAPs on cortical surface in fsLR space. First, projects CAPs onto parcellation to create NifTI statistical maps by replacing parcellation labels with their corresponding CAP (cluster centroid) values. Then uses neuromap's transforms.mni152_to_fslr for coordinate system transformation and surfplot's Plot for plotting. If CAPs where already converted to NifTI (self.caps2niftis) and transformed to fsLR GifTI files externally, these can be provided using the fslr_giftis_dict parameter and will be converted to a suitable format for surfplot's Plot function by using neuromap's transforms.fslr_to_fslr function. Note, if groups were given when the CAP class was initialized, surface plots will be generated per CAP for all groups.

Parameters:
  • output_dir (os.PathLike or None, default=None) -- Directory to save plots as png files. The directory will be created if it does not exist. If None, plots will not be saved.

  • suffix_title (str or None, default=None) -- Appended to the title of each plot.

  • suffix_filename (str or None, default=None) -- Appended to the filename of each saved plot if output_dir is provided.

  • show_figs (bool, default=True) -- Display figures.

  • fwhm (float or None, defualt=None) -- Strength of spatial smoothing to apply (in millimeters) to the statistical map prior to interpolating from MNI152 space to fsLR surface space. Uses Nilearn's image.smooth.

  • fslr_density ({"4k", "8k", "32k", "164k"}, default="32k") -- Density of the fsLR surface when converting from MNI152 space to fsLR surface. Options are "32k" or "164k". If using fslr_giftis_dict options are "4k", "8k", "32k", and "164k".

  • method ({"linear", "nearest"}, default="linear") -- Interpolation method to use when converting from MNI152 space to fsLR surface or from fsLR to fsLR. Options are "linear" or "nearest".

  • save_stat_maps (bool, default=False) -- If True, saves the statistical map for each CAP for all groups as a Nifti1Image if output_dir is provided.

  • fslr_giftis_dict (dict or None, default=None) --

    Dictionary specifying precomputed GifTI files in fsLR space for plotting statistical maps. This parameter should be used if the statistical CAP NIfTI files (can be obtained using self.caps2niftis) were converted to GifTI files using a tool such as Connectome Workbench. The dictionary structure is:

    {
        "GroupName": {
            "CAP-Name": {
                "lh": "path/to/left_hemisphere_gifti",
                "rh": "path/to/right_hemisphere_gifti"
            }
        }
    }
    

    "GroupName" can be "All Subjects" or any specific group name. CAP-Name is the name of the CAP. This parameter allows plotting without re-running the analysis. Initialize the CAP class and use this method if using this parameter.

  • knn_dict (dict[str, Union[int, bool]], default=None) --

    Use KNN (k-nearest neighbors) interpolation with reference atlas masking to fill in non-background coordinates that are assigned zero. Useful when custom parcellation does not project well from volumetric to surface space. The following sub-keys are recognized:

    • "k": An integer. Determines the number of nearest neighbors to consider. Default is 1.

    • "reference_atlas": A string. Specifies the atlas to use for reference masking ("AAL" or "Schaefer"). Default is "Schaefer".

    • "resolution_mm": An integer (1 or 2). Spatial resolution of the Schaefer parcellation (in millimeters). Default is 1.

    • "remove_labels": A list or array. The label IDs as integers of the regions in the parcellation to not interpolate.

    Note: This method is applied before the fwhm.

  • progress_bar (bool, default=False) --

    If True, displays a progress bar.

    Added in version 0.21.5.

  • **kwargs --

    Additional parameters to pass to modify certain plot parameters. Options include:

    • dpi: int, default=300 -- Dots per inch for the plot.

    • title_pad: int, default=-3 -- Padding for the plot title.

    • cmap: str or callable, default="cold_hot" -- Colormap to be used for the plot.

    • cbar_kws: dict, default={"location": "bottom", "n_ticks": 3} -- Customize colorbar. Refer to _add_colorbars for surfplot.plotting.Plot in Surfplot's Plot Documentation for valid parameters.

    • alpha: float, default=1 -- Transparency level of the colorbar.

    • as_outline: bool, default=False -- Plots only an outline of contiguous vertices with the same value.

    • outline_alpha: float, default=1 -- Transparency level of the colorbar for outline if as_outline is True.

    • zero_transparent: bool, default=True -- Turns vertices with a value of 0 transparent.

    • size: tuple, default=(500, 400) -- Size of the plot in pixels.

    • layout: str, default="grid" -- Layout of the plot.

    • zoom: float, default=1.5 -- Zoom level for the plot.

    • views: {"lateral", "medial"} or list[{"lateral", "medial}], default=["lateral", "medial"] -- Views to be displayed in the plot.

    • brightness: float, default=0.5 -- Brightness level of the plot.

    • figsize: tuple or None, default=None -- Size of the figure.

    • scale: tuple, default=(2, 2) -- Scale factors for the plot.

    • surface: {"inflated", "veryinflated"}, default="inflated" -- The surface atlas that is used for plotting. Options are "inflated" or "veryinflated".

    • color_range: tuple or None, default=None -- The minimum and maximum value to display in plots. For instance, (-1, 1) where minimum value is first. If None, the minimum and maximum values from the image will be used.

    • bbox_inches: str or None, default="tight" -- Alters size of the whitespace in the saved image.

Returns:

self

Note

Parcellation Approach: parcel_approach must have the "maps" sub-key containing the path to th NifTI file of the parcellation.

Assumptions: This function assumes that the background label for the parcellation is zero and that is in MNI space. Additionally, the following approach is taken to map the each CAP onto the parcellation

atlas = nib.load(atlas_file)
atlas_fdata = atlas.get_fdata()
# Create array of zeroes with same dimensions as atlas
atlas_array = np.zeros_like(atlas_fdata)

# Get array containing all labels in parcellation in order
target_array = sorted(np.unique(atlas_fdata))

# Start at 1 to avoid assigment to the background label
for indx, value in enumerate(cap_vector, start=1):
    atlas_array[atlas_fdata == target_array[indx]] = value
caps2radar(output_dir=None, suffix_title=None, suffix_filename=None, show_figs=True, use_scatterpolar=False, as_html=False, **kwargs)[source]#

Generate Radar Plots for CAPs using Cosine Similarity.

Calculates the cosine similarity between the "High Amplitude" (positive/above the mean) and "Low Amplitude" (negative/below the mean) activations of the CAP cluster centroid and each a-priori region or network in a parcellation. This function assumes the mean for each ROI is 0 due to standardization.

Cosine similarity is computed separately for high and low amplitudes by comparing them to the binary vector of the a-priori region, representing the region or network of interest. This provides a measure of how closely the CAP's positive and negative activation patterns align with each selected region.

The process involves the following steps:

  1. Extract Cluster Centroids:

  • Each CAP is represented by a cluster centroid, which is a 1 x ROI (Region of Interest) vector.

  1. Generate Binary Vectors:

  • For each region create a binary vector (1 x ROI) where 1 indicates that the ROI is part of the specific region and 0 otherwise.

  • In this example, the binary vector acts as a 1D mask to isolate ROIs in the Visual Network by setting the corresponding indices to 1.

    import numpy as np
    
    # Define nodes with their corresponding label IDs
    nodes = ["LH_Vis1", "LH_Vis2", "LH_SomSot1", "LH_SomSot2",
             "RH_Vis1", "RH_Vis2", "RH_SomSot1", "RH_SomSot2"]
    
    # Binary mask for the Visual Network (Vis)
    binary_vector = np.array([1, 1, 0, 0, 1, 1, 0, 0])
    
  1. Isolate Positive and Negative Activations in CAP Centroid:

  • Positive activations are defined as the values in the CAP centroid that are greater than zero. These values represent the "High Amplitude" activations for that CAP.

  • Negative activations are defined as the values in the CAP centroid that are less than zero. These values represent the "Low Amplitude" activations for that CAP.

  • To simplify the comparison between positive and negative activations using cosine similarity, the negative activations are inverted (i.e., multiplied by -1). This inversion converts the negative values into positive ones, allowing the cosine similarity calculation to return a positive value. The positive similarity score represents how closely the a-priori region aligns with both the high and low amplitude aspects of the CAP.

    # Example cluster centroid for CAP 1
    cap_1_cluster_centroid = np.array([-0.3, 1.5, 2.0, -0.2, 0.7, 1.3, -0.5, 0.4])
    
    # Assign values less than 0 as 0 to isolate the high amplitude activations
    high_amp = np.where(cap_1_cluster_centroid > 0, cap_1_cluster_centroid, 0)
    
    # Assign values less than 0 as 0 to isolate the low amplitude activations; Also invert the sign
    low_amp = high_amp = np.where(cap_1_cluster_centroid < 0, -cap_1_cluster_centroid, 0)
    
  1. Calculate Cosine Similarity:

  • Normalize the dot product by the product of the Euclidean norms of the cluster centroid and the binary vector to obtain the cosine similarity:

    # Compute dot product between the binary vector with the positive and negative activations
    high_dot = np.dot(high_amp, binary_vector)
    low_dot = np.dot(low_amp, binary_vector)
    
    # Compute the norms
    high_norm = np.linalg.norm(high_amp)
    low_norm = np.linalg.norm(low_amp)
    bin_norm = np.linalg.norm(binary_vector)
    
    # Calculate cosine similarity
    high_cos = high_dot / (high_norm * bin_norm)
    low_cos = low_dot / (low_norm * bin_norm)
    
  1. Generate Radar Plots of Each CAPs:

  • Each radar plot visualizes the cosine similarity for both "High Amplitude" (positive) and "Low Amplitude" (negative) activations of the CAP. The cosine similarity values range from 0 to 1, representing how closely the a-priori region aligns with the positive and negative activations of the CAP centroid.

Note, if groups were given when the CAP class was initialized, separate radar plots will be generated per CAP for all groups.

Parameters:
  • output_dir (os.PathLike or None, default=None) -- Directory to save plots as png or html images. The directory will be created if it does not exist. If None, plots will not be saved.

  • suffix_title (str or None, default=None) -- Appended to the title of each plot.

  • suffix_filename (str or None, default=None) -- Appended to the filename of each saved plot if output_dir is provided.

  • show_figs (bool, default=True) -- Display figures. If the current Python session is non-interactive, then plotly.offline is used to generate an html file named "temp-plot.html", which opens each plot in the default browser.

  • use_scatterpolar (bool, default=False) -- Uses plotly.graph_objects.Scatterpolar instead of plotly.express.line_polar.

  • as_html (bool, default=False) -- When output_dir is specified, plots are saved as html images instead of png images.

  • **kwargs --

    Additional parameters to pass to modify certain plot parameters. Options include:

    • scale: int, default=2 -- If output_dir provided, controls resolution of image when saving. Serves a similar purpose as dpi.

    • savefig_options: dict[str], default={"width": 3, "height": 3, "scale": 1} -- If output_dir provided, controls the width (in inches), height (in inches), and scale of the plot.

    • height: int, default=800 -- Height of the plot.

    • width: int, defualt=1200 -- Width of the plot.

    • line_close: bool, default=True -- Whether to close the lines

    • bgcolor: str, default="white" -- Color of the background

    • scattersize: int, default=8 -- Controls size of the dots when markers are used.

    • connectgaps: bool, default=True -- If use_scatterpolar=True, controls if missing values are connected.

    • linewidth: int, default = 2 -- The width of the line connecting the values if use_scatterpolar=True.

    • opacity: float, default=0.5 -- If use_scatterpolar=True, sets the opacity of the trace.

    • fill: str, default="none" -- If "toself" the are of the dots and within the boundaries of the line will be filled.

    • mode: str, default="markers+lines" -- Determines how the trace is drawn. Can include "lines", "markers", "lines+markers", "lines+markers+text".

    • radialaxis: dict, default={"showline": False, "linewidth": 2, "linecolor": "rgba(0, 0, 0, 0.25)", "gridcolor": "rgba(0, 0, 0, 0.25)", "ticks": "outside", "tickfont": {"size": 14, "color": "black"}} -- Customizes the radial axis. Refer to Plotly's radialaxis Documentation or Plotly's polar Documentation for valid keys.

    • angularaxis: dict, default={"showline": True, "linewidth": 2, "linecolor": "rgba(0, 0, 0, 0.25)", "gridcolor": "rgba(0, 0, 0, 0.25)", "tickfont": {"size": 16, "color": "black"}} -- Customizes the angular axis. Refer to Plotly's angularaxis Documentation or Plotly's polar Documentation for valid keys.

    • color_discrete_map: dict, default={"High Amplitude": "red", "Low Amplitude": "blue"} -- Change the color of the "High Amplitude" and "Low Amplitude" groups. Must use the keys "High Amplitude" and "Low Amplitude".

    • title_font: dict, default={"family": "Times New Roman", "size": 30, "color": "black"} -- Modifies the font of the title. Refer to Plotly's layout Documentation for valid keys.

    • title_x: float, default=0.5 -- Modifies x position of title.

    • title_y: float, default=None -- Modifies y position of title.

    • legend: dict, default={"yanchor": "top", "xanchor": "left", "y": 0.99, "x": 0.01, "title_font_family": "Times New Roman", "font": {"size": 12, "color": "black"}} -- Customizes the legend. Refer to Plotly's layout Documentation for valid keys

    • engine: {"kaleido", "orca"}, default="kaleido" -- Engine used for saving plots.

Returns:

self

Note

Handling Division by Zero: NumPy automatically handles division by zero errors. This may occur if the network or the "High Amplitude" or "Low Amplitude" vectors are all zeroes. In such cases, NumPy assigns NaN to the cosine similarity for the affected network(s), indicating that the similarity is undefined. Plotly is capable of handling NaN values.

Parcellation Approach: If using "Custom" for parcel_approach the "regions" sub-key is required.

Saving Plots: By default, this function uses "kaleido" (which is also a dependency in this package) to save plots. For other engines such as "orca", those packages must be installed seperately.

Tick Values: if the tickvals or range sub-keys in this code are not specified in the radialaxis kwarg, then four values are shown - 0.25*(max value), 0.50*(max value), 0.75*(max value), and the max value. These values are also rounded to the second decimal place.

References

Zhang, R., Yan, W., Manza, P., Shokri-Kojori, E., Demiral, S. B., Schwandt, M., Vines, L., Sotelo, D., Tomasi, D., Giddens, N. T., Wang, G., Diazgranados, N., Momenan, R., & Volkow, N. D. (2023). Disrupted brain state dynamics in opioid and alcohol use disorder: attenuation by nicotine use. Neuropsychopharmacology, 49(5), 876–884. https://doi.org/10.1038/s41386-023-01750-w

Ingwersen, T., Mayer, C., Petersen, M., Frey, B. M., Fiehler, J., Hanning, U., Kühn, S., Gallinat, J., Twerenbold, R., Gerloff, C., Cheng, B., Thomalla, G., & Schlemm, E. (2024). Functional MRI brain state occupancy in the presence of cerebral small vessel disease — A pre-registered replication analysis of the Hamburg City Health Study. Imaging Neuroscience, 2, 1–17. https://doi.org/10.1162/imag_a_00122