neurocaps.extraction.TimeseriesExtractor

class TimeseriesExtractor(space='MNI152NLin2009cAsym', parcel_approach={'Schaefer': {'n_rois': 400, 'resolution_mm': 1, 'yeo_networks': 7}}, standardize='zscore_sample', detrend=True, low_pass=None, high_pass=None, fwhm=None, use_confounds=True, confound_names='basic', fd_threshold=None, n_acompcor_separate=None, dummy_scans=None, dtype=None)

Timeseries Extractor Class.

Performs timeseries denoising, extraction, serialization (pickling), and BOLD visualization.

Parameters:
  • space (str, default=“MNI152NLin2009cAsym”) – The standard template space that the preprocessed BOLD data is registered to. Used for querying with PyBIDS to locate preprocessed BOLD-related files.

  • parcel_approach (ParcelConfig, ParcelApproach, or str, default={“Schaefer”: {“n_rois”: 400, “yeo_networks”: 7, “resolution_mm”: 1}}) –

    The approach used to parcellate NIfTI images into distinct regions of interest (ROIs).

    To initialize a parcel_approach, the configuration requires a nested dictionary with:

    • First Level Key: The parcellation name (“Schaefer”, “AAL”, or “Custom”).

    • Second Level Keys: Parameters specific to each parcellation method.

    Supported parcellation approaches and their parameters include:

    • ”Schaefer”:

      • ”n_rois”: Number of ROIs (Default=400). Options are 100, 200, 300, 400, 500, 600, 700, 800, 900, or 1000.

      • ”yeo_networks”: Number of Yeo networks (Default=7). Options are 7 or 17.

      • ”resolution_mm”: Spatial resolution in millimeters (Default=1). Options are 1 or 2.

    • ”AAL”:

      • ”version”: AAL parcellation version to use (Default=”SPM12” if {"AAL": {}} is given). Options are “SPM5”, “SPM8”, “SPM12”, or “3v2”.

    • ”Custom” (user-defined):

      • ”maps”: Directory path to the location of the parcellation file.

      • ”nodes”: A list of node names in the order of the label IDs in the parcellation.

      • ”regions”: The regions or networks in the parcellation.
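As an illustration of the nested-dictionary layout, the “Schaefer” and “AAL” configurations could be written as below. The check_config helper is hypothetical and only compares values against the option lists given above; it is not part of NeuroCAPs. The constructor call is left as a comment so the sketch runs without the package installed.

```python
# Allowed values, copied from the option lists above.
SCHAEFER_OPTIONS = {
    "n_rois": set(range(100, 1001, 100)),
    "yeo_networks": {7, 17},
    "resolution_mm": {1, 2},
}
AAL_VERSIONS = {"SPM5", "SPM8", "SPM12", "3v2"}

# First-level key: parcellation name; second-level keys: its parameters.
schaefer_config = {"Schaefer": {"n_rois": 400, "yeo_networks": 7, "resolution_mm": 1}}
aal_config = {"AAL": {"version": "SPM12"}}


def check_config(config):
    """Hypothetical helper: True if every given value is a documented option."""
    name, params = next(iter(config.items()))
    if name == "Schaefer":
        return all(
            params[key] in allowed
            for key, allowed in SCHAEFER_OPTIONS.items()
            if key in params
        )
    if name == "AAL":
        # An empty AAL dictionary falls back to "SPM12", as documented above.
        return params.get("version", "SPM12") in AAL_VERSIONS
    return name == "Custom"


schaefer_ok = check_config(schaefer_config)
aal_ok = check_config(aal_config)

# With NeuroCAPs installed, either dictionary would be passed as:
# TimeseriesExtractor(parcel_approach=schaefer_config)
```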


  • standardize ({"zscore_sample", "zscore", "psc", True, False}, default="zscore_sample") –

    Standardizes the timeseries.

    Note: Refer to nilearn.maskers.NiftiLabelsMasker for an explanation of each available option.

  • detrend (bool, default=True) – Detrends the timeseries.

  • low_pass (float, int, or None, default=None) – Filters out signals above the specified cutoff frequency.

  • high_pass (float, int, or None, default=None) – Filters out signals below the specified cutoff frequency.

  • fwhm (float, int, or None, default=None) – Applies spatial smoothing to data (in millimeters).

  • use_confounds (bool, default=True) –

    If True, performs nuisance regression during timeseries extraction using the default or user-specified confounds in confound_names.

    Note: Requires the confound TSV files to be in the same directory as the preprocessed BOLD images.

  • confound_names ({“basic”}, list[str], or None, default=”basic”) –

    Names of confounds extracted from the confound tsv files if use_confounds=True.

    If “basic”, the following confounds are used by default:

    • All cosine-basis parameters.

    • Six head-motion parameters and their first-order derivatives.

    • First six combined aCompCor components.

    Notes:

    • Confound names follow fMRIPrep’s naming scheme (versions >= 1.2.0).

    • Wildcards are supported: e.g., “cosine*” matches all confounds starting with “cosine”.

    Changed in version 0.23.0: Changed default from None to "basic". The "basic" option provides the same functionality that None did in previous versions.
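To make the wildcard behaviour concrete, the sketch below filters a list of confound column names with the standard-library fnmatch module. The column names are hypothetical examples in fMRIPrep's naming style, and fnmatch is an assumption made for illustration; the matching logic used internally by NeuroCAPs may differ.

```python
from fnmatch import fnmatch

# Hypothetical column names, following fMRIPrep's naming scheme.
confound_columns = [
    "cosine00", "cosine01",
    "trans_x", "trans_x_derivative1",
    "rot_z", "rot_z_derivative1",
    "a_comp_cor_00", "a_comp_cor_01",
    "framewise_displacement",
]

# A user-supplied list mixing exact names and a wildcard pattern.
confound_names = ["cosine*", "trans_x", "a_comp_cor_00"]


def select_confounds(columns, patterns):
    """Keep the columns that match at least one pattern, in file order."""
    return [col for col in columns if any(fnmatch(col, pat) for pat in patterns)]


selected = select_confounds(confound_columns, confound_names)
print(selected)
```

Here “cosine*” picks up both cosine columns, while the exact name “trans_x” does not pull in “trans_x_derivative1”.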

  • fd_threshold (float, dict[str, float | int], or None, default=None) –

    Threshold for volume censoring based on framewise displacement (FD).

    • If float, removes volumes where FD > threshold.

    • If dict, the following subkeys are available:

      • ”threshold”: A float (Default=None). Removes volumes where FD > threshold.

      • “outlier_percentage”: A float in the interval [0,1] (Default=None). Removes an entire run when the proportion of censored volumes exceeds this threshold. The proportion is calculated after dummy scan removal, and a warning is issued when a run is flagged. If a condition is specified in self.get_bold(), only volumes associated with that condition are considered.

      • “n_before”: An integer indicating the number of volumes to remove before each flagged volume (Default=None). For instance, if volume 5 is flagged and {"n_before": 2}, then volumes 3, 4, and 5 are discarded.

      • “n_after”: An integer indicating the number of volumes to remove after each flagged volume (Default=False). For instance, if volume 5 is flagged and {"n_after": 2}, then volumes 5, 6, and 7 are discarded.

      • ”use_sample_mask”: A boolean (Default=False). If True, censors before nuisance regression using Nilearn’s NiftiLabelsMasker. Also, sets clean__extrapolate=False to prevent interpolation of end volumes. If False, censors after nuisance regression.

      • ”interpolate”: A boolean (Default=None). If True, uses scipy’s CubicSpline function with extrapolate=False to perform cubic spline interpolation on censored frames that are not at the ends of the timeseries. For example, given a censor_mask=[0, 0, 1, 0, 1, 0, 0, 1, 0, 1, 0, 0] where “0” indicates censored volumes, only the volumes at index 3, 5, 6, and 8 would be interpolated. When False or None (default behavior), no interpolation is performed and all censored frames are discarded.

        Added in version 0.22.3: “interpolate” key added.
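The interaction of “threshold”, “n_before”, “n_after”, and “outlier_percentage” can be sketched in plain Python. The FD values below are made up, and censor_mask is a hypothetical helper written for this illustration; it mirrors the rules described above rather than the library's actual code.

```python
fd_threshold = {"threshold": 0.35, "outlier_percentage": 0.30, "n_before": 2, "n_after": 1}


def censor_mask(fd_values, threshold, n_before=0, n_after=0):
    """Return a list of booleans where False marks a censored volume."""
    n_volumes = len(fd_values)
    keep = [True] * n_volumes
    for idx, fd in enumerate(fd_values):
        if fd > threshold:
            # Extend the censored window around the flagged volume.
            start = max(0, idx - n_before)
            stop = min(n_volumes, idx + n_after + 1)
            for j in range(start, stop):
                keep[j] = False
    return keep


# Hypothetical FD trace (mm) for a ten-volume run; only volume 5 exceeds 0.35.
fd = [0.10, 0.12, 0.08, 0.20, 0.15, 0.90, 0.11, 0.09, 0.14, 0.10]

mask = censor_mask(
    fd,
    threshold=fd_threshold["threshold"],
    n_before=fd_threshold["n_before"],
    n_after=fd_threshold["n_after"],
)
censored = [i for i, kept in enumerate(mask) if not kept]
proportion_censored = len(censored) / len(mask)

# The run would be dropped if the proportion exceeds "outlier_percentage".
drop_run = proportion_censored > fd_threshold["outlier_percentage"]
```

With these values, volumes 3 through 6 are censored (two before and one after the flagged volume), so 40% of the run is censored and the run would be flagged.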

    Notes:

    • A column named “framewise_displacement” must be available in the confounds file.

    • use_confounds must be set to True.

    • Do not specify “framewise_displacement” in confound_names.

    • See Nilearn’s documentation for details on censored volume handling:

    • When {"use_sample_mask": False} and standardize=True, applying an additional within-run standardization (using neurocaps.analysis.standardize) is recommended after outlier removal.

    • If {"interpolate": True}, interpolation is applied only after the nuisance regression and parcellation steps have been completed. It is also applied before the condition is extracted from the timeseries.

    • See Scipy’s documentation on their CubicSpline function.
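The rule that only censored frames away from the ends of the timeseries are interpolated can be checked with a short sketch. interpolated_indices is a hypothetical helper that only identifies the eligible indices; the interpolation itself (SciPy's CubicSpline, per the description above) is not reproduced here.

```python
def interpolated_indices(censor_mask):
    """Return censored indices that lie between two retained volumes."""
    retained = [i for i, keep in enumerate(censor_mask) if keep]
    if not retained:
        return []
    first, last = retained[0], retained[-1]
    return [
        i for i, keep in enumerate(censor_mask)
        if not keep and first < i < last
    ]


# Mask from the "interpolate" description: 0 marks a censored volume.
example_mask = [0, 0, 1, 0, 1, 0, 0, 1, 0, 1, 0, 0]
eligible = interpolated_indices(example_mask)
print(eligible)  # [3, 5, 6, 8]
```

The censored volumes at indices 0, 1, 10, and 11 sit outside the first and last retained volumes, so they would be discarded rather than interpolated.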

  • n_acompcor_separate (int or None, default=None) –

    Number of aCompCor components to extract separately from the white-matter (WM) and CSF masks. Uses the first “n” components from each mask separately. For instance, if n_acompcor_separate=5, then the first 5 WM components and the first 5 CSF components (totaling 10 components) are regressed out.

    Notes:

    • use_confounds must be set to True.

    • If specified, this parameter overrides any aCompCor components listed in confound_names.
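A minimal sketch of the “first n per mask” selection is given below. The mapping from component name to mask is hypothetical and stands in for whatever metadata accompanies the confounds file; first_n_per_mask is likewise an illustrative helper, not a NeuroCAPs function.

```python
# Hypothetical component metadata, keyed by confound column name.
component_masks = {
    "a_comp_cor_00": "combined",
    "a_comp_cor_01": "combined",
    "a_comp_cor_02": "CSF",
    "a_comp_cor_03": "CSF",
    "a_comp_cor_04": "CSF",
    "a_comp_cor_05": "WM",
    "a_comp_cor_06": "WM",
    "a_comp_cor_07": "WM",
}


def first_n_per_mask(masks, n):
    """Pick the first n components from the CSF and WM masks separately."""
    selected = []
    for mask_name in ("CSF", "WM"):
        names = sorted(col for col, mask in masks.items() if mask == mask_name)
        selected.extend(names[:n])
    return selected


n_acompcor_separate = 2
selected_components = first_n_per_mask(component_masks, n_acompcor_separate)
print(selected_components)
```

With n_acompcor_separate=2, two CSF and two WM components are selected (four in total), and the “combined” components are ignored.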

  • dummy_scans (int, dict[str, bool | int], or None, default=None) –

    Number of initial volumes to remove before timeseries extraction.

    • If int, removes first “n” volumes.

    • If dict, the following keys are supported:

      • “auto”: A boolean (Default=None). If True, automatically determines the number of dummy scans from the fMRIPrep confounds file by counting the “non_steady_state_outlier_XX” columns. For instance, if two such columns are found, then the first two volumes are removed.

      • ”min”: An integer (Default=None). Minimum volumes to remove when auto is set to True. If “auto” finds 2 outliers but {"min": 3}, removes 3 volumes.

      • ”max”: An integer (Default=None). Maximum volumes to remove when auto=True. If “auto” finds 6 outliers but {"max": 5}, removes 5 volumes.

    Note: “min” and “max” keys only apply when “auto” is True.
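The “auto”, “min”, and “max” rules amount to counting columns and clamping the result. The sketch below assumes a hypothetical list of confound column names and matches on the “non_steady_state_outlier” prefix; n_dummy_scans is an illustrative helper rather than the library's implementation.

```python
def n_dummy_scans(confound_columns, min_scans=None, max_scans=None):
    """Count non-steady-state columns, then apply the optional bounds."""
    n = sum(col.startswith("non_steady_state_outlier") for col in confound_columns)
    if min_scans is not None:
        n = max(n, min_scans)
    if max_scans is not None:
        n = min(n, max_scans)
    return n


# Hypothetical confounds header with two non-steady-state outlier columns.
columns = ["trans_x", "non_steady_state_outlier_00", "non_steady_state_outlier_01"]

dummy_scans = {"auto": True, "min": 3}
n_remove = n_dummy_scans(
    columns,
    min_scans=dummy_scans.get("min"),
    max_scans=dummy_scans.get("max"),
)
print(n_remove)  # 3: two columns found, raised to the minimum of 3
```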

  • dtype (str or “auto”, default=None) – The NumPy dtype the NIfTI images are converted to when passed to Nilearn’s load_img function.
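Putting the parameters together, one possible set of keyword arguments is sketched below. The numeric values (filter cutoffs, smoothing kernel, FD threshold) are arbitrary illustrations, not recommendations, and the constructor call is left as a comment so the snippet runs without NeuroCAPs installed.

```python
# Keyword arguments mirroring the parameters documented above.
extractor_kwargs = {
    "space": "MNI152NLin2009cAsym",
    "parcel_approach": {"Schaefer": {"n_rois": 400, "yeo_networks": 7, "resolution_mm": 1}},
    "standardize": "zscore_sample",
    "detrend": True,
    "low_pass": 0.15,   # illustrative cutoff
    "high_pass": 0.01,  # illustrative cutoff
    "fwhm": 6,          # illustrative kernel size in millimeters
    "use_confounds": True,  # required for fd_threshold and confound_names
    "confound_names": "basic",
    "fd_threshold": {"threshold": 0.35, "outlier_percentage": 0.30},
    "dummy_scans": {"auto": True, "min": 2, "max": 5},
    "dtype": "auto",
}

# With NeuroCAPs installed, the extractor would be created with:
# from neurocaps.extraction import TimeseriesExtractor
# extractor = TimeseriesExtractor(**extractor_kwargs)
```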

Properties

space: str

The standard template space that the preprocessed BOLD data is registered to. The space can also be set after class initialization using self.space = "New Space" if the template space needs to be changed.

parcel_approach: ParcelApproach

A dictionary containing information about the parcellation. Can also be used as a setter, which accepts a dictionary or a dictionary saved as a pickle file. If “Schaefer” or “AAL” was specified during initialization of the TimeseriesExtractor class, then nilearn.datasets.fetch_atlas_schaefer_2018 or nilearn.datasets.fetch_atlas_aal, respectively, is used to obtain the “maps” and the “nodes”. String splitting is then applied to the “nodes” to obtain the “regions”:

# Structure of Schaefer
{
    "Schaefer":
    {
        "maps": "path/to/parcellation.nii.gz",
        "nodes": ["LH_Vis1", "LH_SomMot1", "RH_Vis1", "RH_SomMot1"],
        "regions": ["Vis", "SomMot"]
    }
}

# Structure of AAL
{
    "AAL":
    {
        "maps": "path/to/parcellation.nii.gz",
        "nodes": ["Precentral_L", "Precentral_R", "Frontal_Sup_L", "Frontal_Sup_R"],
        "regions": ["Precentral", "Frontal"]
    }
}

Refer to the example for “Custom” in the Note section below for the expected structure.
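The string splitting mentioned above could look roughly like the following. The splitting rules here are guesses chosen to reproduce the kind of “nodes” to “regions” mapping shown in the structures above; the exact rules used by NeuroCAPs are not documented in this section, and the helper names are hypothetical.

```python
import re


def schaefer_region(node):
    """'LH_Vis1' -> 'Vis': drop the hemisphere prefix and trailing digits."""
    return re.sub(r"\d+$", "", node.split("_", 1)[1])


def aal_region(node):
    """'Frontal_Sup_L' -> 'Frontal': keep the text before the first underscore."""
    return node.split("_", 1)[0]


def unique_in_order(items):
    """Remove duplicates while preserving first-seen order."""
    return list(dict.fromkeys(items))


schaefer_nodes = ["LH_Vis1", "LH_Default1", "RH_Vis1", "RH_Default1"]
aal_nodes = ["Precentral_L", "Precentral_R", "Frontal_Sup_L", "Frontal_Sup_R"]

schaefer_regions = unique_in_order(schaefer_region(n) for n in schaefer_nodes)
aal_regions = unique_in_order(aal_region(n) for n in aal_nodes)
```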

signal_clean_info: dict[str, bool | int | float | str] or None

Dictionary containing parameters for signal cleaning specified during initialization of the TimeseriesExtractor class. This information includes standardize, detrend, low_pass, high_pass, fwhm, dummy_scans, use_confounds, n_acompcor_separate, and fd_threshold.

task_info: dict[str, str | int] or None

If self.get_bold() has been run, a dictionary containing all task-related information such as task, condition, session, runs, and tr (if specified); otherwise None.

subject_ids: list[str] or None

A list containing all subject IDs that have been retrieved from PyBIDS and subjected to timeseries extraction.

n_cores: int or None

Number of cores used for multiprocessing with Joblib.

subject_timeseries: SubjectTimeseries or None

A dictionary mapping subject IDs to their run IDs and their associated timeseries (TRs x ROIs) as a NumPy array. Can also be a path to a pickle file containing this same structure. If this property needs to be deleted due to memory issues, del self.subject_timeseries can be used to delete this property and only have it return None. The structure is as follows:

subject_timeseries = {
    "101": {
        "run-0": np.array([...]),  # Shape: TRs x ROIs
        "run-1": np.array([...]),  # Shape: TRs x ROIs
        "run-2": np.array([...]),  # Shape: TRs x ROIs
    },
    "102": {
        "run-0": np.array([...]),  # Shape: TRs x ROIs
        "run-1": np.array([...]),  # Shape: TRs x ROIs
    },
}
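As a rough illustration of working with this structure, the sketch below walks the subject, run, and timeseries levels and round-trips the dictionary through a pickle file. Nested lists stand in for the NumPy arrays so the snippet needs only the standard library, and the use of the pickle module directly is an assumption; the serialization details inside NeuroCAPs may differ.

```python
import os
import pickle
import tempfile

# Nested lists stand in for NumPy arrays of shape TRs x ROIs (here 3 x 2).
example_timeseries = {
    "101": {"run-0": [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]},
    "102": {"run-0": [[0.7, 0.8], [0.9, 1.0], [1.1, 1.2]]},
}

# Walk the subject -> run -> timeseries hierarchy and record each shape.
shapes = {
    (subj_id, run_id): (len(ts), len(ts[0]))
    for subj_id, runs in example_timeseries.items()
    for run_id, ts in runs.items()
}

# Round-trip through a pickle file in a temporary directory.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "subject_timeseries.pkl")
    with open(path, "wb") as f:
        pickle.dump(example_timeseries, f)
    with open(path, "rb") as f:
        restored = pickle.load(f)
```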

See also

neurocaps.typing.ParcelConfig

Type definition representing the configuration options and structure for the Schaefer and AAL parcellations.

neurocaps.typing.ParcelApproach

Type definition representing the structure of the Schaefer, AAL, and Custom parcellation approaches.

neurocaps.typing.SubjectTimeseries

Type definition representing the structure of the subject timeseries.

Note

Passed Parameters: standardize, detrend, low_pass, high_pass, fwhm, and nuisance regression (confound_names) use nilearn.maskers.NiftiLabelsMasker. The dtype parameter is used by nilearn.image.load_img. For framewise displacement, if the “use_sample_mask” key is set to True in the fd_threshold dictionary, then a boolean sample mask is generated (setting indices corresponding to high-motion volumes to False) and is passed to the sample_mask parameter in nilearn.maskers.NiftiLabelsMasker.

Custom Parcellations: If using a “Custom” parcellation approach, ensure that the parcellation is lateralized (where each region/network has nodes in the left and right hemisphere). This is due to certain visualization functions assuming that each region consists of left and right hemisphere nodes. Additionally, certain visualization functions in this class also assume that the background label is 0. Therefore, do not add a background label in the “nodes” or “regions” keys.

The recognized subkeys for the “Custom” parcellation approach include:

  • “maps”: Directory path containing the parcellation in a supported format (e.g., .nii or .nii.gz for NIfTI).

  • “nodes”: A list or numpy array of all node labels arranged in ascending order based on their numerical IDs from the parcellation. The 0th index should contain the label corresponding to the lowest, non-background numerical ID.

  • “regions”: A dictionary defining major brain regions or networks, with each region containing “lh” (left hemisphere) and “rh” (right hemisphere) subkeys listing node indices.

Refer to NeuroCAPs’ Parcellation Documentation for more detailed explanations and example structures for the “nodes” and “regions” subkeys.

Note: Different subkeys are required depending on the function used. Refer to the Note section under each function for information regarding the subkeys required for that specific function.
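A small “Custom” configuration following the subkey descriptions above might look like this. The file path, node names, and region names are placeholders, and the two checking helpers are hypothetical conveniences for this sketch, not NeuroCAPs functions.

```python
# Hypothetical four-node parcellation; the file path is a placeholder.
custom_parcel_approach = {
    "Custom": {
        "maps": "path/to/parcellation.nii.gz",
        # Index 0 holds the label with the lowest non-background ID.
        "nodes": ["LH_Vis1", "LH_Default1", "RH_Vis1", "RH_Default1"],
        # Each region lists its node indices per hemisphere.
        "regions": {
            "Vis": {"lh": [0], "rh": [2]},
            "Default": {"lh": [1], "rh": [3]},
        },
    }
}


def is_lateralized(parcel):
    """Check that every region has non-empty 'lh' and 'rh' index lists."""
    regions = parcel["Custom"]["regions"]
    return all(bool(r.get("lh")) and bool(r.get("rh")) for r in regions.values())


def covered_indices(parcel):
    """Collect every node index referenced by the regions, sorted."""
    regions = parcel["Custom"]["regions"]
    return sorted(
        i for r in regions.values() for side in ("lh", "rh") for i in r[side]
    )


lateralized = is_lateralized(custom_parcel_approach)
n_nodes = len(custom_parcel_approach["Custom"]["nodes"])
all_nodes_assigned = covered_indices(custom_parcel_approach) == list(range(n_nodes))
```

No background label appears in “nodes” or “regions”, in line with the assumption that the background ID is 0.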

Methods

get_bold(bids_dir, task[, session, runs, ...])

Retrieve Preprocessed BOLD Data from BIDS Datasets.

timeseries_to_pickle(output_dir[, filename])

Save the Extracted Subject Timeseries.

visualize_bold(subj_id, run[, roi_indx, ...])

Plot the Extracted Subject Timeseries.