mvpa2.datasets.sources.openfmri.OpenFMRIDataset

class mvpa2.datasets.sources.openfmri.OpenFMRIDataset(basedir)

Handler for datasets following the openfmri.org layout specifications

At present, this handler provides methods to query a number of dataset properties, access BOLD images of individual acquisition runs, build datasets from individual BOLD images, and load stimulation design specifications for individual runs.

Methods

`get_anatomy_image`(subj[, path, fname])	Return a NiBabel image instance for a structural image of a subject.
`get_bold_run_dataset`(subj, task, run[, ...])	Return a dataset instance for the BOLD data of a particular subject/task/run combination.
`get_bold_run_ids`(subj, task)	Return (sorted) list of run IDs for a given subject and task
`get_bold_run_image`(subj, task, run[, flavor])	Return a NiBabel image instance for the BOLD data of a particular subject/task/run combination.
`get_bold_run_model`(model, subj, run)	Return the stimulation design for a particular subject/task/run.
`get_bold_run_motion_estimates`(subj, task, run)	Return the volume-wise motion estimates for a particular BOLD run
`get_model_bold_dataset`(model_id, subj_id[, ...])	Build a PyMVPA dataset for a model defined in the OpenFMRI dataset
`get_model_conditions`(model)	Return a description of all conditions for a given model
`get_model_contrasts`(model)	Return the defined contrasts for a model
`get_model_descriptions`()	Return a dictionary with the models described in the dataset
`get_model_ids`()	Return a sorted list of integer IDs for all available models
`get_scan_properties`()	Return a dictionary with the scan properties listed in scan_key.txt
`get_subj_ids`()	Return a (sorted) list of IDs for all subjects in the dataset
`get_task_bold_attributes`(task, fname, loadfx)	Return data attributes for all BOLD data from a specific task.
`get_task_bold_run_ids`(task)	Return a dictionary with run IDs by subjects for a given task
`get_task_descriptions`()	Return a dictionary with the tasks defined in the dataset

Parameters:

basedir : path

Path to the dataset (i.e. the directory with the ‘sub*’ subdirectories).
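The query methods listed above can be combined to get an overview of a dataset. The following is a minimal sketch, not part of the library: `collect_run_index` and `FakeHandler` are hypothetical names, and the stand-in class only exists so the example runs without a dataset on disk. It assumes nothing beyond the documented return types of `get_task_descriptions()` and `get_task_bold_run_ids()`.

```python
def collect_run_index(handler):
    """Map each task ID to {subject ID: [run IDs]} using only query methods."""
    index = {}
    for task_id in sorted(handler.get_task_descriptions()):
        index[task_id] = handler.get_task_bold_run_ids(task_id)
    return index


class FakeHandler:
    """Hypothetical stand-in with made-up data; not the real handler class."""

    def get_task_descriptions(self):
        return {1: "object viewing"}

    def get_task_bold_run_ids(self, task):
        return {1: [1, 2], 2: [1, 2]}


run_index = collect_run_index(FakeHandler())
# With PyMVPA installed, the handler would instead be an
# OpenFMRIDataset(basedir) instance pointing at a real dataset directory.
```

Because the helper only relies on the documented method names, the same function should work unchanged when given a real handler instance.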

get_anatomy_image(subj, path=None, fname='highres001.nii.gz')

Return a NiBabel image instance for a structural image of a subject.

Parameters:

subj : int

Subject identifier.

path : list or None

Path to the structural file within the anatomy/ tree.

fname : str

Access a particular anatomy data flavor via its filename (see dataset description). Defaults to the first T1-weighted image.

Returns:

NiBabel Nifti1Image

get_bold_run_dataset(subj, task, run, flavor=None, preproc_img=None, add_sa=None, **kwargs)

Return a dataset instance for the BOLD data of a particular subject/task/run combination.

This method supports the same functionality as fmri_dataset(), while wrapping get_bold_run_image() to access the input fMRI data. Additional attributes, such as subject ID, task ID, and run ID, are automatically stored as dataset sample attributes.

Parameters:

subj : int

Subject identifier.

task : int

Task ID (see task_key.txt)

run : int

Run ID.

flavor : None or str

BOLD data flavor to access (see dataset description). If flavor corresponds to an existing file in the respective task/run directory, it is assumed to be a stored dataset in HDF5 format and loaded via h5load() – otherwise datasets are constructed from NIfTI images.

preproc_img : callable or None

If not None, this callable will be called with the loaded source BOLD image instance as an argument before fmri_dataset() is executed. The callable must return an image instance.

add_sa : str or tuple(str)

A single name or a sequence of names of files in the respective BOLD directory that contain additional sample attributes. At this time, all formats supported by NumPy’s loadtxt() are supported. The number of lines in such a file needs to match the number of BOLD volumes. Each column is converted into a separate dataset sample attribute. The file name with a column index suffix is used to determine the attribute name.

**kwargs:

All additional arguments are passed on to fmri_dataset()

Returns:

Dataset
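The column-to-attribute conversion described for add_sa can be illustrated with a small sketch on plain Python lists. This is not the library's implementation: `columns_to_attributes` is a hypothetical helper, and the `'<fname>_<column index>'` naming scheme is an assumption, since the text above only states that the file name and a column index suffix determine the attribute name.

```python
def columns_to_attributes(fname, lines, n_volumes):
    """Split a whitespace-separated table into one attribute per column."""
    rows = [[float(tok) for tok in line.split()] for line in lines if line.strip()]
    # The number of lines has to match the number of BOLD volumes.
    if len(rows) != n_volumes:
        raise ValueError("expected %d lines, got %d" % (n_volumes, len(rows)))
    # Assumed naming scheme: '<fname>_<column index>' (hypothetical).
    return {"%s_%d" % (fname, i): list(col) for i, col in enumerate(zip(*rows))}


attrs = columns_to_attributes(
    "bold_moest.txt", ["0.0 1.0", "0.5 1.5", "1.0 2.0"], n_volumes=3
)
```

Each resulting list has one value per volume, which is the shape a sample attribute needs.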

get_bold_run_ids(subj, task)

Return (sorted) list of run IDs for a given subject and task

Typically, run IDs are integer values, but string IDs are supported as well.

Parameters:

subj : int or str

Subject ID

task : int or str

Task ID

get_bold_run_image(subj, task, run, flavor=None)

Return a NiBabel image instance for the BOLD data of a particular subject/task/run combination.

Parameters:

subj : int

Subject identifier.

task : int

Task ID (see task_key.txt)

run : int

Run ID.

flavor : None or str

BOLD data flavor to access (see dataset description)

get_bold_run_model(model, subj, run)

Return the stimulation design for a particular subject/task/run.

Parameters:

model : int

Model identifier.

subj : int

Subject identifier.

run : int

Run ID.

Returns:

list

One item per event in the run. All items are dictionaries with the following keys: ‘condition’, ‘onset’, ‘duration’, ‘intensity’, ‘run’, ‘task’, ‘trial_idx’, ‘ctrial_idx’, where the first is a literal label, the last four are integer IDs, and the rest are typically floating point values. ‘onset_idx’ is the index of the event specification sorted by time across the entire run (typically corresponding to a trial index). ‘conset_idx’ is analogous, but contains the onset index per condition, i.e. the nth trial of the respective condition in a run.
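The difference between the run-wide index and the per-condition index can be shown with a short, self-contained sketch. The event values are made up, the key names `onset_idx` and `conset_idx` are taken from the description above, and the code is only one way such indices could be derived, not the library's implementation.

```python
from collections import defaultdict

# Hypothetical events for one run (values are illustrative only).
events = [
    {"condition": "face", "onset": 12.0, "duration": 2.0, "intensity": 1.0},
    {"condition": "house", "onset": 4.0, "duration": 2.0, "intensity": 1.0},
    {"condition": "face", "onset": 0.0, "duration": 2.0, "intensity": 1.0},
    {"condition": "house", "onset": 8.0, "duration": 2.0, "intensity": 1.0},
]

# Sort by onset time across the entire run.
events = sorted(events, key=lambda ev: ev["onset"])

per_condition_count = defaultdict(int)
for idx, ev in enumerate(events):
    # Index across all events of the run.
    ev["onset_idx"] = idx
    # Index among events of the same condition only.
    ev["conset_idx"] = per_condition_count[ev["condition"]]
    per_condition_count[ev["condition"]] += 1

summary = [(ev["condition"], ev["onset_idx"], ev["conset_idx"]) for ev in events]
```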

get_bold_run_motion_estimates(subj, task, run, fname='bold_moest.txt')

Return the volume-wise motion estimates for a particular BOLD run

Parameters:

subj : int

Subject identifier.

task : int

Task ID (see task_key.txt)

run : int

Run ID.

fname : str

Filename.

Returns:

array

Array of floats – one row per fMRI volume, 6 columns (typically, the first three are translation X, Y, Z in mm and the last three rotation in deg)
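As an illustration of how such an array might be used, the sketch below summarizes motion on plain nested lists. The values are invented, the column order follows the "typical" layout described above and is therefore an assumption, and both helper functions are hypothetical.

```python
# Hypothetical motion estimates: one row per volume, six columns
# (assumed order: translation X, Y, Z in mm, then three rotations in deg).
motion = [
    [0.00, 0.00, 0.00, 0.00, 0.00, 0.00],
    [0.10, -0.05, 0.02, 0.01, 0.00, -0.02],
    [0.30, -0.10, 0.05, 0.02, 0.01, -0.03],
    [0.25, -0.40, 0.04, 0.02, 0.03, -0.01],
]


def max_abs_per_column(rows):
    """Return the largest absolute value found in each column."""
    return [max(abs(v) for v in col) for col in zip(*rows)]


def volumes_exceeding(rows, threshold_mm):
    """Return indices of volumes whose translation exceeds the threshold on any axis."""
    return [
        i for i, row in enumerate(rows)
        if any(abs(v) > threshold_mm for v in row[:3])
    ]


peaks = max_abs_per_column(motion)
flagged = volumes_exceeding(motion, threshold_mm=0.28)
```

With a NumPy array as returned by the method, the same summaries could be computed with array operations instead of list comprehensions.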

get_model_bold_dataset(model_id, subj_id, run_ids=None, preproc_img=None, preproc_ds=None, modelfx=None, stack=True, flavor=None, mask=None, add_fa=None, add_sa=None, **kwargs)

Build a PyMVPA dataset for a model defined in the OpenFMRI dataset

Parameters:

model_id : int

Model ID.

subj_id : int or str or list

Integer, or string ID of the subject whose data shall be considered. Alternatively, a list of IDs can be given and data from all matching subjects will be loaded at once.

run_ids : list, optional

Run IDs to be loaded. If None, all runs are loaded.

preproc_img : callable or None

See get_bold_run_dataset() documentation

preproc_ds : callable or None

If not None, this callable will be called with each run bold dataset as an argument before modelfx is executed. The callable must return a dataset.

modelfx : callable or None

This callable will be called with each run dataset and the respective event list for each run as arguments. In addition, all additional **kwargs of this method will be passed on to this callable. The callable must return a dataset. If None, assign_conditionlabels will be used as a default callable.

stack : boolean

Flag whether to stack all run datasets into a single dataset, or whether to return a list of datasets.

flavor

See get_bold_run_dataset() documentation

mask

See fmri_dataset() documentation.

add_fa

See fmri_dataset() documentation.

add_sa

See get_bold_run_dataset() documentation.

Returns:

Dataset or list

Depending on the stack argument either a single dataset or a list of datasets for all subject/task/run combinations relevant to the model will be returned. In the stacked case the dataset attributes of the returned dataset are taken from the first run dataset, and are assumed to be identical for all of them.
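To give an idea of what a modelfx callable does with a run's event list, the following sketch assigns one condition label per volume. It is a simplified stand-in that works on plain lists rather than datasets: `label_volumes`, the `tr` argument, the `"rest"` default label, and the rule for matching volumes to events are all assumptions, and the behavior of assign_conditionlabels may differ.

```python
def label_volumes(n_volumes, tr, events, no_event_label="rest"):
    """Return one condition label per volume, based on event onsets and durations."""
    labels = [no_event_label] * n_volumes
    for ev in events:
        for i in range(n_volumes):
            # Assumed acquisition time of volume i, in seconds.
            t = i * tr
            if ev["onset"] <= t < ev["onset"] + ev["duration"]:
                labels[i] = ev["condition"]
    return labels


# Hypothetical events, using the keys documented for get_bold_run_model().
events = [
    {"condition": "face", "onset": 2.0, "duration": 4.0},
    {"condition": "house", "onset": 10.0, "duration": 2.0},
]
labels = label_volumes(n_volumes=7, tr=2.0, events=events)
```

A real modelfx would receive the run dataset as its first argument, attach such labels as a sample attribute, and return the dataset.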

get_model_conditions(model)

Return a description of all conditions for a given model

Parameters:

model : int

Model identifier.

Returns:

list(dict)

A list of model conditions is returned, where each item is a dictionary with keys id (numerical condition ID), task (numerical task ID for the task containing this condition), and name (the literal condition name). This information is returned in a list (instead of a dictionary), because the openfmri specification of model conditions contains no unique condition identifier. Conditions are only uniquely described by the combination of task and condition ID.
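Since only the combination of task and condition ID is unique, a lookup table is best keyed by that pair. The sketch below uses a made-up return value with the documented keys; the condition names are hypothetical.

```python
# Hypothetical return value of get_model_conditions(); names are made up.
conditions = [
    {"id": 1, "task": 1, "name": "face"},
    {"id": 2, "task": 1, "name": "house"},
    {"id": 1, "task": 2, "name": "word"},
]

# Condition ID 1 occurs in two tasks, so key the lookup by (task, id).
by_task_and_id = {(c["task"], c["id"]): c["name"] for c in conditions}
```

Keying by the condition ID alone would silently overwrite the entry for task 1 with the one for task 2.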

get_model_contrasts(model)

Return the defined contrasts for a model

Parameters:

model : int

Model identifier.

Returns:

dict(dict)

A dictionary is returned, where each key is a (numerical) task ID and each value is a dictionary with contrast labels (str) as keys and contrast vectors as values.
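The nested dictionary can be traversed as sketched below. The contrast label, the weights, and the per-condition estimates are invented for illustration, and treating a contrast vector as weights over per-condition values is an assumption about how the vectors are meant to be used; `apply_contrast` is a hypothetical helper.

```python
# Hypothetical return value of get_model_contrasts(); labels and weights are made up.
contrasts = {1: {"face_vs_house": [1, -1, 0]}}

# Hypothetical per-condition estimates for task 1, in the same order as the weights.
estimates = [2.0, 0.5, 1.0]


def apply_contrast(weights, values):
    """Return the weighted sum of values under a contrast vector."""
    if len(weights) != len(values):
        raise ValueError("contrast vector and values differ in length")
    return sum(w * v for w, v in zip(weights, values))


# Evaluate every contrast of every task, keyed by (task ID, contrast label).
results = {
    (task, label): apply_contrast(vec, estimates)
    for task, by_label in contrasts.items()
    for label, vec in by_label.items()
}
```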

get_model_descriptions()

Return a dictionary with the models described in the dataset

Dictionary keys are integer model IDs, values are description strings.

Note that the returned dictionary is not necessarily comprehensive. It only reflects the models described in model_key.txt. If a dataset is inconsistently described, get_model_ids() may actually discover more or fewer models than the available model descriptions cover.

get_model_ids()

Return a sorted list of integer IDs for all available models

get_scan_properties()

Return a dictionary with the scan properties listed in scan_key.txt

get_subj_ids()

Return a (sorted) list of IDs for all subjects in the dataset

Standard numerical subject IDs are returned as integer values. All other types of IDs are returned as strings with the ‘sub’ prefix stripped.
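The ID conversion described above can be sketched as follows. This is an illustration of the stated rule, not the library's code: `parse_subject_ids` is a hypothetical helper, the directory names are made up, and placing integers before strings in the result is an assumption, since the text does not say how mixed ID types are ordered.

```python
import os
import tempfile


def parse_subject_ids(basedir):
    """Collect subject IDs from 'sub*' subdirectories of basedir."""
    numeric, other = [], []
    for name in os.listdir(basedir):
        if not (name.startswith("sub") and os.path.isdir(os.path.join(basedir, name))):
            continue
        suffix = name[len("sub"):]
        if suffix.isdigit():
            # Standard numerical IDs become integers.
            numeric.append(int(suffix))
        else:
            # Anything else is kept as a string without the 'sub' prefix.
            other.append(suffix)
    # Assumption: integers first, then strings, each group sorted.
    return sorted(numeric) + sorted(other)


with tempfile.TemporaryDirectory() as basedir:
    for name in ("sub002", "sub001", "subphantom", "models"):
        os.mkdir(os.path.join(basedir, name))
    subj_ids = parse_subject_ids(basedir)
```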

get_task_bold_attributes(task, fname, loadfx, exclude_subjs=None)

Return data attributes for all BOLD data from a specific task.

This function can load arbitrary data from the directories where the relevant BOLD image files are stored. Data sources are described by specifying the file name containing the data in the BOLD directory, and by providing a function that returns the file content in array form. Optionally, data from specific subjects can be skipped.

For example, this function can be used to access motion estimates.

Parameters:

task : int

Task ID (see task_key.txt)

fname : str

Filename.

loadfx : functor

Function that can open the relevant files and return their content as an array. This function is called with the name of the data file as its only argument.

exclude_subjs : list or None

Optional list of subject IDs whose data shall be skipped.

Returns:

list(dict(array))

A list (one item per run) of dictionaries (one item per subject, key is subject ID) of arrays. Each array carries the information loaded from the respective files.
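A loadfx callable only needs to accept a file name and return the file content in array form. The sketch below uses a plain-Python loader that returns nested lists, which is an assumption made to keep the example free of dependencies; in practice a function such as NumPy's loadtxt() would be a more typical choice. `load_float_table` is a hypothetical name, and the `attrs` structure is built by hand to mirror the documented return type.

```python
import os
import tempfile


def load_float_table(fname):
    """Read whitespace-separated numbers; return one list of floats per line."""
    with open(fname) as fobj:
        return [[float(tok) for tok in line.split()] for line in fobj if line.strip()]


# Demonstrate the loader on a temporary file with two volumes.
with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "bold_moest.txt")
    with open(path, "w") as fobj:
        fobj.write("0.0 0.0 0.0 0.0 0.0 0.0\n0.1 0.2 0.3 0.0 0.0 0.0\n")
    table = load_float_table(path)

# Hypothetical return structure: one dict per run, keyed by subject ID.
attrs = [{1: table, 2: table}, {1: table}]

# Number of volumes per subject, for each run.
volumes_per_run = [{subj: len(arr) for subj, arr in run.items()} for run in attrs]
```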

get_task_bold_run_ids(task)

Return a dictionary with run IDs by subjects for a given task

Dictionary keys are subject IDs, values are lists of run IDs.
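A dictionary of this shape makes it easy to check whether all subjects have the same runs. The values below are made up for illustration.

```python
# Hypothetical return value of get_task_bold_run_ids(task).
run_ids_by_subj = {1: [1, 2, 3], 2: [1, 2], 3: [1, 2, 3]}

# Runs that every subject has.
common_runs = sorted(set.intersection(*(set(r) for r in run_ids_by_subj.values())))

# Subjects missing at least one run that some other subject has.
all_runs = set().union(*run_ids_by_subj.values())
incomplete = sorted(s for s, r in run_ids_by_subj.items() if set(r) != all_runs)
```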

get_task_descriptions()

Return a dictionary with the tasks defined in the dataset

Dictionary keys are integer task IDs, values are task description strings.