Full-scale fMRI data analysis using pattern classificationΒΆ
This script demonstrates a complete classification analysis as it could be found in actual research projects. It is a representative example of the functionality accessible from the command line interface – when it was originally release with PyMVPA 2.3. We start by creating a dataset as a collection of information from various sources. fMRI data is loaded from a 4D NIfTI image, while only voxel with non-zero value in a mask image are kept. Each volume is associated with some attribute values that are read from a text file. In addition, a number of mask additional mask images are included as feature attributes, in order to be able to conveniently group voxel/features into ROIs. Lastly, motion estimates are included as per-volume attributes. The resulting dataset is stored in a compressed HDF5 file.
pymvpa2 mkds \
--openfmri-modelbold "$dataroot" 1 1 '25mm' \
--mask "$dataroot"/sub001/masks/25mm/brain.nii.gz \
--add-sa-attr "$dataroot"/../attributes_literal.txt \
--add-vol-attr hoc "$dataroot"/sub001/masks/25mm/hoc.nii.gz \
--add-vol-attr gm "$dataroot"/sub001/masks/25mm/gray.nii.gz \
--add-vol-attr vt "$dataroot"/sub001/masks/25mm/vt.nii.gz \
--add-fsl-mcpar bold_moest.txt \
--hdf5-compression gzip \
-o "$outdir"/bold.hdf5
The describe command generates a terse summary of the dataset.
pymvpa2 describe -i "$outdir"/bold.hdf5
One of the additional feature attributed was down-sampled and aligned
brain parcelation of the Harvard-Oxford cortical atlas. The dump
command
can be used to extract any dataset component and convert it into a variety
of formats – including plain text – for further processing with standard
UNIX console tools.
echo "Number of ROIs in the Harvard-Oxford cortial atlas: "
pymvpa2 dump --fa hoc -f txt -i "$outdir"/bold.hdf5 | sort | uniq | wc -l
The preproc
command offers a few selected preprocessing procedures. Here
we first regress out the motion parameter estimate time-courses that were
included as sample attributes in the dataset. And finally, we perform spectral
filtering using a Butterworth with selected pass and stop bands.
As with every processing step, the result is stored as a dataset in an HDF5
file.
pymvpa2 preproc --poly-detrend 0 \
--detrend-regrs bold_moest.txt_0 bold_moest.txt_1 bold_moest.txt_2 bold_moest.txt_3 bold_moest.txt_4 bold_moest.txt_5 \
--filter-passband 0.005 0.067 \
--filter-stopband 0.0025 0.1 \
--sampling-rate 0.4 \
--chunks chunks \
--hdf5-compression gzip \
-o "$outdir"/bold_mcf.hdf5 \
-i "$outdir"/bold.hdf5
For the subsequent classification analysis we are only interested in two out of all available experimental conditions, and only in voxels that are part of the “VT” mask. Sub-selection of dataset content is supported by indices, as well as a simple expressions.
pymvpa2 select \
--samples-by-attr targets eq face or targets eq house \
--features-by-attr vt gt 0 \
-i "$outdir"/bold_mcf.hdf5 \
-o "$outdir"/faceshouses_inVT.hdf5
The easiest way to perform a classification analysis is the selection of a
pre-crafted classifier instance from the “warehouse”. A variety of data
partitioning schemes is available – we select “leave-one-out” – by default
operating on the chunks
sample attribute of the dataset. Selection of error
functions is possible by name, or by providing a Python script with a custom
implementation. Many commands and options of the command line interface can be
fed and extended with custom Python scripts.
pymvpa2 crossval \
--learner "$clf" \
--partitioner n-1 \
--errorfx mean_match_accuracy \
--avg-datafold-results \
-i "$outdir"/faceshouses_inVT.hdf5 \
-o "$outdir"/xval_faces_vs_houses_inVT.hdf5
After this first ROI-based classification analysis, we are now aiming for a very similar classification analysis that, in contrast, is done in a “searchlight” – a traveling ROI analysis throught the entire brain. Hence we create a new dataset, again with only face and house data samples, but this time including all voxels.
pymvpa2 select \
--samples-by-attr targets eq face or targets eq house \
--hdf5-compression gzip \
-i "$outdir"/bold_mcf.hdf5 \
-o "$outdir"/faceshouses_brain.hdf5
The searchlight
command can be used to compute arbitrary metric in this
fashion, but has built-in support for cross-validated classification analyses
(--payload
). Spherical ROI with a radius of 4 voxels (--neighbors
) will
be generated, centered on gray-matter voxels only (--roi-attr
). The
computation will be parallelized with up to two concurrent processes.
[Note: The --scatter-rois
option is only present to speed up computation
and can be removed in order to obtain a dense result map.]
pymvpa2 --dbg-channel SLC searchlight \
--payload cv \
--neighbors 4 \
--scatter-rois 5 \
--roi-attr gm \
--nproc 2 \
--cv-learner "$clf" \
--cv-partitioner oddeven:chunks \
--cv-errorfx mean_match_accuracy \
--cv-avg-datafold-results \
--cv-permutations 2 \
--hdf5-compression gzip \
-i "$outdir"/faceshouses_brain.hdf5 \
-o "$outdir"/sl_faces_vs_houses_brain.hdf5
Using the dump
command, results can be stored in various formats,
including NIfTI. Saving as NIfTI automatically takes care of projecting
back results into the 3D voxel space.
pymvpa2 dump -s \
-f nifti \
-i "$outdir"/sl_faces_vs_houses_brain.hdf5 \
-o "$outdir"/sl_faces_vs_houses_brain_ACC.nii.gz
pymvpa2 dump --fa null_prob \
-f nifti \
-i "$outdir"/sl_faces_vs_houses_brain.hdf5 \
-o "$outdir"/sl_faces_vs_houses_brain_NP.nii.gz
An alternative to a searchlight with its often arbitrary ROI shapes and boundaries is an iterative ROI analysis – cycling through a number of ROIs that are defined by localizers or an atlas. Here we perform the cross-validated classification analysis shown above on all areas defined in the Harvard-Oxford cortical atlas and present in our data.
hoc_rois=( $(pymvpa2 exec -i "$outdir"/bold.hdf5 -e 'print(" ".join(map(str, dss[0].fa["hoc"].unique)))') )
echo "ROIs of the Harvard-Oxford cortial atlas present in the data: ${hoc_rois[*]}"
for roi in ${hoc_rois[*]}; do
echo -en " ROI $roi\t"
Select corresponding voxels.
pymvpa2 select --features-by-attr hoc eq $roi \
-i "$outdir"/faceshouses_brain.hdf5 \
-o "$outdir"/roi_tmp.hdf5
Report number of voxels present in the given ROI.
nfeatures=$(pymvpa2 exec -i "$outdir"/roi_tmp.hdf5 -e "print(dss[0].nfeatures)")
resultds="${outdir}/xval_faces_vs_houses_inROI${roi}.hdf5"
echo -en "$nfeatures voxels\taccuracy="
And run the cross validation, finally printing the overall accuracy as a result on the console.
pymvpa2 crossval \
--learner "$clf" \
--partitioner n-1 \
--errorfx mean_match_accuracy \
--avg-datafold-results \
-i "$outdir"/roi_tmp.hdf5 \
-o $resultds | awk '/ACC%/{printf "%.2f%%\n", $2}'
[ -z "${MVPA_TESTS_QUICK:-}" ] || break # reserved for testing
done
See also
The full source code of this example is included in the PyMVPA source distribution (doc/examples/cmdline/fmri_analyses.sh
).