Miscellaneous¶
Import helper for PyMVPA misc modules
Managing (Custom) Configurations¶
PyMVPA provides a facility to handle arbitrary configuration settings. This facility can be used to control some aspects of the behavior of PyMVPA itself, as well as to store and query custom configuration items, e.g. to control one’s own analysis scripts.
An instance of this configuration manager is loaded whenever the mvpa2 module is imported. It can be used from any script like this:
>>> from mvpa2 import cfg
By default the config manager reads settings from two config files, if they exist. The first is a file named pymvpa2.cfg located in the user’s home directory. The second is pymvpa2.cfg in the current directory. Settings found in the second file override those in the first.
Both files use the syntax familiar from Windows INI files. Python’s ConfigParser is used to read those files, so the config manager supports whatever this parser can read. A minimal example config file might look like this:
[general]
verbose = 1
It consists of a section general containing a single setting verbose, which is set to 1. PyMVPA recognizes a number of such sections and configuration variables. A full list is shown at the end of this section and is also available in the source package (doc/examples/pymvpa2.cfg).
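The file precedence and INI syntax described above can be tried out with the standard library alone. The following is a minimal sketch, not PyMVPA's own loader: it assumes Python 3, where the parser module is called configparser, and the file names and the helper load_settings are hypothetical stand-ins for the home-directory and current-directory files.

```python
import configparser
import os
import tempfile


def load_settings(paths):
    """Read INI files in the given order; later files override earlier ones."""
    parser = configparser.ConfigParser()
    # read() silently skips files that do not exist
    parser.read(paths)
    return parser


with tempfile.TemporaryDirectory() as tmp:
    # hypothetical stand-ins for the two config file locations
    home_cfg = os.path.join(tmp, "home_pymvpa2.cfg")
    local_cfg = os.path.join(tmp, "local_pymvpa2.cfg")
    missing_cfg = os.path.join(tmp, "does_not_exist.cfg")

    with open(home_cfg, "w") as f:
        f.write("[general]\nverbose = 1\nseed = 12345\n")
    with open(local_cfg, "w") as f:
        f.write("[general]\nverbose = 3\n")

    settings = load_settings([home_cfg, missing_cfg, local_cfg])

# 'verbose' comes from the second file, 'seed' survives from the first
verbose_level = settings.getint("general", "verbose")
seed = settings.getint("general", "seed")
```

Only the options that the second file actually defines are replaced; everything else from the first file stays in effect.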
In addition to configuration files, the config manager also reads settings from special environment variables. Names of such variables have to start with MVPA_, followed by an optional section name and the variable name itself (with _ as delimiter). If no section name is provided, the variable is associated with the section general. Some examples:
MVPA_VERBOSE=1
will become:
[general]
verbose = 1
However, MVPA_VERBOSE_OUTPUT=stdout
becomes:
[verbose]
output = stdout
Any length of variable name is allowed, e.g. MVPA_SEC1_LONG_VARIABLE_NAME=1
becomes:
[sec1]
long variable name = 1
Settings read from environment variables have the highest priority and override settings found in the config files. Therefore environment variables can be used to quickly adjust some setting without having to edit the config files.
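The naming rule for environment variables can be summarized in a few lines of Python. This is only a sketch inferred from the three examples above, not the code PyMVPA uses; the function name env_to_setting is hypothetical.

```python
def env_to_setting(name, prefix="MVPA_"):
    """Map an environment variable name to a (section, option) pair.

    Returns None for names that do not carry the prefix.
    """
    if not name.startswith(prefix):
        return None
    parts = name[len(prefix):].lower().split("_")
    if not parts[0]:
        return None  # nothing after the prefix
    if len(parts) == 1:
        # no section given: the setting goes into 'general'
        return ("general", parts[0])
    # first chunk is the section, the rest forms the option name
    return (parts[0], " ".join(parts[1:]))


examples = {
    name: env_to_setting(name)
    for name in ("MVPA_VERBOSE", "MVPA_VERBOSE_OUTPUT",
                 "MVPA_SEC1_LONG_VARIABLE_NAME", "HOME")
}
```

Note the consequence of this rule: a variable with a single name component always lands in general, so an option in general cannot itself contain a space.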
The config manager can easily be queried from inside scripts. In addition to the interface of Python’s ConfigParser it has a few convenience functions mostly to allow for a default value in case no setting was found. For example:
>>> cfg.getboolean('warnings', 'suppress', default=False)
True
queries the config manager whether warnings should be suppressed (i.e. whether there is a variable suppress in section warnings). If there is no such setting, i.e. neither config files nor environment variables define it, the default value is returned. Please see the documentation of ConfigManager for its full functionality.
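The idea of a lookup with a default can be illustrated with the standard library parser. The sketch below assumes Python 3's configparser and uses a hypothetical helper getboolean_with_default; it mimics the behavior described above rather than reproducing ConfigManager.

```python
import configparser


def getboolean_with_default(parser, section, option, default=None):
    """Return a boolean setting, or `default` if section or option is missing."""
    try:
        return parser.getboolean(section, option)
    except (configparser.NoSectionError, configparser.NoOptionError):
        return default


parser = configparser.ConfigParser()
parser.read_string("[warnings]\nsuppress = yes\n")

# defined in the config: the stored value wins over the default
suppress = getboolean_with_default(parser, "warnings", "suppress", default=False)
# option missing from an existing section: the default is returned
bt_missing = getboolean_with_default(parser, "warnings", "bt", default=False)
# section missing altogether: the default is returned as well
no_section = getboolean_with_default(parser, "nosuch", "anything", default=True)
```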
The source tarball includes an example configuration file (doc/examples/pymvpa2.cfg) with a comprehensive list of settings recognized by PyMVPA itself:
### ### ### ### ### ### ### ### ### ### ### ### ### ### ### ### ### ### ### ##
#
# Example configuration file to be used with PyMVPA
#
#
# See COPYING file distributed along with the PyMVPA package for the
# copyright and license terms.
#
### ### ### ### ### ### ### ### ### ### ### ### ### ### ### ### ### ### ### ##
# This is a comprehensive list of all settings currently recognized by PyMVPA.
# Users can add arbitrary additional settings, both in new and already existing
# sections.
[general]
#debug =
#verbose =
#seed = 12345
[verbose]
# causes the output of __str__ to be truncated to the given number of
# characters
truncate str = 200
# causes the output of __repr__ to be truncated to the given number of
# characters
truncate repr = 200
#
# XXX The previous verbose section and verbose() should better be called info()
# to not collide with the concept of verbosity. We would have messages of
# types: debug, info, warning, error (familiar concepts that all can be
# subsumed as verbosity.
# comma-separated list of handlers, e.g. stdout
#output =
[error]
#output =
[warnings]
# integer
#bt =
# integer
#count =
# comma-separated list of handlers, e.g. stdout
#output =
# Boolean (former: MVPA_NO_WARNINGS)
suppress = no
[debug]
# comma-separated list of handlers, e.g. stdout
#output =
#metrics =
# either to use custom (improved) exception handler to report
# information about pymvpa useful during bug reporting
#wtf = no
#cmdline = no
[examples]
interactive = yes
[svm]
# which SVM implementation to use by default: libsvm or shogun
backend = libsvm
[matplotlib]
# override the default matplotlib's backend
# backend = pdf
[rpy]
# to prevent stalled execution of PyMVPA upon problems in R
# session of R is always responding '1' whenever R asks for input.
# 1 corresponds to "abort (with core dump, if enabled)".
# Unfortunately such callback does not work reliably, thus disabled
# by default
interactive = yes
# Control over warnings spit out by R modules. From help(options) If
# 'warn' is negative all warnings are ignored. If 'warn' is zero
# (the default) warnings are stored until the top-level function
# returns. ... If 'warn' is one, warnings are printed as they occur.
# If 'warn' is two or larger all warnings are turned into errors.
# By default we want no warnings
warn = -1
[externals]
# whether to really raise an exception when an externals test fails _and_
# raising an exception was requested
raise exception = True
# whether to issue warning when an externals test fails _and_
# issuing a warning was requested
issue warning = True
# whether to retest the availability of an external dependency, despite an
# already present (but possibly outdated) test result
retest = no
# options starting with 'have ' indicate the presence or absence of external
# dependencies
#have scipy = no
[tests]
# whether to perform tests where the outcome is not deterministic
labile = yes
# if enabled, the unit tests will not run multiple classifiers on the same
# test, which reduces the time to run a full test significantly.
quick = no
# if enabled, unit tests consuming lots of memory will not automatically run
# as part of the main unittest battery
lowmem = no
# verbosity level of the unittest runner
verbosity = 1
# scale SNR of simulated data more than 1 to reduce failures of labile tests
snr scale = 1.0
[doc]
# whether to enhance the docstrings with base class and state information
pimp docstrings = yes
[data]
# root directory where datasets from pymvpa.org reside. By default this is going
# to be a directory 'data' in the installation path of PyMVPA
#root =
[datasets]
# repr by default prints a complete content of the Dataset so it could
# be inspected or stored as a string. For large datasets it might be
# an overwhelming amount of textual information, so the possible options are:
# full -- default, entire content; str -- use __str__ for __repr__.
# The option takes effect at import time, i.e. changing it has no effect after a
# dataset has already been loaded
repr = full
Progress Tracking¶
There are three types of messages PyMVPA can produce:
- verbose
- regular informative messages about generic actions being performed
- debug
- messages about the progress of computation and manipulation of data structures
- warning
- messages reported by PyMVPA if something unexpected but not critical happens
Redirecting Output¶
By default, PyMVPA prints all types of messages to standard output. They can be redirected to standard error, a file, or a list of multiple such targets by using the environment variable MVPA_X_OUTPUT, where X is either VERBOSE, DEBUG, or WARNING, respectively. E.g.:
export MVPA_VERBOSE_OUTPUT=stdout,/tmp/1 MVPA_WARNING_OUTPUT=/tmp/3 MVPA_DEBUG_OUTPUT=stderr,/tmp/2
would direct verbose messages to standard output as well as to the file /tmp/1, warnings would be stored only in /tmp/3, and debug output would appear on standard error as well as in the file /tmp/2.
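A comma-separated target list such as stdout,/tmp/1 can be turned into writable handlers along the following lines. This is a sketch of the concept, not PyMVPA's implementation: the helpers open_handlers and emit are hypothetical, and treating every name other than stdout and stderr as a file path opened for writing is an assumption.

```python
import os
import sys
import tempfile


def open_handlers(spec):
    """Turn 'stdout,/some/file' into a list of writable file-like objects."""
    handlers = []
    for target in spec.split(","):
        target = target.strip()
        if not target:
            continue
        if target == "stdout":
            handlers.append(sys.stdout)
        elif target == "stderr":
            handlers.append(sys.stderr)
        else:
            # assumption: anything else is a path to a log file
            handlers.append(open(target, "w"))
    return handlers


def emit(handlers, message):
    """Write one message to every handler."""
    for handler in handlers:
        handler.write(message + "\n")
        handler.flush()


with tempfile.TemporaryDirectory() as tmp:
    log_path = os.path.join(tmp, "verbose.log")
    handlers = open_handlers("stdout," + log_path)
    emit(handlers, "msg1")
    handlers[1].close()
    with open(log_path) as f:
        logged = f.read()
```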
PyMVPA output redirection, however, has no effect on the debug output of external libraries if the corresponding debug target is enabled:
- shogun
- debug output (if any of the internal SG_ debug targets is enabled) appears on standard output
- SMLR
- debug output (if the SMLR_ debug target is enabled) appears on standard output
- LIBSVM
- debug output (if the LIBSVM debug target is enabled) appears on standard error
One possible redirection target is an instance of Python’s StringIO class. Such an instance can be added to the handlers and queried later on, so that the collected information can be dumped to a file. This is useful if the output path is only determined at run time, so that verbose or debug output cannot be redirected to it from the start of the program:
>>> import sys
>>> from mvpa2.base import verbose
>>> from StringIO import StringIO
>>> stringout = StringIO()
>>> verbose.handlers = [sys.stdout, stringout]
>>> verbose.level = 3
>>>
>>> verbose(1, 'msg1')
msg1
>>> out_prefix='/tmp/'
>>>
>>> verbose(2, 'msg2')
msg2
>>> # open('%sverbose.log' % out_prefix, 'w').write(stringout.getvalue())
>>> print stringout.getvalue(),
msg1
msg2
>>>
Verbose Messages¶
Verbose messages are primarily intended to inform a user of PyMVPA about the progress of their scripts. Such a message is printed if its level, given as the first argument of the verbose function call, does not exceed the configured verbosity level. There are three easy ways to specify the verbosity level:
- command line: you can use opt.verbose, a precrafted command line option, to make the level adjustable from your script (see examples)
- environment variable MVPA_VERBOSE
- code: verbose.level property
The following verbosity levels are supported:
| Level | Meaning |
|---|---|
| 0 | nothing besides errors |
| 1 | high level stuff – top level operation or file operations |
| 2 | cmdline handling |
| 3 | n.a. |
| 4 | computation/algorithm relevant thing |
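The level-based filtering can be pictured as a small gate in front of the output handlers. The class below is a hypothetical sketch, not the verbose object shipped with PyMVPA; in particular, the inclusive comparison (a message passes when its level is less than or equal to the configured level) is an assumption.

```python
import io


class LevelGate:
    """Forward a message to the handlers only if its level is low enough."""

    def __init__(self, level=0, handlers=None):
        self.level = level
        self.handlers = handlers if handlers is not None else []

    def __call__(self, msglevel, message):
        # assumption: messages at exactly the configured level are printed
        if msglevel <= self.level:
            for handler in self.handlers:
                handler.write(message + "\n")


buffer = io.StringIO()
gate = LevelGate(level=2, handlers=[buffer])

gate(1, "loading dataset")        # high level operation: printed
gate(2, "parsing command line")   # cmdline handling: printed
gate(4, "inner loop detail")      # algorithm detail: filtered out

captured = buffer.getvalue()
```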
Warning Messages¶
Warning messages are reported by PyMVPA if something unexpected but not critical happens. By default each warning is printed just once per occasion, i.e. once per piece of code where it is called. The following environment variables control the behavior of warnings:
- MVPA_WARNINGS_COUNT (integer)
- controls for how many invocations of a specific warning it gets printed (the default is 1, i.e. once). A negative count results in all invocations being printed, and a value of 0 suppresses the warnings
- MVPA_WARNINGS_SUPPRESS
- analogous to MVPA_WARNINGS_COUNT=0 in its resulting behavior
- MVPA_WARNINGS_BT (integer)
- controls up to how many lines of traceback are printed for the warnings
In Python code, invoking warning with the argument bt=True enforces printout of the traceback even though warning tracebacks are disabled by default.
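The counting rule for warnings (print up to N times per call site, negative means always, zero means never) can be sketched as follows. The class CountedWarnings is hypothetical, and identifying a call site by an explicit string is a simplification made for this example.

```python
from collections import defaultdict


class CountedWarnings:
    """Print each warning at most `max_count` times per call site."""

    def __init__(self, max_count=1):
        self.max_count = max_count
        self.seen = defaultdict(int)   # invocations per call site
        self.printed = []              # messages that passed the filter

    def warn(self, site, message):
        self.seen[site] += 1
        # negative count: never filter; otherwise compare against the limit
        if self.max_count < 0 or self.seen[site] <= self.max_count:
            self.printed.append(message)
            return True
        return False


once = CountedWarnings(max_count=1)
once_results = [once.warn("clf.py:42", "unstable fit") for _ in range(3)]

always = CountedWarnings(max_count=-1)
always_results = [always.warn("clf.py:42", "unstable fit") for _ in range(3)]

never = CountedWarnings(max_count=0)
never_results = [never.warn("clf.py:42", "unstable fit") for _ in range(3)]
```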
Debug Messages¶
Debug messages are used to track the progress of any computation inside PyMVPA while the code is run by Python without optimization (i.e. without the -O switch to python). They are selected not by level but by an id that is usually specific to a particular PyMVPA routine. For example, the RFEC id causes debugging information about Recursive Feature Elimination calls to be printed (see the base module sources for the list of all ids, or print the debug.registered property).
Analogous to the verbosity level, there are three easy ways to specify the set of ids to be enabled (reported):
- command line: you can use optDebug, a precrafted command line option, to provide it from your script (see examples). If optDebug is used and -d list is given on the command line, PyMVPA will print out the list of known ids.
- environment: the variable MVPA_DEBUG can contain a comma-separated list of ids or Python regular expressions to match multiple ids. Thus specifying MVPA_DEBUG=CLF.* would enable all ids which start with CLF, and MVPA_DEBUG=.* would enable all known ids.
- code: debug.active property (e.g. debug.active = [ 'RFEC', 'CLF' ])
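How a comma-separated list of ids and regular expressions selects debug targets can be sketched with the re module. The function select_debug_ids is hypothetical, the id CLF_EXTRA is invented for the example, and requiring each pattern to match the whole id (re.fullmatch) is an assumption about the matching rule.

```python
import re


def select_debug_ids(spec, registered):
    """Return the registered ids matched by any pattern in `spec`."""
    patterns = [p.strip() for p in spec.split(",") if p.strip()]
    return [
        debug_id
        for debug_id in registered
        # assumption: a pattern has to match the complete id
        if any(re.fullmatch(pattern, debug_id) for pattern in patterns)
    ]


# ids taken from this page, plus the made-up CLF_EXTRA
registered = ["RFEC", "CLF", "CLF_EXTRA", "CHECK_RETRAIN", "RANDOM", "LIBSVM"]

clf_ids = select_debug_ids("CLF.*", registered)
all_ids = select_debug_ids(".*", registered)
two_ids = select_debug_ids("RFEC,RANDOM", registered)
```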
Besides printing debug messages, it is also possible to print some metric. You can define new metrics or select predefined ones:
- vmem
- (Linux specific): amount of virtual memory consumed by the task
- pid
- (Linux specific): PID of the process
- reltime
- How many seconds passed since previous debug printout
- asctime
- Time stamp
- tb
- Traceback (module1:line_number1[,line_number2...]>module2:line_number..) where this debug statement was requested
- tbc
- Concise traceback printout – prefix common with the previous invocation is replaced with ...
To enable a list of metrics, use the MVPA_DEBUG_METRICS environment variable to list the desired metric names, comma-separated. If ALL is provided, all metrics are enabled.
As mentioned earlier, debug messages are printed only in a non-optimized Python invocation. This eliminates any slowdown introduced by such ‘debugging’ output, which might otherwise appear at computational bottlenecks in the code.
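In Python, the built-in constant __debug__ is True in a normal invocation and False when the interpreter is started with -O, and code guarded by it is dropped entirely in optimized mode. The sketch below shows that general mechanism with a hypothetical debug function; it is meant as an illustration of the idea, not as a copy of PyMVPA's internals.

```python
import io

log = io.StringIO()


def debug(debug_id, message, active=("RANDOM",)):
    """Hypothetical debug printer: only ids listed in `active` are reported."""
    if debug_id in active:
        log.write("[%s] DBG: %s\n" % (debug_id, message))


# With 'python -O' this whole block is skipped, so even the cost of
# building the message string disappears.
if __debug__:
    debug("RANDOM", "Seeding RNG with 123")

output = log.getvalue()
```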
Some of the debug ids are defined to facilitate additional checking of the validity of the analysis. Such debug ids are prefixed with CHECK_. E.g. the CHECK_RETRAIN id would cause additional checking of the data in the retraining phase. Such additional testing might spot bugs in the internal logic, and is thus enabled when the full test suite is run.
PyMVPA Status Summary¶
When reporting bugs, it is advised to provide information about the operating system/environment and the availability of PyMVPA externals. Please use wtf() to collect such information and include it with the bug report.
Alternatively, the same printout can be obtained automatically upon an unhandled exception if the environment variable MVPA_DEBUG_WTF is set.
Additional Little Helpers¶
Random Number Generation¶
To facilitate reproducible troubleshooting, a seed value for the random number generator of NumPy can be provided in debug mode (python is called without -O) via the environment variable MVPA_SEED. Otherwise it gets seeded with a random integer, which can be displayed with the debug id RANDOM, e.g.:
> MVPA_SEED=123 MVPA_DEBUG=RANDOM python test_clf.py
[RANDOM] DBG: Seeding RNG with 123
...
> MVPA_DEBUG=RANDOM python test_clf.py
[RANDOM] DBG: Seeding RNG with 1447286079
...
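The seeding logic can be sketched as follows. To keep the example free of dependencies it uses the standard library random module instead of NumPy, which is a deliberate substitution; the function seed_from_env and the upper bound of the fallback seed are assumptions made for this illustration.

```python
import random


def seed_from_env(environ, rng_factory=random.Random):
    """Pick the seed from MVPA_SEED if present, otherwise draw a random one."""
    raw = environ.get("MVPA_SEED")
    if raw is not None:
        seed = int(raw)
    else:
        # assumption: any non-negative 31-bit integer serves as a fallback seed
        seed = random.SystemRandom().randint(0, 2**31 - 1)
    return seed, rng_factory(seed)


# the same explicit seed gives the same sequence of draws
seed_a, rng_a = seed_from_env({"MVPA_SEED": "123"})
seed_b, rng_b = seed_from_env({"MVPA_SEED": "123"})
draws_a = [rng_a.random() for _ in range(3)]
draws_b = [rng_b.random() for _ in range(3)]

# without the variable a fresh seed is chosen; reporting it lets a run be repeated
seed_c, _ = seed_from_env({})
```

Reporting the randomly chosen seed, as the RANDOM debug id does, is what makes an otherwise unseeded run reproducible after the fact.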
Unittests at a Grasp¶
To quickly run through all unittests without making them test multiple classifiers (implemented with sweeparg), define the environment variable MVPA_TESTS_QUICK, e.g.:
> MVPA_WARNINGS_SUPPRESS=no MVPA_TESTS_QUICK=yes python test_clf.py
...............
----------------------------------------------------------------------
Ran 15 tests in 0.845s
Some tests are not 100% deterministic as they operate on random data (e.g. the performance of a randomly initialized classifier). Therefore, in some cases, specific unit tests might fail when running the full test battery. To exclude these test cases (and only those where non-deterministic behavior is inherent), one can set the MVPA_TESTS_LABILE configuration to ‘off’.
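Excluding non-deterministic tests based on a setting can be expressed with the skip decorators of the unittest module. The sketch below is a generic illustration, not PyMVPA's test infrastructure: the helpers flag, build_suite and run_suite are hypothetical, and the set of strings accepted as "true" is an assumption.

```python
import unittest

_TRUE_VALUES = {"1", "yes", "true", "on"}


def flag(environ, name, default):
    """Interpret an environment-style setting as a boolean."""
    raw = environ.get(name)
    if raw is None:
        return default
    return raw.strip().lower() in _TRUE_VALUES


def build_suite(environ):
    run_labile = flag(environ, "MVPA_TESTS_LABILE", default=True)

    class ExampleTests(unittest.TestCase):
        def test_deterministic(self):
            self.assertEqual(sum([1, 2, 3]), 6)

        @unittest.skipUnless(run_labile, "labile tests are disabled")
        def test_labile(self):
            # stands in for a test that depends on random data
            self.assertTrue(True)

    return unittest.defaultTestLoader.loadTestsFromTestCase(ExampleTests)


def run_suite(environ):
    """Run the suite and report (number skipped, success flag)."""
    result = unittest.TestResult()
    build_suite(environ).run(result)
    return len(result.skipped), result.wasSuccessful()


skipped_off, ok_off = run_suite({"MVPA_TESTS_LABILE": "off"})
skipped_default, ok_default = run_suite({})
```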
FSL Bindings¶
PyMVPA contains a few little helpers to make interfacing with FSL easier. The purpose of these helpers is to increase the efficiency of an analysis by (re)using useful information that is already available from some FSL output. FSL usually stores most interesting information in the NIfTI format, so it can easily be imported into PyMVPA using NiBabel. However, some information is stored in text files, e.g. estimated motion correction parameters and FEAT’s three-column custom EV files. PyMVPA provides import and export helpers for both of them (among others, such as a MELODIC results import helper).
Here is an example of how the McFlirt parameter output can be used to perform motion-aware data detrending:
>>> from os import path
>>> import numpy as np
>>>
>>> # some dummy dataset
>>> from mvpa2.datasets import Dataset
>>> ds = Dataset(samples=np.random.normal(size=(19, 3)))
>>>
>>> # load motion correction output
>>> from mvpa2.misc.fsl.base import McFlirtParams
>>> mc = McFlirtParams(path.join('mvpa2', 'data', 'bold_mc.par'))
>>>
>>> # simple plot using pylab (use pylab.show() or pylab.savefig()
>>> # afterwards)
>>> mc.plot()
>>>
>>> # merge the correction parameters into the dataset itself
>>> for param in mc:
...     ds.sa['mc_' + param] = mc[param]
>>>
>>> # detrend some dataset with mc params as additional regressors
>>> from mvpa2.mappers.detrend import poly_detrend
>>> poly_detrend(ds, opt_regs=['mc_x', 'mc_y', 'mc_z',
... 'mc_rot1', 'mc_rot2', 'mc_rot3'])
All FSL bindings are located in the mvpa2.misc.fsl module.
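To give an idea of what a motion parameter import involves, here is a dependency-free sketch that parses text in the style of a McFlirt .par file into named columns. It does not reproduce McFlirtParams: the function parse_par is hypothetical, the sample numbers are made up, and the column order (three rotations followed by three translations) is an assumption that should be checked against the FSL documentation.

```python
# assumed column order: three rotations, then three translations
MC_COLUMNS = ("rot1", "rot2", "rot3", "x", "y", "z")


def parse_par(text):
    """Parse whitespace-separated motion parameters into a dict of lists."""
    params = {name: [] for name in MC_COLUMNS}
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue  # ignore blank lines
        if len(fields) != len(MC_COLUMNS):
            raise ValueError("expected %d columns, got %d"
                             % (len(MC_COLUMNS), len(fields)))
        for name, value in zip(MC_COLUMNS, fields):
            params[name].append(float(value))
    return params


# made-up sample with one row per volume
sample = (
    "0.001  0.002  -0.003  0.10  -0.20  0.30\n"
    "0.002  0.001  -0.002  0.12  -0.18  0.31\n"
)
mc_params = parse_par(sample)
n_volumes = len(mc_params["x"])
```

Each resulting column has one value per volume, which is what allows such parameters to be attached as per-sample attributes and used as additional regressors.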