mvpa2.clfs.transerror.BayesConfusionHypothesis
class mvpa2.clfs.transerror.BayesConfusionHypothesis(alpha=None, labels_attr='predictions', space='hypothesis', prior_Hs=None, log=True, postprob=True, hypotheses=None, **kwargs)
Bayesian hypothesis testing on confusion matrices.
For multi-class classification, a single accuracy value is often not a meaningful performance measure, or is at least hard to interpret. This class allows for convenient Bayesian hypothesis testing of confusion matrices. It computes the likelihood of discriminability for any partition of classes given a confusion matrix.
The returned dataset contains at least one feature (the log likelihood of a hypothesis) and as many samples as (possible) partitions of classes. The actual partition configurations are stored in a sample attribute of nested lists. The top-level list contains discriminable groups of classes, whereas the second level lists contain groups of classes that cannot be discriminated under a given hypothesis. For example:
[[0, 1], [2], [3, 4, 5]]
This hypothesis represents the state where classes 0 and 1 cannot be distinguished from each other, but 0 and 1 together can be distinguished from class 2 and from the group of 3, 4, and 5, where the classes within the latter group cannot be distinguished from one another.
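To make the nested-list notation concrete, the following is a minimal sketch that enumerates every partition of a small set of classes in the same format. The helper `partitions` is hypothetical and written only for illustration; it is not part of mvpa2 and is not claimed to match how the class enumerates hypotheses internally.

```python
def partitions(items):
    """Yield every partition of `items` as a list of groups (lists).

    Hypothetical helper for illustration; not part of mvpa2.
    """
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for sub in partitions(rest):
        # Place `first` into each existing group in turn ...
        for i in range(len(sub)):
            yield sub[:i] + [[first] + sub[i]] + sub[i + 1:]
        # ... or open a new group that contains only `first`.
        yield [[first]] + sub


# All hypotheses for three classes labeled 0, 1, 2.
hypotheses = list(partitions([0, 1, 2]))
for h in hypotheses:
    print(h)

# The number of partitions grows quickly with the number of classes.
n_six = sum(1 for _ in partitions(list(range(6))))
```

Each printed entry reads the same way as the example above: top-level groups are discriminable from one another, while the classes inside a group are not.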
This algorithm is based on
Olivetti, E., Greiner, S. and Avesani, P. (2012). Testing for Information with Brain Decoding. In: Pattern Recognition in NeuroImaging (PRNI), International Workshop on.
Notes
Available conditional attributes:
calling_time+ : Time (in seconds) it took to call the node
raw_results : Computed results before invoking postproc. Stored only if postproc is not None.
(Conditional attributes enabled by default are suffixed with +)
Attributes
descr
Description of the object if any
pass_attr
Which attributes of the dataset or self.ca to pass into result dataset upon call
postproc
Node to perform post-processing of results
space
Processing space name of this node
Methods
__call__(ds[, _call_kwargs])
The default implementation calls _precall(), _call(), and finally returns the output of _postcall().
generate(ds)
Yield processing results.
get_postproc()
Returns the post-processing node or None.
get_space()
Query the processing space name of this node.
reset()
set_postproc(node)
Assigns a post-processing node
set_space(name)
Set the processing space name of this node.
Parameters:
alpha : array
Bayesian hyper-prior alpha (in a multivariate-Dirichlet sense)
labels_attr : str
Name of the sample attribute in the input dataset that contains the class labels corresponding to the confusion matrix rows. If an attribute with this name is not found, hypotheses will be reported based on confusion table row/column numbers instead of their corresponding labels. If such an attribute is found in the input dataset, any hypotheses specification must also use literal labels.
space : str
Name of the sample attribute in the output dataset where the hypothesis partition configurations will be stored.
prior_Hs : array
Vector of priors, one per hypothesis. Typically used in conjunction with an explicit set of possible hypotheses (see hypotheses). If None, a flat prior is assumed.
log : bool
Whether to return values (likelihood or posterior probabilities) in log scale to mitigate numerical precision problems with near-zero probabilities.
postprob : bool
Whether to return posterior probabilities p(hypothesis|confusion) instead of likelihood(confusion|hypothesis).
hypotheses : list
List of possible hypotheses. XXX needs work on how to specify them.
enable_ca : None or list of str
Names of the conditional attributes which should be enabled in addition to the default ones
disable_ca : None or list of str
Names of the conditional attributes which should be disabled
pass_attr : str, list of str|tuple, optional
Additional attributes to pass on to an output dataset. Attributes can be taken from all three attribute collections of an input dataset (sa, fa, a; see Dataset.get_attr()), or from the collection of conditional attributes (ca) of a node instance. Corresponding collection name prefixes should be used to identify attributes, e.g. 'ca.null_prob' for the conditional attribute 'null_prob', or 'fa.stats' for the feature attribute 'stats'. In addition to a plain attribute identifier, it is possible to use a tuple to trigger more complex operations. The first tuple element is the attribute identifier, as described before. The second element is the name of the target attribute collection (sa, fa, or a). The third element is the axis number of a multidimensional array that shall be swapped with the current first axis. The fourth element is a new name that shall be used for an attribute in the output dataset. Example: ('ca.null_prob', 'fa', 1, 'pvalues') will take the conditional attribute 'null_prob' and store it as a feature attribute 'pvalues', while swapping the first and second axes. Simplified instructions can be given by leaving out consecutive tuple elements starting from the end.
postproc : Node instance, optional
Node to perform post-processing of results. This node is applied in __call__() to perform a final processing step on the dataset that is about to be returned. If None, nothing is done.
descr : str
Description of the instance
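The relationship between the log, postprob, and prior_Hs parameters described above can be sketched with plain Python. This is an illustration under stated assumptions, not the class's actual implementation: the function `log_posterior` and the example log-likelihood values are hypothetical. It turns per-hypothesis log-likelihoods into log posterior probabilities with Bayes' rule, using a flat prior when none is given, and stays in log scale so that near-zero probabilities do not underflow.

```python
import math


def log_posterior(log_likelihoods, priors=None):
    """Return log p(hypothesis | confusion) for each hypothesis.

    Hypothetical helper for illustration; not part of mvpa2.
    `priors` plays the role of prior_Hs; None means a flat prior.
    """
    n = len(log_likelihoods)
    if priors is None:
        priors = [1.0 / n] * n
    # log of likelihood * prior for every hypothesis
    log_joint = [ll + math.log(p) for ll, p in zip(log_likelihoods, priors)]
    # log-sum-exp: subtract the maximum before exponentiating
    m = max(log_joint)
    log_evidence = m + math.log(sum(math.exp(v - m) for v in log_joint))
    return [v - log_evidence for v in log_joint]


# Made-up log-likelihoods for three hypotheses.
log_liks = [-1050.0, -1052.0, -1060.0]

# Exponentiating directly underflows to zero in double precision ...
naive = [math.exp(v) for v in log_liks]

# ... whereas normalizing in log scale keeps the ranking and the mass.
log_post = log_posterior(log_liks)
post = [math.exp(v) for v in log_post]
```

Working in log scale is what allows hypotheses with extremely small likelihoods to be compared at all; the values in `naive` are indistinguishable, while `post` sums to one and preserves the ordering.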