Out-Of-Distribution (OOD) detection
Starting from a trained neural classifier, it’s possible to fit one of the models below to help distinguish between in-distribution and out-of-distribution inputs.
- class fortuna.ood_detection.mahalanobis.MalahanobisOODClassifier(*args, **kwargs)
The pre-trained features of a softmax neural classifier \(f(\mathbf{x})\) are assumed to follow a class-conditional Gaussian distribution with a tied covariance matrix \(\mathbf{\Sigma}\):
\[\mathbb{P}(f(\mathbf{x})|y=k) = \mathcal{N}(f(\mathbf{x})|\mu_k, \mathbf{\Sigma})\] for all \(k \in \{1,\dots,K\}\), where \(K\) is the number of classes.
The confidence score \(M(\mathbf{x})\) for a new test sample \(\mathbf{x}\) is obtained by computing the max (squared) Mahalanobis distance between \(f(\mathbf{x})\) and the fitted class-wise Gaussians.
- Parameters:
num_classes (int) – The number of classes for the in-distribution classification task.
- property cov
- Returns:
The shared covariance matrix with shape (d, d), where d is the embedding size.
- Return type:
Array
- fit(embeddings, targets)
Fits class-conditional multivariate Gaussians to the training data, with class-specific means and a shared covariance matrix.
- Parameters:
embeddings (Array) – The embeddings of shape (n, d), where n is the number of training samples and d is the embedding size.
targets (Array) – An array of length n containing, for each input sample, its ground-truth label.
- Return type:
None
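As a rough illustration of what fitting class-specific means with a shared covariance involves, here is a minimal NumPy sketch. It is not the library's implementation: the function name `fit_tied_gaussians` is hypothetical, and normalising the pooled covariance by the total number of samples n (the maximum-likelihood estimate) is an assumption. It also assumes every class has at least one training sample.

```python
import numpy as np


def fit_tied_gaussians(embeddings, targets, num_classes):
    """Hypothetical helper: class means (K, d) and one shared covariance (d, d)."""
    embeddings = np.asarray(embeddings, dtype=float)
    targets = np.asarray(targets)
    n, d = embeddings.shape
    means = np.zeros((num_classes, d))
    cov = np.zeros((d, d))
    for k in range(num_classes):
        class_embeddings = embeddings[targets == k]
        means[k] = class_embeddings.mean(axis=0)
        # Center each class on its own mean, then pool the scatter matrices.
        centered = class_embeddings - means[k]
        cov += centered.T @ centered
    # Assumption: divide by n (maximum-likelihood estimate of the tied covariance).
    return means, cov / n


# Toy data: 200 embeddings of size 3 spread evenly over 4 classes.
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(200, 3))
targets = np.arange(200) % 4
means, cov = fit_tied_gaussians(embeddings, targets, num_classes=4)
print(means.shape, cov.shape)  # (4, 3) (3, 3)
```

The resulting shapes match the ones documented for the mean and cov properties: (num_classes, d) and (d, d).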
- property mean
- Returns:
A matrix of shape (num_classes, d), where num_classes is the number of classes in the in-distribution classification task. The i-th row of the matrix is the mean of the fitted Gaussian distribution for the i-th class.
- Return type:
Array
- score(embeddings)
The confidence score \(M(\mathbf{x})\) for a new test sample \(\mathbf{x}\) is obtained by computing the max (squared) Mahalanobis distance between \(f(\mathbf{x})\) and the fitted class-wise Gaussians.
A high score signals that the test sample \(\mathbf{x}\) is identified as OOD.
- Parameters:
embeddings (Array) – The embeddings of shape (n, d), where n is the number of test samples and d is the embedding size.
- Returns:
An array of scores with length n.
- Return type:
Array
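The squared Mahalanobis distance between an embedding and class k is \((f(\mathbf{x}) - \mu_k)^\top \mathbf{\Sigma}^{-1} (f(\mathbf{x}) - \mu_k)\). The sketch below computes that quantity for every sample and every class with NumPy. The function name `squared_mahalanobis_per_class` is hypothetical and this is not the library's code; the final reduction over the class axis that turns the (n, K) matrix into n scores is deliberately left out.

```python
import numpy as np


def squared_mahalanobis_per_class(embeddings, means, cov):
    """Hypothetical helper: (n, K) matrix of squared Mahalanobis distances."""
    embeddings = np.asarray(embeddings, dtype=float)
    means = np.asarray(means, dtype=float)
    # Differences between every embedding and every class mean: shape (n, K, d).
    diffs = embeddings[:, None, :] - means[None, :, :]
    precision = np.linalg.inv(cov)
    # Quadratic form diff^T @ precision @ diff for each (sample, class) pair.
    return np.einsum("nkd,de,nke->nk", diffs, precision, diffs)


# Toy example: two classes in 2D with an identity covariance.
means = np.array([[0.0, 0.0], [4.0, 0.0]])
cov = np.eye(2)
test_embeddings = np.array([[0.0, 0.0], [4.0, 3.0]])
distances = squared_mahalanobis_per_class(test_embeddings, means, cov)
# Sample 0 -> [0, 16], sample 1 -> [25, 9]
print(distances)
```

With an identity covariance the values reduce to squared Euclidean distances, which makes the toy output easy to check by hand.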
- class fortuna.ood_detection.ddu.DeepDeterministicUncertaintyOODClassifier(*args, **kwargs)
A Gaussian Mixture Model \(q(\mathbf{x}, z)\) with a single Gaussian mixture component per class \(k \in \{1,\dots,K\}\) is fit after training. Each class component is fit by computing the empirical mean \(\mathbf{\hat{\mu}_k}\) and covariance matrix \(\mathbf{\hat{\Sigma}_k}\) of the feature vectors \(f(\mathbf{x})\).
The confidence score \(M(\mathbf{x})\) for a new test sample is obtained by computing the negative marginal likelihood of the feature representation.
- Parameters:
num_classes (int) – The number of classes for the in-distribution classification task.
- property cov
- Returns:
The per-class covariance matrices of the fitted GMM. The shape of the array is (num_classes, d, d), where num_classes is the number of target classes in the in-distribution classification task and d is the embedding size.
- Return type:
Array
- fit(embeddings, targets)
Fits one multivariate Gaussian per class to the training data, each with its own mean and covariance matrix.
- Parameters:
embeddings (Array) – The embeddings of shape (n, d), where n is the number of training samples and d is the embedding size.
targets (Array) – An array of length n containing, for each input sample, its ground-truth label.
- Return type:
None
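A per-class fit of this kind can be sketched in NumPy as follows. This is an illustration, not the library's implementation: the function name `fit_class_gaussians` is hypothetical, the covariance is normalised by the class count (maximum-likelihood estimate) as an assumption, and the mixture weights \(q(y)\) are assumed to be the empirical class frequencies, which the text above does not state. Each class is assumed to have at least one sample.

```python
import numpy as np


def fit_class_gaussians(embeddings, targets, num_classes):
    """Hypothetical helper: per-class means, covariances and class frequencies."""
    embeddings = np.asarray(embeddings, dtype=float)
    targets = np.asarray(targets)
    n, d = embeddings.shape
    means = np.zeros((num_classes, d))
    covs = np.zeros((num_classes, d, d))
    priors = np.zeros(num_classes)
    for k in range(num_classes):
        class_embeddings = embeddings[targets == k]
        n_k = len(class_embeddings)
        means[k] = class_embeddings.mean(axis=0)
        centered = class_embeddings - means[k]
        # Assumption: maximum-likelihood covariance, normalised by the class count.
        covs[k] = centered.T @ centered / n_k
        # Assumption: mixture weight q(y = k) is the empirical class frequency.
        priors[k] = n_k / n
    return means, covs, priors


# Toy data: 300 embeddings of size 2 spread evenly over 3 classes.
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(300, 2))
targets = np.arange(300) % 3
means, covs, priors = fit_class_gaussians(embeddings, targets, num_classes=3)
print(means.shape, covs.shape)  # (3, 2) (3, 2, 2)
```

The shapes agree with those documented for the mean and cov properties: (num_classes, d) and (num_classes, d, d).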
- property mean: Union[Array, ndarray]
- Returns:
The per-class mean vector of the fitted GMM. The shape of the array is (num_classes, d) where num_classes is the number of target classes in the in-distribution classification task and d is the embedding size.
- Return type:
Array
- score(embeddings)
The confidence score \(M(\mathbf{x})\) for a new test sample \(\mathbf{x}\) is obtained by computing the negative marginal likelihood of the feature representation \(-q(f(\mathbf{x})) = - \sum\limits_{k}q(f(\mathbf{x})|y=k) q(y=k)\).
A high score signals that the test sample \(\mathbf{x}\) is identified as OOD.
- Parameters:
embeddings (Array) – The embeddings of shape (n, d), where n is the number of test samples and d is the embedding size.
- Returns:
An array of scores with length n.
- Return type:
Array
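The negative marginal likelihood \(-\sum_k q(f(\mathbf{x})|y=k)\, q(y=k)\) can be sketched in NumPy as below. The function names and the `priors` argument are hypothetical, and this is not the library's code. The sum is evaluated in log space for numerical stability and exponentiated only at the end; that is a design choice of this sketch, not something the text above specifies.

```python
import numpy as np


def gaussian_log_density(x, mean, cov):
    """Log density of N(mean, cov) at each row of x (shape (n, d))."""
    d = mean.shape[0]
    diff = x - mean
    # Solve cov @ s = diff^T instead of forming the inverse explicitly.
    solved = np.linalg.solve(cov, diff.T).T
    maha = np.sum(diff * solved, axis=1)
    _, logdet = np.linalg.slogdet(cov)
    return -0.5 * (d * np.log(2.0 * np.pi) + logdet + maha)


def negative_marginal_likelihood(embeddings, means, covs, priors):
    """Hypothetical helper: -sum_k q(f(x) | y=k) q(y=k) for each embedding."""
    embeddings = np.asarray(embeddings, dtype=float)
    # log q(f(x) | y=k) + log q(y=k), stacked into shape (n, K).
    log_joint = np.stack(
        [
            np.log(priors[k]) + gaussian_log_density(embeddings, means[k], covs[k])
            for k in range(len(priors))
        ],
        axis=1,
    )
    # Log-sum-exp over the class axis.
    shift = log_joint.max(axis=1, keepdims=True)
    log_marginal = shift[:, 0] + np.log(np.exp(log_joint - shift).sum(axis=1))
    return -np.exp(log_marginal)


# Toy mixture: two unit-covariance components with equal weights.
means = np.array([[0.0, 0.0], [5.0, 5.0]])
covs = np.stack([np.eye(2), np.eye(2)])
priors = np.array([0.5, 0.5])
test_embeddings = np.array([[0.0, 0.0], [20.0, -20.0]])
scores = negative_marginal_likelihood(test_embeddings, means, covs, priors)
print(scores)
```

The first embedding sits on a component mean and receives the lower score; the second lies far from both components, so its score is higher (closer to zero), in line with the convention that a high score signals OOD.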