nikita-petrashen/openmax
This repo contains my implementation of the OpenMax module (ref). I reused some code from here.

Differences from other implementations I have found out there:

  1. This code can be easily plugged into your pipeline.
  2. This repo is documented.
  3. This implementation is optimized (vectorized where possible).

What's it for?

  • OpenMax is a method for Open Set Classification: you have K known classes plus samples from classes the model has not seen during training, which must be classified as "unknown".
  • OpenMax is applied only at inference time, on top of a network that was trained with, for example, a standard SoftMax classifier.
  • Its principle is based on Extreme Value Theory.
  • The main idea is to build a probability model that estimates how likely it is that a sample belongs to one of the known classes, based on the distance from the sample's embedding to the class centroids estimated from the training data. Using this probability model, OpenMax recalibrates the test samples' logits and adds a (K+1)-th score that accounts for the "unknown" class.

How OpenMax works

Phase 1: Weibull fitting on the training data

Input: embeddings of correctly classified samples for each class.

  1. Compute centroids for each class
  2. For each sample, compute the distance between its embedding and the respective centroid
  3. Take k farthest samples for each class
  4. For each class, fit a Weibull distribution on these k largest distances
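The fitting phase above can be sketched as follows. This is a minimal illustration, not the repo's actual API: the function name, `tail_size` parameter, and the use of `scipy.stats.weibull_min` are my assumptions.

```python
import numpy as np
from scipy.stats import weibull_min


def fit_weibull_models(embeddings, labels, num_classes, tail_size=20):
    """Sketch of Phase 1 (hypothetical helper, not the repo's API).

    `embeddings`: correctly classified training embeddings, shape (N, D).
    `labels`: their class indices, shape (N,).
    Returns per-class centroids and per-class Weibull parameters.
    """
    centroids, models = [], []
    for c in range(num_classes):
        class_emb = embeddings[labels == c]
        # 1. Compute the class centroid.
        centroid = class_emb.mean(axis=0)
        # 2. Distance from each sample's embedding to its class centroid.
        dists = np.linalg.norm(class_emb - centroid, axis=1)
        # 3. Take the tail_size farthest samples (largest distances).
        tail = np.sort(dists)[-tail_size:]
        # 4. Fit a Weibull distribution on these tail distances.
        shape, loc, scale = weibull_min.fit(tail)
        centroids.append(centroid)
        models.append((shape, loc, scale))
    return np.stack(centroids), models
```

The tail size `tail_size` (the paper's eta) controls how many extreme distances the Weibull model is fit on.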

Phase 2: Logit recalibration on the test data

Input: embeddings and logits of the test samples, precomputed Weibull models and class centroids.

  1. Compute the distance between each sample and each class centroid
  2. Take alpha closest centroids for each sample
  3. Based on these alpha closest centroids, for each sample recalibrate its logits according to the respective Weibull models
  4. Compute probabilities as a SoftMax over the recalibrated logits
  5. Classify a sample as "unknown" if either the "unknown" probability is the largest or all probabilities fall below a threshold
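The recalibration steps above can be sketched for a single sample as follows. This is an illustration under my assumptions, not the repo's API: names are hypothetical, ranking follows the "alpha closest centroids" rule stated above, and the weighting scheme mirrors the paper's 1 - (alpha - rank)/alpha scaling of the Weibull CDF score.

```python
import numpy as np
from scipy.stats import weibull_min


def openmax_recalibrate(logits, embedding, centroids, models, alpha=3):
    """Sketch of Phase 2 for one sample (hypothetical helper).

    `models`: list of per-class (shape, loc, scale) Weibull parameters.
    `centroids`: per-class training centroids, shape (K, D).
    Returns K+1 probabilities; the last entry is the "unknown" class.
    """
    # 1. Distance between the sample and each class centroid.
    dists = np.linalg.norm(centroids - embedding, axis=1)
    # 2. Take the alpha closest centroids.
    closest = np.argsort(dists)[:alpha]
    weights = np.ones_like(logits, dtype=float)
    # 3. Recalibrate logits according to the respective Weibull models.
    for rank, c in enumerate(closest):
        shape, loc, scale = models[c]
        wscore = weibull_min.cdf(dists[c], shape, loc, scale)
        weights[c] = 1.0 - (alpha - rank) / alpha * wscore
    recalibrated = logits * weights
    # The mass removed from the known classes becomes the (K+1)-th score.
    unknown_score = np.sum(logits * (1.0 - weights))
    scores = np.append(recalibrated, unknown_score)
    # 4. SoftMax over the K+1 recalibrated scores.
    exp = np.exp(scores - scores.max())
    return exp / exp.sum()
```

A large distance to a centroid gives a Weibull CDF score near 1, which shrinks that class's logit and shifts probability mass toward the "unknown" class.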

For more details please see the paper.

How to select a threshold

The threshold value may be selected so that a certain percentage (99%, for example) of the training set is classified as "known".
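This selection rule amounts to taking a low quantile of the training set's "known" probabilities. A minimal sketch (hypothetical helper, not part of the repo):

```python
import numpy as np


def select_threshold(known_probs, keep_fraction=0.99):
    """Pick a threshold so that roughly `keep_fraction` of the training
    samples stay classified as "known".

    `known_probs`: per-sample maximum probability over the K known
    classes, computed on the training set after OpenMax recalibration.
    """
    # The (1 - keep_fraction) quantile leaves keep_fraction of the
    # samples at or above the returned threshold.
    return np.quantile(known_probs, 1.0 - keep_fraction)
```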

Note

I believe the paper does not state this clearly, but you can fit the Weibull models on embeddings from any layer, not just the penultimate one. For example, the layer before the penultimate one is a good choice, since its embeddings are trained to be linearly separable, which makes intuitive sense.

On the usage of LibMR

In my experiments I have found that the results of fitting with LibMR differ from those obtained with this implementation. Based on my own judgement, I decided to use the latter (the CDF produced by LibMR fitting looked less plausible to me).

Usage

See example.ipynb for a usage example. Replace the toy data with the outputs of your model and you are good to go.
