An Overview of SIDEKIT

SIDEKIT aims at providing the whole chain of tools required to perform speaker recognition.
The main tools available include:
  • Acoustic feature extraction

    • Linear-Frequency Cepstral Coefficients (LFCC)

    • Mel-Frequency Cepstral Coefficients (MFCC)

    • RASTA filtering

    • Energy-based Voice Activity Detection (VAD)

    • Normalization (CMS, CMVN, Short-Term Gaussianization)

  • Modeling and classification

    • Gaussian Mixture Models (GMM)

    • i-vectors

    • Probabilistic Linear Discriminant Analysis (PLDA)

    • Joint Factor Analysis (JFA)

    • Support Vector Machine (SVM)

    • Deep Neural Network (bridge to THEANO)

  • Presentation of the results

    • DET plot

    • ROC Convex Hull based DET plot
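
As an illustration of the normalization step listed above, here is a minimal NumPy sketch of cepstral mean subtraction (CMS) and cepstral mean and variance normalization (CMVN). It is not SIDEKIT's own API: the function names `cms` and `cmvn`, the `eps` parameter, and the (n_frames, n_coefficients) array layout are assumptions made for this example.

```python
import numpy as np


def cms(features):
    """Cepstral mean subtraction: remove the per-coefficient mean.

    `features` is assumed to have shape (n_frames, n_coefficients).
    """
    features = np.asarray(features, dtype=float)
    return features - features.mean(axis=0)


def cmvn(features, eps=1e-8):
    """Cepstral mean and variance normalization.

    Each coefficient (column) is shifted to zero mean and scaled to
    approximately unit variance over the frames of the utterance.
    `eps` (hypothetical parameter) guards against division by zero
    for constant coefficients.
    """
    features = np.asarray(features, dtype=float)
    mean = features.mean(axis=0)
    std = features.std(axis=0)
    return (features - mean) / (std + eps)
```

Both functions operate on a whole utterance at once. A sliding-window variant would compute the statistics over a local context of frames instead.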

Implementation

SIDEKIT is designed and written in Python and released under the LGPL license
to allow wider use of the code, which we hope will benefit the community.
The core package relies on a limited number of classes to keep the code
readable and reusable.
Starting from version 1.1.0, SIDEKIT is no longer tested under Python 2.*.
It has been tested under Python 3.7 on both Linux and macOS.

About SIDEKIT

Authors

Anthony Larcher, Kong Aik Lee, and Sylvain Meignier

Version

1.3.1 of 2019/01/22

To check the version and license of the installed SIDEKIT package:

import sidekit
print(sidekit.__version__)
print(sidekit.__license__)