
class mixture.Mixture(mixture_file_name='', name='empty')[source]

A class for Gaussian Mixture Model storage. For more details about Gaussian Mixture Models (GMM) you can refer to [Bimbot04].

Attr w

array of weight parameters

Attr mu

ndarray of mean parameters, each row is one distribution

Attr invcov

ndarray of inverse co-variance parameters: 2-dimensional for diagonal co-variance distributions, 3-dimensional for full co-variance distributions

Attr invchol

3-dimensional ndarray containing the upper Cholesky decomposition of the inverse co-variance matrices

Attr cst

array of constants, one computed for each distribution

Attr det

array of determinants, one for each distribution
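The attribute layout described above can be illustrated with plain NumPy arrays. This is only a sketch: the sizes are hypothetical, the formula used for `cst` is the usual Gaussian normalisation constant (an assumption about what the class stores), and the convention `U.T @ U == invcov` for the upper Cholesky factor is likewise assumed.

```python
import numpy as np

# Hypothetical sizes: 4 distributions over 3-dimensional features.
n_distrib, dim = 4, 3
rng = np.random.default_rng(0)

w = np.full(n_distrib, 1.0 / n_distrib)      # weights, summing to 1
mu = rng.normal(size=(n_distrib, dim))       # one mean vector per row
# Diagonal case: invcov is 2-dimensional (one row of inverse variances each).
invcov = 1.0 / rng.uniform(0.5, 2.0, size=(n_distrib, dim))

# Full case: invcov is 3-dimensional (one matrix per distribution).
invcov_full = np.stack([np.diag(row) for row in invcov])
# numpy returns lower factors L with L @ L.T == A; transpose to get upper ones.
invchol = np.swapaxes(np.linalg.cholesky(invcov_full), 1, 2)

# Determinant of each diagonal covariance: product of its variances.
det = np.prod(1.0 / invcov, axis=1)
# Assumed definition: normalisation constant of each Gaussian density.
cst = 1.0 / (np.sqrt(det) * (2.0 * np.pi) ** (dim / 2.0))
```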

EM_diag2full(diagonal_mixture, features_server, featureList, iterations=2, num_thread=1)[source]

Expectation-Maximization estimation of the Mixture parameters.

  • diagonal_mixture – a Mixture with diagonal co-variance to start the estimation from

  • features_server – sidekit.FeaturesServer used to load data

  • featureList – list of feature files to train the GMM

  • iterations – number of EM iterations to perform

  • num_thread – number of threads to launch for parallel computing

Return llk

a list of log-likelihoods obtained after each iteration

EM_split(features_server, feature_list, distrib_nb, iterations=(1, 2, 2, 4, 4, 4, 4, 8, 8, 8, 8, 8, 8), num_thread=1, llk_gain=0.01, save_partial=False, output_file_name='ubm', ceil_cov=10, floor_cov=0.01)[source]

Expectation-Maximization estimation of the Mixture parameters.

  • features_server – sidekit.FeaturesServer used to load data

  • feature_list – list of feature files to train the GMM

  • distrib_nb – final number of distributions

  • iterations – list giving the number of EM iterations for each step of the learning process

  • num_thread – number of threads to launch for parallel computing

  • llk_gain – limit of the training gain: training stops when the gain between two iterations is less than this value

  • save_partial – boolean, if True, save the intermediate mixture before each split of the distributions

  • output_file_name – name of the file used to save intermediate mixtures

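  • ceil_cov – ceiling value for the co-variance (see variance_control)

  • floor_cov – flooring value for the co-variance (see variance_control)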

Return llk

a list of log-likelihoods obtained after each iteration
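The splitting step that this kind of training relies on can be sketched as follows. This is a generic illustration and not the library's own code: the helper `split_mixture`, the perturbation factor `epsilon` and the choice of shifting every dimension by a fraction of its standard deviation are all assumptions. Each split doubles the number of distributions, so starting from one distribution the mixture grows 1, 2, 4, 8, ... until `distrib_nb` is reached, with a few EM iterations run between splits.

```python
import numpy as np

def split_mixture(w, mu, var, epsilon=0.1):
    """Double the number of distributions (hypothetical helper).

    Each mean is duplicated and shifted by +/- epsilon standard deviations;
    variances are copied and weights are halved so they still sum to 1.
    """
    shift = epsilon * np.sqrt(var)
    new_mu = np.concatenate([mu - shift, mu + shift], axis=0)
    new_var = np.concatenate([var, var], axis=0)
    new_w = np.concatenate([w, w]) / 2.0
    return new_w, new_mu, new_var

# Start from a single distribution and split three times: 1 -> 2 -> 4 -> 8.
w, mu, var = np.array([1.0]), np.zeros((1, 2)), np.ones((1, 2))
for _ in range(3):
    w, mu, var = split_mixture(w, mu, var)
```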

EM_uniform(cep, distrib_nb, iteration_min=3, iteration_max=10, llk_gain=0.01, do_init=True)[source]

Expectation-Maximization estimation of the Mixture parameters.

  • cep – set of feature frames to consider

  • distrib_nb – number of distributions

  • iteration_min – minimum number of iterations to perform

  • iteration_max – maximum number of iterations to perform

  • llk_gain – gain in terms of log-likelihood; training stops when the gain between two iterations is less than this value

  • do_init – boolean, if True initialize the GMM from the training data

Return llk

a list of log-likelihoods obtained after each iteration
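To make the roles of iteration_min, iteration_max and llk_gain concrete, here is a minimal EM loop for a diagonal-covariance GMM written with NumPy. It is a sketch under stated assumptions, not the library's implementation: the function `em_diag`, its random initialisation from data frames, the small variance floor and the use of the average per-frame log-likelihood are all choices made for this illustration.

```python
import numpy as np

def em_diag(cep, distrib_nb, iteration_min=3, iteration_max=10,
            llk_gain=0.01, seed=0):
    """EM for a diagonal GMM; returns (w, mu, var, llk). Hypothetical helper."""
    rng = np.random.default_rng(seed)
    n, _ = cep.shape
    # Initialise from the data: random frames as means, global variance.
    mu = cep[rng.choice(n, distrib_nb, replace=False)].copy()
    var = np.tile(cep.var(axis=0) + 1e-6, (distrib_nb, 1))
    w = np.full(distrib_nb, 1.0 / distrib_nb)
    llk = []
    for it in range(iteration_max):
        # E-step: log joint density of every frame with every distribution.
        diff = cep[:, None, :] - mu[None, :, :]
        log_joint = (np.log(w)
                     - 0.5 * np.sum(np.log(2.0 * np.pi * var), axis=1)
                     - 0.5 * np.sum(diff ** 2 / var[None, :, :], axis=2))
        m = log_joint.max(axis=1, keepdims=True)
        log_norm = m + np.log(np.exp(log_joint - m).sum(axis=1, keepdims=True))
        post = np.exp(log_joint - log_norm)
        llk.append(float(log_norm.mean()))  # average log-likelihood per frame
        # Stop once the minimum is reached and the gain falls below llk_gain.
        if it + 1 >= iteration_min and len(llk) > 1 \
                and llk[-1] - llk[-2] < llk_gain:
            break
        # M-step: re-estimate weights, means and variances.
        occ = post.sum(axis=0) + 1e-10
        w = occ / n
        mu = (post.T @ cep) / occ[:, None]
        var = np.maximum((post.T @ cep ** 2) / occ[:, None] - mu ** 2, 1e-6)
    return w, mu, var, llk

# Usage on synthetic data made of two well-separated clusters.
rng = np.random.default_rng(0)
cep = np.vstack([rng.normal(-5, 1, (100, 2)), rng.normal(5, 1, (100, 2))])
w, mu, var, llk = em_diag(cep, distrib_nb=2)
```

The returned `llk` list holds one value per iteration, which mirrors the documented return value.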

compute_log_posterior_probabilities(cep, mu=None)[source]

Compute log posterior probabilities for a set of feature frames.

  • cep – a set of feature frames in an ndarray, one frame per row

  • mu – a mean super-vector to replace the UBM's one. If it is None (the default) or an empty vector, use the UBM's means


Returns an ndarray of log-posterior probabilities corresponding to the input feature set.
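For the diagonal co-variance case, the computation can be sketched in NumPy as below. This is an illustration of the standard formula, not the method's actual code: the function `log_posteriors_diag` and its explicit `w`, `mu`, `invcov` arguments are assumptions made so that the example is self-contained.

```python
import numpy as np

def log_posteriors_diag(cep, w, mu, invcov):
    """Log posterior of each distribution for each frame (hypothetical helper).

    cep: (n_frames, dim), w: (n_distrib,), mu and invcov: (n_distrib, dim).
    """
    dim = cep.shape[1]
    # Log of the Gaussian normalisation constant of each distribution.
    log_cst = 0.5 * np.sum(np.log(invcov), axis=1) \
        - 0.5 * dim * np.log(2.0 * np.pi)
    # Squared Mahalanobis distances, shape (n_frames, n_distrib).
    diff = cep[:, None, :] - mu[None, :, :]
    maha = np.sum(diff ** 2 * invcov[None, :, :], axis=2)
    log_joint = np.log(w) + log_cst - 0.5 * maha
    # Normalise over distributions with a numerically stable log-sum-exp.
    m = log_joint.max(axis=1, keepdims=True)
    log_norm = m + np.log(np.sum(np.exp(log_joint - m), axis=1, keepdims=True))
    return log_joint - log_norm

rng = np.random.default_rng(1)
cep = rng.normal(size=(5, 3))
lp = log_posteriors_diag(cep, np.array([0.3, 0.7]),
                         rng.normal(size=(2, 3)), np.ones((2, 3)))
```

Each row of `lp` holds the log posteriors of one frame, so exponentiating a row gives probabilities that sum to 1.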

compute_log_posterior_probabilities_full(cep, mu=None)[source]

Compute log posterior probabilities for a set of feature frames.

  • cep – a set of feature frames in an ndarray, one frame per row

  • mu – a mean super-vector to replace the UBM's one. If it is None (the default) or an empty vector, use the UBM's means


Returns an ndarray of log-posterior probabilities corresponding to the input feature set.


Return the dimension of distributions of the Mixture


an integer, size of the acoustic vectors


Return the number of distributions of the Mixture


the number of distributions in the Mixture


Return the number of Gaussian distributions in the mixture


the number of distributions


Return Inverse covariance super-vector


an array, super-vector of the inverse co-variance coefficients


Return mean super-vector


an array, super-vector of the mean coefficients
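A super-vector is simply the per-distribution parameters laid end to end. The sketch below assumes row-major flattening of a `(distrib_nb, dim)` array, which is the natural reading of "each row is one distribution" but is not confirmed by this page.

```python
import numpy as np

# Hypothetical mixture: 2 distributions over 3-dimensional features.
mu = np.arange(6, dtype=float).reshape(2, 3)

# Mean super-vector: the rows of mu concatenated one after another.
mean_sv = mu.flatten()

# Its size is the number of distributions times the feature dimension.
sv_size = mu.shape[0] * mu.shape[1]
```

The same flattening applied to a 2-dimensional invcov array would give the inverse co-variance super-vector for the diagonal case.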




Merge a list of Mixtures into a new one. Weights are normalized uniformly.

  • model_list – a list of Mixture objects to merge
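One possible reading of this merge is sketched below with plain arrays. Both the `(w, mu, invcov)` tuple representation and the interpretation of "normalized uniformly" (each model's weights divided by the number of merged models, so that the result sums to 1) are assumptions made for the example; the actual method operates on Mixture objects.

```python
import numpy as np

def merge_mixtures(models):
    """Merge diagonal mixtures given as (w, mu, invcov) tuples (hypothetical)."""
    n = len(models)
    # Assumed normalisation: every model contributes the same total mass 1/n.
    w = np.concatenate([m[0] / n for m in models])
    mu = np.concatenate([m[1] for m in models], axis=0)
    invcov = np.concatenate([m[2] for m in models], axis=0)
    return w, mu, invcov

a = (np.array([0.5, 0.5]), np.zeros((2, 3)), np.ones((2, 3)))
b = (np.array([1.0]), np.ones((1, 3)), np.ones((1, 3)))
w, mu, invcov = merge_mixtures([a, b])
```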

read(mixture_file_name, prefix='')[source]

Read a Mixture in hdf5 format

  • mixture_file_name – name of the file to read from

  • prefix

static read_alize(file_name)[source]



static read_htk(filename, begin_hmm=False, state2=False)[source]

Read a Mixture in HTK format

  • filename – name of the file to read from

  • begin_hmm – boolean

  • state2 – boolean


Return the dimension of the super-vector


an integer, size of the mean super-vector


Verify the format of the Mixture


a boolean giving the status of the Mixture

static variance_control(cov, flooring, ceiling, cov_ctl)[source]

Variance control for the Mixture (flooring and ceiling)

  • cov – covariance to control

  • flooring – float, flooring value

  • ceiling – float, ceiling value

  • cov_ctl – co-variance to consider for flooring and ceiling
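Flooring and ceiling can be sketched as an element-wise clip. The example assumes that flooring and ceiling are relative factors applied to cov_ctl (so the bounds are `flooring * cov_ctl` and `ceiling * cov_ctl`); this interpretation is not stated on this page, and the function below is a stand-alone illustration rather than the static method itself.

```python
import numpy as np

def variance_control_sketch(cov, flooring, ceiling, cov_ctl):
    """Clip cov between flooring * cov_ctl and ceiling * cov_ctl (assumed)."""
    floor = flooring * cov_ctl
    ceil = ceiling * cov_ctl
    return np.clip(cov, floor, ceil)

cov = np.array([[0.001, 1.0, 50.0]])
cov_ctl = np.ones((1, 3))  # reference co-variance used to scale the bounds
controlled = variance_control_sketch(cov, flooring=0.01, ceiling=10, cov_ctl=cov_ctl)
```

With the defaults used by EM_split (floor_cov=0.01, ceil_cov=10), a variance far below or above the reference is pulled back into the allowed range while in-range values are left unchanged.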