Mixture

class mixture.Mixture
(mixture_file_name='', name='empty') A class for Gaussian Mixture Model storage. For more details about Gaussian Mixture Models (GMM), refer to [Bimbot04].
 Attr w
array of weight parameters
 Attr mu
ndarray of mean parameters, one row per distribution
 Attr invcov
ndarray of inverse covariance parameters: 2-dimensional for diagonal covariance distributions, 3-dimensional for full covariance
 Attr invchol
3-dimensional ndarray containing the upper Cholesky decomposition of the inverse covariance matrices
 Attr cst
array of constants, one computed for each distribution
 Attr det
array of determinants, one for each distribution
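The relation between these attributes can be illustrated for the diagonal case. The sketch below is not taken from the library: it assumes that det is the determinant of the covariance matrix and that cst is the usual Gaussian normalization constant. The function name diagonal_gaussian_terms is hypothetical, and plain Python lists stand in for the ndarrays.

```python
import math

def diagonal_gaussian_terms(variances):
    """Derive per-distribution quantities from diagonal variances.

    `variances` holds one list of variances per distribution.
    Returns (invcov, det, cst) under the assumptions stated above.
    """
    invcov, det, cst = [], [], []
    for var in variances:
        dim = len(var)
        # Inverse of a diagonal covariance: invert each diagonal entry.
        invcov.append([1.0 / v for v in var])
        # Determinant of a diagonal matrix: product of its diagonal.
        d = math.prod(var)
        det.append(d)
        # Assumed constant: 1 / (sqrt(det) * (2 * pi) ** (dim / 2)).
        cst.append(1.0 / (math.sqrt(d) * (2.0 * math.pi) ** (dim / 2.0)))
    return invcov, det, cst

invcov, det, cst = diagonal_gaussian_terms([[1.0, 4.0], [0.25, 0.25]])
```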

EM_diag2full
(diagonal_mixture, features_server, featureList, iterations=2, num_thread=1) Expectation-Maximization estimation of the Mixture parameters.
 Parameters
features_server – sidekit.FeaturesServer used to load data
featureList – list of feature files to train the GMM
iterations – number of EM iterations to perform
num_thread – number of threads to launch for parallel computing
 Return llk
a list of log-likelihoods obtained after each iteration
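One way to picture the diagonal-to-full conversion is to expand each row of the 2-dimensional invcov into a square matrix, which gives the 3-dimensional layout used for full covariance. This is only a sketch of that idea, assuming the full model starts from the diagonal one; diag_to_full is a hypothetical helper, not a library function, and nested lists stand in for the ndarrays.

```python
def diag_to_full(invcov_diag):
    """Expand each row of diagonal values into a square matrix."""
    full = []
    for row in invcov_diag:
        dim = len(row)
        # Diagonal entries come from the row, everything else is zero.
        full.append([[row[i] if i == j else 0.0 for j in range(dim)]
                     for i in range(dim)])
    return full

full = diag_to_full([[1.0, 0.25], [4.0, 4.0]])
```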

EM_split
(features_server, feature_list, distrib_nb, iterations=(1, 2, 2, 4, 4, 4, 4, 8, 8, 8, 8, 8, 8), num_thread=1, llk_gain=0.01, save_partial=False, output_file_name='ubm', ceil_cov=10, floor_cov=0.01) Expectation-Maximization estimation of the Mixture parameters.
 Parameters
features_server – sidekit.FeaturesServer used to load data
feature_list – list of feature files to train the GMM
distrib_nb – final number of distributions
iterations – number of EM iterations for each step of the learning process, one entry per step
num_thread – number of threads to launch for parallel computing
llk_gain – limit of the training gain. Stop the training when the gain between two iterations is less than this value
save_partial – boolean, if True, save the intermediate mixture to file before each split of the distributions
ceil_cov –
floor_cov –
 Return llk
a list of log-likelihoods obtained after each iteration
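The interplay between distrib_nb and iterations can be sketched as a training schedule. The sketch assumes that training starts from a single distribution, that each split doubles the number of distributions, and that the k-th entry of iterations is used after the k-th split; none of this is confirmed by the text above. split_schedule is a hypothetical helper.

```python
def split_schedule(distrib_nb,
                   iterations=(1, 2, 2, 4, 4, 4, 4, 8, 8, 8, 8, 8, 8)):
    """Return (number of distributions, EM iterations) after each split."""
    schedule = []
    size = 1  # assumed starting point: a single distribution
    for n_iter in iterations:
        if size * 2 > distrib_nb:
            break  # another split would exceed the requested size
        size *= 2  # assumed: a split doubles the number of distributions
        schedule.append((size, n_iter))
    return schedule

print(split_schedule(8))  # [(2, 1), (4, 2), (8, 2)]
```

Under these assumptions the 13 default entries allow up to 13 splits.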

EM_uniform
(cep, distrib_nb, iteration_min=3, iteration_max=10, llk_gain=0.01, do_init=True) Expectation-Maximization estimation of the Mixture parameters.
 Parameters
cep – set of feature frames to consider
distrib_nb – number of distributions
iteration_min – minimum number of iterations to perform
iteration_max – maximum number of iterations to perform
llk_gain – gain in terms of likelihood. Stop the training when the gain between two iterations is less than this value
do_init – boolean, if True initialize the GMM from the training data
 Return llk
a list of log-likelihoods obtained after each iteration
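The stopping rule described by iteration_min, iteration_max and llk_gain can be sketched as a loop. This is one plausible reading, not the library's code: the gain test is assumed to apply only once iteration_min iterations have run. run_em and em_step are hypothetical names, and em_step stands for a callable that performs one EM iteration and returns the log-likelihood.

```python
def run_em(em_step, iteration_min=3, iteration_max=10, llk_gain=0.01):
    """Call `em_step()` repeatedly and collect the log-likelihoods."""
    llk = []
    for it in range(iteration_max):
        llk.append(em_step())
        # The gain is only tested after the minimum number of iterations.
        if it + 1 >= iteration_min and len(llk) > 1:
            if llk[-1] - llk[-2] < llk_gain:
                break
    return llk

# Fake training run whose gain shrinks at every iteration.
values = iter([-10.0, -6.0, -5.0, -4.8, -4.795, -4.794, -4.7939])
history = run_em(lambda: next(values))
print(len(history))  # 5
```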

compute_log_posterior_probabilities
(cep, mu=None) Compute log posterior probabilities for a set of feature frames.
 Parameters
cep – a set of feature frames in an ndarray, one feature frame per row
mu – a mean supervector used in place of the UBM's own. If it is an empty vector, the UBM's means are used
 Returns
An ndarray of log-posterior probabilities corresponding to the input feature set.
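For a diagonal-covariance mixture, the computation can be sketched as follows: score each frame against each weighted Gaussian in the log domain, then normalize each row with log-sum-exp. This is a dependency-free illustration under the assumption of diagonal covariances, not the library's implementation. log_posteriors is a hypothetical function, and nested lists stand in for the ndarrays.

```python
import math

def log_posteriors(cep, w, mu, invcov):
    """Log-posterior of each distribution for each frame (diagonal GMM).

    cep: list of frames; w: weights; mu, invcov: one row per distribution.
    """
    result = []
    for frame in cep:
        scores = []
        for weight, mean, prec in zip(w, mu, invcov):
            dim = len(mean)
            # Log of the Gaussian normalization term for diag(1 / prec).
            log_norm = 0.5 * (sum(math.log(p) for p in prec)
                              - dim * math.log(2.0 * math.pi))
            # Squared distance to the mean, scaled by the inverse variances.
            dist = sum(p * (x - m) ** 2
                       for x, m, p in zip(frame, mean, prec))
            scores.append(math.log(weight) + log_norm - 0.5 * dist)
        # Normalize with log-sum-exp for numerical stability.
        top = max(scores)
        log_total = top + math.log(sum(math.exp(s - top) for s in scores))
        result.append([s - log_total for s in scores])
    return result

post = log_posteriors(
    cep=[[0.0, 0.0], [3.0, 3.0]],
    w=[0.5, 0.5],
    mu=[[0.0, 0.0], [3.0, 3.0]],
    invcov=[[1.0, 1.0], [1.0, 1.0]],
)
```

Each row of the result holds one value per distribution, and the exponentials of a row sum to 1.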

compute_log_posterior_probabilities_full
(cep, mu=None) Compute log posterior probabilities for a set of feature frames.
 Parameters
cep – a set of feature frames in an ndarray, one feature frame per row
mu – a mean supervector used in place of the UBM's own. If it is an empty vector, the UBM's means are used
 Returns
An ndarray of log-posterior probabilities corresponding to the input feature set.

dim
() Return the dimension of the distributions of the Mixture
 Returns
an integer, size of the acoustic vectors

distrib_nb
() Return the number of distributions of the Mixture
 Returns
the number of distributions in the Mixture

get_distrib_nb
() Return the number of Gaussian distributions in the mixture
 Returns
the number of distributions

get_invcov_super_vector
() Return the inverse covariance supervector
 Returns
an array, supervector of the inverse covariance coefficients

get_mean_super_vector
() Return the mean supervector
 Returns
an array, supervector of the mean coefficients
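A supervector is commonly built by concatenating the per-distribution vectors end to end. The sketch below assumes that layout (distribution after distribution), which the text above does not state explicitly. mean_super_vector is a hypothetical stand-in operating on nested lists.

```python
def mean_super_vector(mu):
    """Flatten a (distrib_nb x dim) list of means into one flat list."""
    return [value for mean in mu for value in mean]

mu = [[0.0, 1.0, 2.0], [10.0, 11.0, 12.0]]
sv = mean_super_vector(mu)
print(sv)       # [0.0, 1.0, 2.0, 10.0, 11.0, 12.0]
print(len(sv))  # 6, i.e. distrib_nb * dim = 2 * 3
```

Under this assumption the size of the supervector is the number of distributions times the dimension of the acoustic vectors.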

merge
(model_list) Merge a list of Mixtures into a new one. Weights are normalized uniformly.
 Parameters
model_list – a list of Mixture objects to merge
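One reading of the uniform normalization is that the distributions of all input mixtures are concatenated and each mixture contributes an equal share of the total weight. The sketch below illustrates that reading only; it is not the library's code. merge_weights_and_means is a hypothetical helper working on (weights, means) pairs instead of Mixture objects.

```python
def merge_weights_and_means(models):
    """Merge (weights, means) pairs into a single mixture."""
    n_models = len(models)
    w, mu = [], []
    for weights, means in models:
        # Assumed: each input mixture gets 1 / n_models of the total mass.
        w.extend(weight / n_models for weight in weights)
        mu.extend(means)
    return w, mu

model_a = ([0.5, 0.5], [[0.0], [1.0]])
model_b = ([1.0], [[5.0]])
w, mu = merge_weights_and_means([model_a, model_b])
print(w)   # [0.25, 0.25, 0.5]
print(mu)  # [[0.0], [1.0], [5.0]]
```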

read
(mixture_file_name, prefix='') Read a Mixture in HDF5 format
 Parameters
mixture_file_name – name of the file to read from
prefix –

static
read_htk
(filename, begin_hmm=False, state2=False) Read a Mixture in HTK format
 Parameters
filename – name of the file to read from
begin_hmm – boolean
state2 – boolean

sv_size
() Return the dimension of the supervector
 Returns
an integer, size of the mean supervector