Mixture¶

class mixture.Mixture(mixture_file_name='', name='empty')[source]¶

A class for Gaussian Mixture Model storage. For more details about Gaussian Mixture Models (GMM) you can refer to [Bimbot04].

Attr w: array of weight parameters
Attr mu: ndarray of mean parameters, one distribution per row
Attr invcov: ndarray of inverse covariance parameters; 2-dimensional for diagonal covariance distributions, 3-dimensional for full covariance
Attr invchol: 3-dimensional ndarray containing the upper Cholesky decomposition of the inverse covariance matrices
Attr cst: array of constants computed for each distribution
Attr det: array of determinants, one per distribution
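
As a rough illustration of how these attributes relate for a diagonal-covariance mixture, the NumPy sketch below builds toy arrays with the shapes described above and derives det and cst from invcov. The two formulas (determinant as the inverse of the product of the inverse variances, cst as the Gaussian normalization constant) are assumptions based on the standard Gaussian density, not taken from the class source.

```python
import numpy as np

# Toy diagonal-covariance mixture: 3 distributions, 2-dimensional features.
w = np.array([0.5, 0.3, 0.2])                 # weights, sum to 1
mu = np.array([[0.0, 0.0],
               [1.0, -1.0],
               [3.0, 2.0]])                   # one distribution per row
invcov = np.array([[1.0, 4.0],
                   [2.0, 2.0],
                   [0.5, 1.0]])               # inverse variances (diagonal case)

dim = mu.shape[1]

# Assumed relation: the determinant of a diagonal covariance is the product
# of the variances, i.e. the inverse of the product of the inverse variances.
det = 1.0 / np.prod(invcov, axis=1)

# Assumed relation: per-distribution Gaussian normalization constant.
cst = 1.0 / (np.sqrt(det) * (2.0 * np.pi) ** (dim / 2.0))

print(det)
print(cst)
```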

EM_diag2full(diagonal_mixture, features_server, featureList, iterations=2, num_thread=1)[source]¶

Expectation-Maximization estimation of the Mixture parameters.

Parameters

diagonal_mixture – Mixture with diagonal covariance used as the starting point
features_server – sidekit.FeaturesServer used to load data
featureList – list of feature files to train the GMM
iterations – number of EM iterations to perform
num_thread – number of threads to launch for parallel computing

Return llk

a list of log-likelihoods obtained after each iteration

EM_split(features_server, feature_list, distrib_nb, iterations=(1, 2, 2, 4, 4, 4, 4, 8, 8, 8, 8, 8, 8), num_thread=1, llk_gain=0.01, save_partial=False, output_file_name='ubm', ceil_cov=10, floor_cov=0.01)[source]¶

Expectation-Maximization estimation of the Mixture parameters.

Parameters

features_server – sidekit.FeaturesServer used to load data
feature_list – list of feature files to train the GMM
distrib_nb – final number of distributions
iterations – sequence giving the number of EM iterations for each step of the learning process
num_thread – number of threads to launch for parallel computing
llk_gain – limit of the training gain. Stop the training when the gain between two iterations is less than this value
save_partial – boolean; if True, save the intermediate mixture before each split of the distributions
output_file_name – name of the file used to save intermediate mixtures when save_partial is True
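ceil_cov – ceiling value applied to the covariance (see variance_control)
floor_cov – flooring value applied to the covariance (see variance_control)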

Return llk

a list of log-likelihoods obtained after each iteration
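
The idea of growing a mixture by successive splits, with a given number of EM iterations at each size, can be sketched as follows. This is a minimal illustration and not the method's actual code: `split_gaussians`, the `eps` perturbation, and the way `iterations` is indexed (one entry per mixture size, starting at size 1) are all assumptions made for the example.

```python
import numpy as np

def split_gaussians(w, mu, var, eps=0.1):
    """Double the number of diagonal Gaussians (hypothetical helper).

    Each component is replaced by two copies whose means are shifted by
    +/- eps standard deviations; weights are halved so they still sum to 1.
    """
    shift = eps * np.sqrt(var)
    new_w = np.concatenate([w, w]) / 2.0
    new_mu = np.concatenate([mu - shift, mu + shift])
    new_var = np.concatenate([var, var])
    return new_w, new_mu, new_var

# Start from a single Gaussian and split until 8 distributions are reached.
w = np.array([1.0])
mu = np.zeros((1, 2))
var = np.ones((1, 2))

iterations = (1, 2, 2, 4)   # assumed: EM iterations for sizes 1, 2, 4, 8
distrib_nb = 8
schedule = []
step = 0
while True:
    schedule.append((w.shape[0], iterations[step]))  # (size, EM iterations)
    # ... the EM iterations for the current size would run here ...
    if w.shape[0] >= distrib_nb:
        break
    w, mu, var = split_gaussians(w, mu, var)
    step += 1

print(schedule)
```

With a power-of-two target, the number of splits is log2(distrib_nb), which is why the default iterations value is a sequence rather than a single number.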

EM_uniform(cep, distrib_nb, iteration_min=3, iteration_max=10, llk_gain=0.01, do_init=True)[source]¶

Expectation-Maximization estimation of the Mixture parameters.

Parameters

cep – set of feature frames to consider
distrib_nb – number of distributions
iteration_min – minimum number of iterations to perform
iteration_max – maximum number of iterations to perform
llk_gain – gain in terms of likelihood; stop the training when the gain is less than this value
do_init – boolean, if True initialize the GMM from the training data

Return llk

a list of log-likelihoods obtained after each iteration
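
The interplay of iteration_min, iteration_max and llk_gain can be pictured with the small sketch below. It is only an assumption about how such a stopping rule typically works (here the gain is taken as the absolute difference between two consecutive log-likelihoods); `run_until_converged` is a hypothetical helper and the log-likelihood values are made up for the example.

```python
def run_until_converged(llk_per_iteration, iteration_min=3,
                        iteration_max=10, llk_gain=0.01):
    """Return the log-likelihoods kept before the stopping rule fires.

    llk_per_iteration stands in for the values an EM loop would produce.
    """
    llk = []
    for it, value in enumerate(llk_per_iteration[:iteration_max]):
        llk.append(value)
        done_min = (it + 1) >= iteration_min
        # Stop once the minimum is reached and the gain is small enough.
        if done_min and len(llk) > 1 and abs(llk[-1] - llk[-2]) < llk_gain:
            break
    return llk

simulated = [-50.0, -42.0, -41.995, -41.994, -41.9939, -41.9938]
print(run_until_converged(simulated))
print(run_until_converged(simulated, iteration_min=5))
```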

compute_log_posterior_probabilities(cep, mu=None)[source]¶

Compute log posterior probabilities for a set of feature frames.

Parameters

cep – a set of feature frames in an ndarray, one feature vector per row
mu – a mean super-vector to use in place of the UBM’s own. If it is left as None or is an empty vector, the UBM’s means are used

Returns

An ndarray of log-posterior probabilities corresponding to the input feature set.
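
For readers who want to see what such a computation looks like, here is a minimal NumPy sketch for a diagonal-covariance mixture. It is not the method's source: `log_posteriors_diag` and its argument layout are hypothetical, and the normalization uses a standard log-sum-exp over distributions.

```python
import numpy as np

def log_posteriors_diag(cep, w, mu, invcov):
    """Log posterior of each diagonal Gaussian for each frame (illustrative)."""
    dim = cep.shape[1]
    # Log of the Gaussian normalization constant for each distribution.
    log_cst = 0.5 * np.sum(np.log(invcov), axis=1) - 0.5 * dim * np.log(2.0 * np.pi)
    # Squared Mahalanobis distances, shape (n_frames, n_distributions).
    diff = cep[:, None, :] - mu[None, :, :]
    maha = np.sum(diff ** 2 * invcov[None, :, :], axis=2)
    log_joint = np.log(w)[None, :] + log_cst[None, :] - 0.5 * maha
    # Normalize with a numerically stable log-sum-exp over distributions.
    m = log_joint.max(axis=1, keepdims=True)
    log_evidence = m + np.log(np.sum(np.exp(log_joint - m), axis=1, keepdims=True))
    return log_joint - log_evidence

w = np.array([0.5, 0.5])
mu = np.array([[0.0, 0.0], [5.0, 5.0]])
invcov = np.ones((2, 2))
cep = np.array([[0.0, 0.0], [5.0, 5.0], [2.5, 2.5]])   # one frame per row

log_post = log_posteriors_diag(cep, w, mu, invcov)
print(np.exp(log_post))   # each row sums to 1
```

The result has one row per frame and one column per distribution; the last frame sits halfway between the two means, so both distributions are equally likely for it.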

compute_log_posterior_probabilities_full(cep, mu=None)[source]¶

Compute log posterior probabilities for a set of feature frames.

Parameters

cep – a set of feature frames in an ndarray, one feature vector per row
mu – a mean super-vector to use in place of the UBM’s own. If it is left as None or is an empty vector, the UBM’s means are used

Returns

An ndarray of log-posterior probabilities corresponding to the input feature set.
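
With full covariance matrices, the per-distribution log density can be evaluated from an upper Cholesky factor of the inverse covariance, which is the kind of quantity the invchol attribute stores. The sketch below shows that computation for a single Gaussian; whether the method uses invchol in exactly this way is an assumption, and `log_density_full` is a hypothetical helper.

```python
import numpy as np

def log_density_full(cep, mu_k, invcov_k):
    """Log density of frames under one full-covariance Gaussian (illustrative)."""
    dim = cep.shape[1]
    # NumPy returns the lower factor L with invcov = L @ L.T; its transpose
    # is the upper factor U with invcov = U.T @ U.
    upper = np.linalg.cholesky(invcov_k).T
    # Rows of proj are U @ (x - mu), so their squared norms are the
    # squared Mahalanobis distances.
    proj = (cep - mu_k) @ upper.T
    maha = np.sum(proj ** 2, axis=1)
    # log|invcov| = 2 * sum(log(diag(U)))
    log_det_inv = 2.0 * np.sum(np.log(np.diag(upper)))
    return 0.5 * log_det_inv - 0.5 * dim * np.log(2.0 * np.pi) - 0.5 * maha

cov = np.array([[2.0, 0.5], [0.5, 1.0]])
invcov_k = np.linalg.inv(cov)
mu_k = np.array([1.0, -1.0])
cep = np.array([[1.0, -1.0], [0.0, 0.0], [2.0, 1.0]])
print(log_density_full(cep, mu_k, invcov_k))
```

Log posteriors then follow as in the diagonal case: add the log weights and normalize over distributions.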

dim()[source]¶

Return the dimension of distributions of the Mixture

Returns: an integer, size of the acoustic vectors

distrib_nb()[source]¶

Return the number of distributions of the Mixture

Returns: the number of distributions in the Mixture

get_distrib_nb()[source]¶

Return the number of Gaussian distributions in the mixture

Returns: the number of distributions

get_invcov_super_vector()[source]¶

Return the inverse covariance super-vector

Returns: an array, super-vector of the inverse co-variance coefficients

get_mean_super_vector()[source]¶

Return mean super-vector

Returns: an array, super-vector of the mean coefficients
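
A super-vector is commonly obtained by concatenating the per-distribution vectors into one long array, so its size is the number of distributions times the feature dimension. The short sketch below illustrates this with a toy mean matrix; the row-by-row layout is an assumption about the ordering, not something stated on this page.

```python
import numpy as np

mu = np.array([[0.0, 1.0, 2.0],
               [3.0, 4.0, 5.0]])      # 2 distributions, dimension 3

# Assumed layout: rows of mu concatenated one after another.
mean_sv = mu.flatten()
print(mean_sv.shape)

# The original matrix can be recovered by reshaping.
restored = mean_sv.reshape(mu.shape)
```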

init_from_diag(diag_mixture)[source]¶

Parameters: diag_mixture –

merge(model_list)[source]¶

Merge a list of Mixtures into a new one. Weights are normalized uniformly

Parameters: model_list – a list of Mixture objects to merge
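
One plausible reading of "normalized uniformly" is that every input mixture contributes equally: the parameters are stacked and each weight is divided by the number of merged mixtures. The sketch below illustrates that reading on toy (w, mu, invcov) tuples; `merge_gmms` is a hypothetical helper and the interpretation is an assumption.

```python
import numpy as np

def merge_gmms(models):
    """Merge toy mixtures given as (w, mu, invcov) tuples (hypothetical helper)."""
    # Each model's weights sum to 1, so dividing by the number of models
    # keeps the merged weights summing to 1.
    w = np.concatenate([m[0] for m in models]) / len(models)
    mu = np.vstack([m[1] for m in models])
    invcov = np.vstack([m[2] for m in models])
    return w, mu, invcov

model_a = (np.array([0.5, 0.5]), np.array([[0.0, 0.0], [1.0, 1.0]]), np.ones((2, 2)))
model_b = (np.array([1.0]), np.array([[4.0, 4.0]]), np.ones((1, 2)))

w, mu, invcov = merge_gmms([model_a, model_b])
print(w)
```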

read(mixture_file_name, prefix='')[source]¶

Read a Mixture in HDF5 format

Parameters

mixture_file_name – name of the file to read from
prefix –

static read_alize(file_name)[source]¶

Parameters: file_name –
Returns

static read_htk(filename, begin_hmm=False, state2=False)[source]¶

Read a Mixture in HTK format

Parameters

filename – name of the file to read from
begin_hmm – boolean
state2 – boolean

sv_size()[source]¶

Return the dimension of the super-vector

Returns: an integer, size of the mean super-vector

validate()[source]¶

Verify the format of the Mixture

Returns: a boolean giving the status of the Mixture

static variance_control(cov, flooring, ceiling, cov_ctl)[source]¶

Variance control for the Mixture (flooring and ceiling)

Parameters

cov – covariance to control
flooring – float, flooring value
ceiling – float, ceiling value
cov_ctl – co-variance to consider for flooring and ceiling
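
Flooring and ceiling can be sketched as a simple clamp. The example assumes that the bounds are relative, that is, flooring and ceiling are multiplied by cov_ctl to obtain the lower and upper limits; this scaling and the helper name `variance_control_sketch` are assumptions made for illustration.

```python
import numpy as np

def variance_control_sketch(cov, flooring, ceiling, cov_ctl):
    """Clamp cov between flooring * cov_ctl and ceiling * cov_ctl (assumed rule)."""
    return np.clip(cov, flooring * cov_ctl, ceiling * cov_ctl)

cov = np.array([[0.001, 1.0, 50.0],
                [0.5, 0.02, 3.0]])
cov_ctl = np.array([1.0, 2.0, 4.0])   # reference variances, one per dimension

controlled = variance_control_sketch(cov, flooring=0.01, ceiling=10.0,
                                     cov_ctl=cov_ctl)
print(controlled)
```

Values already inside the bounds are left unchanged; only the too-small and too-large entries are moved to the nearest limit.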
