Mixture
-
class
mixture.
Mixture
(mixture_file_name='', name='empty') A class for Gaussian Mixture Model storage. For more details about Gaussian Mixture Models (GMM), refer to [Bimbot04].
- Attr w
array of weight parameters
- Attr mu
ndarray of mean parameters, one distribution per row
- Attr invcov
ndarray of inverse covariance parameters: 2-dimensional for diagonal covariance distributions, 3-dimensional for full covariance
- Attr invchol
3-dimensional ndarray containing the upper Cholesky decomposition of the inverse covariance matrices
- Attr cst
array of constants, one computed for each distribution
- Attr det
array of determinants, one for each distribution
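As a rough illustration of how these attributes relate in the diagonal covariance case, the sketch below builds them with NumPy for a toy mixture. The variable `cov`, the use of the covariance determinant for `det`, and the formula for `cst` (the Gaussian normalization constant) are assumptions made for this example; the class itself may define them differently.

```python
import numpy as np

# Hypothetical toy mixture: 2 distributions of dimension 3, diagonal covariance.
w = np.array([0.4, 0.6])                 # weights, shape (distrib_nb,)
mu = np.array([[0.0, 1.0, 2.0],
               [3.0, 4.0, 5.0]])         # means, one distribution per row
cov = np.array([[1.0, 2.0, 0.5],
                [0.25, 1.0, 4.0]])       # diagonal covariances (assumed input)

invcov = 1.0 / cov                       # 2-dimensional in the diagonal case
det = np.prod(cov, axis=1)               # assumed: determinant of each covariance
dim = mu.shape[1]
# Assumed definition: normalization constant of each Gaussian density.
cst = 1.0 / (np.sqrt(det) * (2.0 * np.pi) ** (dim / 2.0))
```

With full covariances, `invcov` would instead hold one `dim` x `dim` matrix per distribution, which is why it becomes 3-dimensional.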
-
EM_diag2full
(diagonal_mixture, features_server, featureList, iterations=2, num_thread=1) Expectation-Maximization estimation of the Mixture parameters.
- Parameters
features_server – sidekit.FeaturesServer used to load data
featureList – list of feature files to train the GMM
iterations – number of EM iterations to perform (2 by default)
num_thread – number of threads to launch for parallel computing
- Return llk
a list of log-likelihoods obtained after each iteration
-
EM_split
(features_server, feature_list, distrib_nb, iterations=(1, 2, 2, 4, 4, 4, 4, 8, 8, 8, 8, 8, 8), num_thread=1, llk_gain=0.01, save_partial=False, output_file_name='ubm', ceil_cov=10, floor_cov=0.01) Expectation-Maximization estimation of the Mixture parameters.
- Parameters
features_server – sidekit.FeaturesServer used to load data
feature_list – list of feature files to train the GMM
distrib_nb – final number of distributions
iterations – list giving the number of EM iterations for each step of the learning process
num_thread – number of threads to launch for parallel computing
llk_gain – minimum training gain; training stops when the log-likelihood gain between two iterations is less than this value
save_partial – boolean; if True, save the intermediate mixture before each split of the distributions
ceil_cov –
floor_cov –
- Return llk
a list of log-likelihoods obtained after each iteration
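The way `iterations` and `distrib_nb` interact can be sketched as a schedule. The example assumes that each step doubles the number of distributions (a binary split) starting from a single one, so only the first log2(distrib_nb) entries of `iterations` are used. The helper `split_schedule` is hypothetical and is not part of the library.

```python
import math

def split_schedule(distrib_nb, iterations=(1, 2, 2, 4, 4, 4, 4, 8, 8, 8, 8, 8, 8)):
    """Return (number of distributions, EM iterations) after each assumed binary split."""
    n_splits = int(math.log2(distrib_nb))
    schedule = []
    size = 1  # assumed starting point: a single distribution
    for n_iter in iterations[:n_splits]:
        size *= 2
        schedule.append((size, n_iter))
    return schedule

print(split_schedule(8))  # [(2, 1), (4, 2), (8, 2)]
```

Under this assumption the default tuple of 13 entries is long enough for up to 2**13 distributions.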
-
EM_uniform
(cep, distrib_nb, iteration_min=3, iteration_max=10, llk_gain=0.01, do_init=True) Expectation-Maximization estimation of the Mixture parameters.
- Parameters
cep – set of feature frames to consider
distrib_nb – number of distributions
iteration_min – minimum number of iterations to perform
iteration_max – maximum number of iterations to perform
llk_gain – minimum gain in terms of log-likelihood; training stops when the gain is less than this value
do_init – boolean, if True initialize the GMM from the training data
- Return llk
a list of log-likelihoods obtained after each iteration
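The interplay of `iteration_min`, `iteration_max` and `llk_gain` can be sketched as a stopping rule. This is a minimal illustration, assuming the gain is the absolute difference between two consecutive log-likelihoods; `run_em` and `step` are hypothetical names, with `step` standing in for one EM iteration.

```python
def run_em(step, iteration_min=3, iteration_max=10, llk_gain=0.01):
    """Call step() until the log-likelihood gain drops below llk_gain.

    step is a hypothetical callable that performs one EM iteration and
    returns the resulting log-likelihood.
    """
    llk = []
    for it in range(iteration_max):
        llk.append(step())
        # Only test convergence once the minimum number of iterations is done.
        if it + 1 >= iteration_min and len(llk) > 1:
            if llk[-1] - llk[-2] < llk_gain:
                break
    return llk

# Toy log-likelihood sequence standing in for real EM iterations.
values = iter([-10.0, -5.0, -4.999, -4.998, -4.997])
history = run_em(lambda: next(values))
print(history)  # [-10.0, -5.0, -4.999]
```

The returned list plays the role of `llk`: one log-likelihood per iteration actually performed.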
-
compute_log_posterior_probabilities
(cep, mu=None) Compute log posterior probabilities for a set of feature frames.
- Parameters
cep – a set of feature frames in an ndarray, one feature frame per row
mu – a mean super-vector to use in place of the UBM's own. If it is None or an empty vector, the UBM means are used
- Returns
An ndarray of log-posterior probabilities corresponding to the input feature set.
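For reference, the quantity returned here can be sketched with NumPy for a diagonal covariance mixture. This is an illustration under stated assumptions, not the library's implementation: `log_posteriors_diag` is a hypothetical function, and it takes the means as a matrix (one distribution per row) rather than as a super-vector.

```python
import numpy as np

def log_posteriors_diag(cep, w, mu, invcov):
    """Log posterior of each distribution for each frame (diagonal covariance)."""
    dim = cep.shape[1]
    diff = cep[:, None, :] - mu[None, :, :]             # (frames, distribs, dim)
    mahalanobis = np.sum(diff ** 2 * invcov[None, :, :], axis=2)
    log_det_cov = -np.sum(np.log(invcov), axis=1)       # log|cov| per distribution
    log_gauss = -0.5 * (dim * np.log(2.0 * np.pi) + log_det_cov[None, :] + mahalanobis)
    log_joint = np.log(w)[None, :] + log_gauss
    # Normalize with a numerically stable log-sum-exp over distributions.
    top = log_joint.max(axis=1, keepdims=True)
    log_norm = top + np.log(np.sum(np.exp(log_joint - top), axis=1, keepdims=True))
    return log_joint - log_norm

cep = np.array([[0.0, 0.0], [4.0, 4.0]])   # two frames of dimension 2
w = np.array([0.5, 0.5])
mu = np.array([[0.0, 0.0], [4.0, 4.0]])
invcov = np.ones((2, 2))
lp = log_posteriors_diag(cep, w, mu, invcov)
```

Each row of `np.exp(lp)` sums to 1, and each frame is assigned mostly to the distribution whose mean it coincides with.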
-
compute_log_posterior_probabilities_full
(cep, mu=None) Compute log posterior probabilities for a set of feature frames.
- Parameters
cep – a set of feature frames in an ndarray, one feature frame per row
mu – a mean super-vector to use in place of the UBM's own. If it is None or an empty vector, the UBM means are used
- Returns
An ndarray of log-posterior probabilities corresponding to the input feature set.
-
dim
() Return the dimension of the distributions of the Mixture
- Returns
an integer, size of the acoustic vectors
-
distrib_nb
() Return the number of distributions of the Mixture
- Returns
the number of distributions in the Mixture
-
get_distrib_nb
() Return the number of Gaussian distributions in the mixture
- Returns
the number of distributions
-
get_invcov_super_vector
() Return the inverse covariance super-vector
- Returns
an array, super-vector of the inverse co-variance coefficients
-
get_mean_super_vector
() Return the mean super-vector
- Returns
an array, super-vector of the mean coefficients
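A mean super-vector can be pictured as the rows of `mu` laid end to end. The sketch below assumes this row-major layout, which is not confirmed by this page, and shows why the super-vector size equals the number of distributions times their dimension.

```python
import numpy as np

mu = np.array([[0.0, 1.0, 2.0],
               [3.0, 4.0, 5.0]])    # 2 distributions of dimension 3

# Assumed layout: the means of each distribution, one after the other.
mean_super_vector = mu.flatten()
sv_size = mu.shape[0] * mu.shape[1]  # distrib_nb * dim

print(mean_super_vector)  # [0. 1. 2. 3. 4. 5.]
```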
-
merge
(model_list) Merge a list of Mixtures into a new one. Weights are normalized uniformly.
- Parameters
model_list – a list of Mixture objects to merge
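One possible reading of the uniform weight normalization is sketched below: the distributions of all models are stacked, and every weight is divided by the number of models so that the merged weights still sum to 1. This interpretation is an assumption, and `merge_weights_and_means` is a hypothetical helper working on plain `(w, mu)` pairs rather than on Mixture objects.

```python
import numpy as np

def merge_weights_and_means(models):
    """Merge a list of (w, mu) pairs into a single (w, mu) pair."""
    # Assumed normalization: each model contributes equally to the result.
    w = np.concatenate([m[0] for m in models]) / len(models)
    mu = np.vstack([m[1] for m in models])
    return w, mu

model_a = (np.array([0.3, 0.7]), np.zeros((2, 3)))
model_b = (np.array([1.0]), np.ones((1, 3)))
w, mu = merge_weights_and_means([model_a, model_b])
```

Here the merged mixture has three distributions, with weights 0.15, 0.35 and 0.5.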
-
read
(mixture_file_name, prefix='') Read a Mixture in HDF5 format
- Parameters
mixture_file_name – name of the file to read from
prefix –
-
static
read_htk
(filename, begin_hmm=False, state2=False) Read a Mixture in HTK format
- Parameters
filename – name of the file to read from
begin_hmm – boolean
state2 – boolean
-
sv_size
() Return the dimension of the super-vector
- Returns
an integer, size of the mean super-vector