How to manage the data: IdMap, Ndx, Key, Scores and StatServer

IdMap

Description

An IdMap stores two lists of strings and maps between them. Most of the time, an IdMap is used to associate names of segments (also referred to as sessions or shows), stored in leftids, with the ID of their class (which could be a speaker ID, a language ID or any other acoustic class), stored in rightids. Duplicated entries are allowed in each list.

Additionally, to allow more flexibility, IdMap includes two other vectors, start and stop, which are vectors of floats and can be used to store the boundaries of audio segments.

An IdMap object is often used to store together speaker IDs, segment IDs and the start and stop times of the segments, and to initialize a StatServer.

Note

When not used, start and stop are set to None meaning that the entire audio segment is selected.

Attribute    Type
leftids      ndarray of strings
rightids     ndarray of strings
start        ndarray of floats
stop         ndarray of floats

Note

All four vectors (leftids, rightids, start, stop) must have the same length.
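To make the mapping and the length constraint concrete, here is a minimal sketch that uses plain NumPy arrays in place of an IdMap object. The IDs and boundary values are invented for illustration, and none of the variable names below belong to the SIDEKIT API.

```python
import numpy

# Plain NumPy arrays standing in for the four IdMap vectors (illustrative values).
leftids = numpy.array(["spk_A", "spk_B", "spk_B"])
rightids = numpy.array(["show_1", "show_1", "show_2"])
start = numpy.array([0.0, 12.5, 0.0])
stop = numpy.array([12.5, 30.0, 8.0])

# The four vectors must have the same length.
lengths = {leftids.shape[0], rightids.shape[0], start.shape[0], stop.shape[0]}
same_length = len(lengths) == 1

# Mapping from one list to the other: right IDs associated with a given left ID.
segments_of_b = rightids[leftids == "spk_B"]

# Length of each selected portion, in whatever unit start and stop are stored.
durations = stop - start
```

Note that "show_1" appears twice in rightids, which is allowed because duplicated entries are permitted in each list.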

Example

Here we create an IdMap where the leftids are the model names and the rightids are the segment names. As we consider that all segments are used entirely, the start and stop values are set to None.

import numpy
import sidekit

idmap = sidekit.IdMap()
idmap.leftids = numpy.array(["model_1", "model_2", "model_2"])
idmap.rightids = numpy.array(["segment_1", "segment_2", "segment_3"])
idmap.start = numpy.empty((3), dtype="|O")
idmap.stop = numpy.empty((3), dtype="|O")

idmap.validate()

In this example, the first model is associated with the first segment, while the second model is linked to two segments.

The last line will return True if the format of idmap is correct and False otherwise.

Ndx

Description

Ndx objects store trial index information, i.e., the combinations of model and segment IDs that the system should evaluate, producing one score per trial.

The trialmask is an m-by-n boolean matrix, where m is the number of unique models and n is the number of unique segments. If trialmask(i,j) is true, the score between model i and segment j will be computed.
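As an illustration of how a trialmask selects trials, the sketch below lists the (model, segment) pairs whose entry is true. It relies only on NumPy, and the mask values are made up for the example.

```python
import numpy

modelset = numpy.array(["model_1", "model_2"])
segset = numpy.array(["segment_1", "segment_2", "segment_3"])

# Rows index models, columns index segments (illustrative mask).
trialmask = numpy.array([[True, False, True],
                         [False, True, True]])

# numpy.argwhere returns the (row, column) indices of the True entries,
# in row-major order.
trials = [(str(modelset[i]), str(segset[j]))
          for i, j in numpy.argwhere(trialmask)]
```

Here four of the six possible model/segment combinations would be scored.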

Note

It is possible to use different Ndx objects with a single Scores object in order to evaluate different subsets of the trials.

Attribute    Type
modelset     ndarray of strings
segset       ndarray of strings
trialmask    matrix of boolean

Example

The code below creates an Ndx object with two models and three segments. All trials will be computed, as every entry of the trialmask is set to True.

import numpy
import sidekit

ndx = sidekit.Ndx()
ndx.modelset = numpy.array(["model_1", "model_2"])
ndx.segset = numpy.array(["segment_1", "segment_2", "segment_3"])
ndx.trialmask = numpy.ones((2,3), dtype='bool')

ndx.validate()

Key

Description

Key objects store information about which trials are target trials and which are non-target (or impostor) trials. tar(i,j) is true if the test between model i and segment j is a target trial. non(i,j) is true if the test between model i and segment j is a non-target trial.

Attribute    Type
modelset     ndarray of strings
segset       ndarray of strings
tar          matrix of boolean
non          matrix of boolean

Example

We create a Key object that corresponds to the previously created Ndx.

import numpy
import sidekit

key = sidekit.Key()
key.modelset = ndx.modelset
key.segset = ndx.segset
key.tar = numpy.zeros((2,3), dtype='bool')
key.tar[0, 0] = True
key.tar[1:, 1:] = True
key.non = numpy.zeros((2,3), dtype='bool')
key.non[0, 1:] = True
key.non[1, 0] = True

key.validate()
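The tar and non matrices of this example can also be inspected with plain NumPy. The sketch below rebuilds them without SIDEKIT and checks two properties one would expect from such a Key: no trial is marked as both target and non-target, and together the two masks cover every trial of the corresponding Ndx. The variable names are illustrative only.

```python
import numpy

# Same masks as in the Key example, rebuilt as plain arrays.
tar = numpy.zeros((2, 3), dtype="bool")
tar[0, 0] = True
tar[1:, 1:] = True
non = numpy.zeros((2, 3), dtype="bool")
non[0, 1:] = True
non[1, 0] = True

# A trial marked in both masks would be inconsistent.
overlap = bool(numpy.logical_and(tar, non).any())

# The union of both masks gives the set of labelled trials.
labelled = numpy.logical_or(tar, non)

n_target = int(tar.sum())
n_nontarget = int(non.sum())
```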

Scores

Description

Scores objects hold information about trials, including the lists of unique models and segments as well as the scores output by the system. This class duplicates information contained in an Ndx so that it does not depend on any Ndx object.

This class has four fields:

Attribute    Type
modelset     ndarray of strings
segset       ndarray of strings
scoremask    matrix of boolean
scoremat     matrix of float (scores)
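A rough sketch of how these fields could be combined with the tar and non masks of a Key is given below, using plain NumPy only. The score values are invented, and the sketch assumes that scoremask marks the entries of scoremat that hold a valid score, which the description above does not state explicitly.

```python
import numpy

# Illustrative scores for 2 models and 3 segments.
scoremat = numpy.array([[2.1, -1.3, -0.7],
                        [-0.9, 1.8, 0.0]])

# Assumption: True where scoremat holds a valid score.
scoremask = numpy.array([[True, True, True],
                         [True, True, False]])

# Target / non-target masks, as they would be stored in a Key.
tar = numpy.array([[True, False, False],
                   [False, True, True]])
non = numpy.array([[False, True, True],
                   [True, False, False]])

# Boolean indexing keeps only the scored trials of each kind (row-major order).
target_scores = scoremat[tar & scoremask]
nontarget_scores = scoremat[non & scoremask]
```

Swapping in masks from a different Key or Ndx selects a different subset of the same score matrix.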

StatServer

Description

StatServer objects are used to store and process statistics.

This class has six attributes:

  • a list of models (or class IDs)

  • a list of segment IDs (also called shows or sessions)

  • a vector of start times (one for each segment)

  • a vector of stop times (one for each segment)

  • zero-order statistics

  • first-order statistics.

Note

In SIDEKIT, as an abuse of language, i-vectors and super-vectors are referred to as first-order statistics.

When a StatServer is used to store i-vectors or super-vectors as StatServer.stat1, StatServer.stat0 contains the number of segments (also called sessions or shows) that were used to estimate the i-vector or super-vector. An advantage of this abuse of language is that a single Factor Analysis implementation can be used to train Joint Factor Analysis (JFA), Total Variability or Probabilistic Linear Discriminant Analysis (PLDA).

Attribute    Type
modelset     ndarray of strings
segset       ndarray of strings
start        ndarray of floats
stop         ndarray of floats
stat0        2D-ndarray of floats
stat1        2D-ndarray of floats

Note

The size of modelset, segset, start, stop, as well as the first dimension of stat0 and stat1 must be equal. The second dimension of stat1 must be a multiple of the second dimension of stat0 (usually, stat0.shape[1] is the number of distributions of a GMM and stat1.shape[1] is the number of distributions times the dimension of the acoustic features).
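The shape constraints above can be sketched with plain NumPy arrays. The sizes used here (3 segments, 4 distributions, 5-dimensional features) are arbitrary illustrative values, and the arrays only stand in for the attributes of a StatServer.

```python
import numpy

n_segments = 3   # number of entries in modelset / segset
n_distrib = 4    # number of distributions of the GMM (illustrative)
feat_dim = 5     # dimension of the acoustic features (illustrative)

modelset = numpy.array(["model_1", "model_2", "model_2"])
segset = numpy.array(["segment_1", "segment_2", "segment_3"])
start = numpy.empty(n_segments, dtype="|O")   # filled with None
stop = numpy.empty(n_segments, dtype="|O")    # filled with None
stat0 = numpy.zeros((n_segments, n_distrib))
stat1 = numpy.zeros((n_segments, n_distrib * feat_dim))

# First dimension must agree across all six attributes.
first_dims = {a.shape[0] for a in (modelset, segset, start, stop, stat0, stat1)}
rows_agree = len(first_dims) == 1

# Second dimension of stat1 must be a multiple of that of stat0.
is_multiple = stat1.shape[1] % stat0.shape[1] == 0
recovered_dim = stat1.shape[1] // stat0.shape[1]
```

In the GMM case the ratio of the two second dimensions gives back the feature dimension.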

Note

A StatServer is often instantiated using an IdMap.

Example

Using the previously defined IdMap, a FeaturesServer (see FeaturesServer for more details) and a Mixture, the following code initializes a StatServer and accumulates sufficient statistics.

The new StatServer satisfies:

stat_server.modelset == idmap.leftids
stat_server.segset == idmap.rightids

The statistics are also consistent with the size of the GMM.
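When checking such equalities in practice, keep in mind that == between two NumPy arrays is evaluated element by element. The sketch below, which uses stand-in arrays rather than a real StatServer and IdMap, shows one way to reduce the comparison to a single boolean.

```python
import numpy

# Stand-ins for idmap.leftids and stat_server.modelset (illustrative values).
leftids = numpy.array(["model_1", "model_2", "model_2"])
modelset = leftids.copy()

# Elementwise comparison: one boolean per entry.
elementwise = modelset == leftids

# Single boolean: same shape and same content.
all_equal = numpy.array_equal(modelset, leftids)
```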