Segmentation¶

segmentation.adjust(cep, diarization)[source]¶

Moves the border of segment of diarization into lowest energy region and split segments gretter than 30s

Todo:	changes numpy.convolve to the panada version
Parameters:	cep – a numpy.ndarray containing MFCC diarization – a Diarization object
Returns:	a Diar object

segmentation.bic_linear(cep, diarization, alpha, sr=False)[source]¶

This segmentation over the signal fuses consecutive segments of the same speaker from the start to the end of the record. The measure employs the $\Delta BIC$ based on Bayesian Information Criterion , using full covariance Gaussians (see gauss.GaussFull), as defined in equation below.

$\Delta BIC_{i,j} = PBIC_{i+j} - PBIC_{i} - PBIC_{j} - P$

$PBIC_{x} = \frac{n_x}{2} \log|\Sigma_x|$

$cst = \frac{1}{2} \alpha \left(d + \frac{d(d+1)}{2}\right)$

$P = cst + log(n_i+n_j)$

where $|\Sigma_i|$ , $|\Sigma_j|$ and $|\Sigma|$ are the determinants of gaussians associated to the left and right segmnents $i$ , $j$ and $i+j$ . $\alpha$ is a parameter to set up. The penalty factor $P$ depends on $d$ , the dimension of the cep, as well as on $n_i$ and $n_j$ , refering to the total length of left segment $i$ and right segment $j$ respectively.

if sr is True, BIC distance is replace by the square root bic (see clustering.hac_utils.bic_square_root())

Parameters:	cep – numpy.ndarray diarization – a Diarization object alpha – the threshold sr – boolean
Returns:	a Diar object

segmentation.div_gauss(cep, show='empty', win=250, shift=0)[source]¶

Segmentation based on divergence gaussien.

The segmentation detects the instantaneous change points corresponding to segment boundaries. The proposed algorithm is based on the detection of local maxima. It detects the change points through a gaussian divergence (see equation below), computed using Gaussians with diagonal covariance matrices. The left and right gaussians are estimated over a five-second window sliding along the whole signal (2.5 seconds for each gaussian, given win =250 features). A change point, i.e. a segment boundary, is present in the middle of the window when the gaussian diverence score reaches a local maximum.

$GD(s_l,s_r)=(\mu_r-\mu_l)^t\Sigma_l^{-1/2}\Sigma_r^{-1/2}(\mu_r-\mu_l)$

where $s_l$ is the left segment modeled by the mean $\mu_l$ and the diagonal covariance matrix $\Sigma_l$ , $s_l$ is the right segment modeled by the mean $\mu_r$ and the diagonal covariance matrix $\Sigma_r$ .

Parameters:	cep – numpy array of frames show – speaker of the show win – windows size in number of frames
Returns:	a diarization object (s4d annotation)

segmentation.init_seg(cep, show='empty', cluster='init')[source]¶

Return a initial segmentation composed of one segment from the first to the last feature in cep.

Parameters:	cep – numpy.ndarry containing MFCC show – the speaker of the cep cluster – str
Returns:	a Diar object

segmentation.sanity_check(cep, show, cluster='init')[source]¶

Removes equal MFCC of cep and return a diarization.

Parameters:	cep – numpy.ndarry containing MFCC show – speaker of the show
Returns:	a dirization object

Previous topic

Next topic

This Page

Segmentation¶