Segmentation

segmentation.adjust(cep, diarization)[source]

Moves the border of segment of diarization into lowest energy region and split segments gretter than 30s

Todo:

changes numpy.convolve to the panada version

Parameters:
  • cep – a numpy.ndarray containing MFCC
  • diarization – a Diarization object
Returns:

a Diar object

segmentation.bic_linear(cep, diarization, alpha, sr=False)[source]

This segmentation over the signal fuses consecutive segments of the same speaker from the start to the end of the record. The measure employs the \Delta BIC based on Bayesian Information Criterion , using full covariance Gaussians (see gauss.GaussFull), as defined in equation below.

\Delta BIC_{i,j} = PBIC_{i+j} - PBIC_{i} - PBIC_{j} -  P

PBIC_{x}  = \frac{n_x}{2} \log|\Sigma_x|

cst  = \frac{1}{2} \alpha \left(d + \frac{d(d+1)}{2}\right)

P  = cst + log(n_i+n_j)

where |\Sigma_i|, |\Sigma_j| and |\Sigma| are the determinants of gaussians associated to the left and right segmnents i, j and i+j. \alpha is a parameter to set up. The penalty factor P depends on d, the dimension of the cep, as well as on n_i and n_j, refering to the total length of left segment i and right segment j respectively.

if sr is True, BIC distance is replace by the square root bic (see clustering.hac_utils.bic_square_root())

Parameters:
  • cep – numpy.ndarray
  • diarization – a Diarization object
  • alpha – the threshold
  • sr – boolean
Returns:

a Diar object

segmentation.div_gauss(cep, show='empty', win=250, shift=0)[source]

Segmentation based on divergence gaussien.

The segmentation detects the instantaneous change points corresponding to segment boundaries. The proposed algorithm is based on the detection of local maxima. It detects the change points through a gaussian divergence (see equation below), computed using Gaussians with diagonal covariance matrices. The left and right gaussians are estimated over a five-second window sliding along the whole signal (2.5 seconds for each gaussian, given win =250 features). A change point, i.e. a segment boundary, is present in the middle of the window when the gaussian diverence score reaches a local maximum.

GD(s_l,s_r)=(\mu_r-\mu_l)^t\Sigma_l^{-1/2}\Sigma_r^{-1/2}(\mu_r-\mu_l)

where s_l is the left segment modeled by the mean \mu_l and the diagonal covariance matrix \Sigma_l, s_l is the right segment modeled by the mean \mu_r and the diagonal covariance matrix \Sigma_r.

Parameters:
  • cep – numpy array of frames
  • show – speaker of the show
  • win – windows size in number of frames
Returns:

a diarization object (s4d annotation)

segmentation.init_seg(cep, show='empty', cluster='init')[source]

Return a initial segmentation composed of one segment from the first to the last feature in cep.

Parameters:
  • cep – numpy.ndarry containing MFCC
  • show – the speaker of the cep
  • cluster – str
Returns:

a Diar object

segmentation.sanity_check(cep, show, cluster='init')[source]

Removes equal MFCC of cep and return a diarization.

Parameters:
  • cep – numpy.ndarry containing MFCC
  • show – speaker of the show
Returns:

a dirization object