from s4d.diar import Diar, Segment
Class Diar¶
Diar is a class describing an audio/video diarization file. The diarization file is the most important file in the S4D toolkit. All programs are driven by a diarization file, and most of them generate a diarization file (trainers are the exception: they generate models).
To get an instance of Diar:
diar = Diar()
Storage¶
Diar stores a list of segments (Diar.segments) together with the list of the segments' attribute names (Diar.attr_names).
A segment is a list composed of n attributes. The attribute at position i is named by Diar.attr_names[i]. Attributes can be added or removed. The basic segment is composed of:
* field 0: the name of the show,
* field 1: the label of the segment, e.g. the name of the speaker,
* field 2: the label type in ['speaker', 'head'] (the available types are stored in the list Diar.type_labels),
* field 3: the start, corresponding to a feature (a time in centiseconds),
* field 4: the stop, corresponding to a feature (a time in centiseconds).
A segment is a portion of a show (an audio or video file) annotated with a label. It is defined as a kind of slice running from start through end - 1. The unit is given by the feature frame rate. A diarization can draw data from several shows, which is very useful in batch mode (model training, computing log-likelihood ratios, cross-show diarization, etc.).
print(diar)
[
attribut definition : ['show', 'label', 'label_type', 'start', 'stop']
]
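As a rough illustration of the storage layout described in this section, the sketch below models a segment as a plain Python list whose values are named through a separate list of attribute names. The helper get is hypothetical and is not part of S4D; the conversion to seconds assumes one feature per centisecond, as stated for the start and stop fields.

```python
# Hypothetical stand-in for the storage layout; this is not S4D code.
attr_names = ['show', 'label', 'label_type', 'start', 'stop']

# One segment is a plain list; the value at position i is named attr_names[i].
segment = ['foo', 'name', 'speaker', 300, 400]


def get(seg, name):
    """Look up a segment value by attribute name (illustrative helper)."""
    return seg[attr_names.index(name)]


start, stop = get(segment, 'start'), get(segment, 'stop')

# The segment covers features start .. stop - 1, like a Python slice.
covered = range(start, stop)
n_features = len(covered)

# Assumption: one feature every centisecond, i.e. 100 features per second.
duration_s = n_features / 100.0

print(get(segment, 'label'), covered[0], covered[-1], n_features, duration_s)
```

Treating stop as exclusive means that two adjacent segments, such as 0 to 100 and 100 to 200, share no feature.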
Add segments¶
There are 4 methods to add a segment to a Diar:
* Diar.append takes the named arguments available in Diar.attr_names,
* Diar.insert takes the named arguments available in Diar.attr_names and inserts a segment at a given position,
* Diar.append_seg takes a Segment instance,
* Diar.append_diar copies the list of segments given as argument.
The example below shows how to append 6 segments to diar:
diar.append(show='foo', label='name', start=0, stop=100)
diar.append(show='foo', label='name', start=100, stop=200)
diar.append(show='foo', label='name', start=300, stop=400)
diar.append(show='foo', label='name', start=350, stop=450)
diar.append(show='foo', label='name', start=310, stop=320)
diar.append(show='foo', label='name', start=470, stop=500)
print(diar)
[
attribut definition : ['show', 'label', 'label_type', 'start', 'stop']
row 0: ['foo', 'name', 'speaker', 0, 100]
row 1: ['foo', 'name', 'speaker', 100, 200]
row 2: ['foo', 'name', 'speaker', 300, 400]
row 3: ['foo', 'name', 'speaker', 350, 450]
row 4: ['foo', 'name', 'speaker', 310, 320]
row 5: ['foo', 'name', 'speaker', 470, 500]
]
Get and set segment¶
import copy
diar[0]
seg = copy.deepcopy(diar[0])
seg['label'] = 'name2'
seg['stop'] = 200
diar[1] = seg
print(diar)
diar.rename('label', ['name'], 'name1')
print(diar)
[
attribut definition : ['show', 'label', 'label_type', 'start', 'stop']
row 0: ['foo', 'name', 'speaker', 0, 100]
row 1: ['foo', 'name2', 'speaker', 0, 200]
row 2: ['foo', 'name', 'speaker', 300, 400]
row 3: ['foo', 'name', 'speaker', 350, 450]
row 4: ['foo', 'name', 'speaker', 310, 320]
row 5: ['foo', 'name', 'speaker', 470, 500]
]
[
attribut definition : ['show', 'label', 'label_type', 'start', 'stop']
row 0: ['foo', 'name1', 'speaker', 0, 100]
row 1: ['foo', 'name2', 'speaker', 0, 200]
row 2: ['foo', 'name1', 'speaker', 300, 400]
row 3: ['foo', 'name1', 'speaker', 350, 450]
row 4: ['foo', 'name1', 'speaker', 310, 320]
row 5: ['foo', 'name1', 'speaker', 470, 500]
]
Attributs¶
Attributes can be added or deleted. To add an attribute named gender to each segment and initialize its value to unk, use Diar.add_attribut.
diar.add_attribut('gender', 'unk')
print(diar)
[
attribut definition : ['show', 'label', 'label_type', 'start', 'stop', 'gender']
row 0: ['foo', 'name1', 'speaker', 0, 100, 'unk']
row 1: ['foo', 'name2', 'speaker', 0, 200, 'unk']
row 2: ['foo', 'name1', 'speaker', 300, 400, 'unk']
row 3: ['foo', 'name1', 'speaker', 350, 450, 'unk']
row 4: ['foo', 'name1', 'speaker', 310, 320, 'unk']
row 5: ['foo', 'name1', 'speaker', 470, 500, 'unk']
]
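The effect of adding an attribute can be pictured on a plain table: the new name is appended to the list of attribute names, and every row receives the default value at the same position. The sketch below is only a model of that idea; add_column and del_column are hypothetical helpers, not S4D functions.

```python
# Hypothetical model of a diarization table; names are illustrative only.
attr_names = ['show', 'label', 'label_type', 'start', 'stop']
segments = [
    ['foo', 'name1', 'speaker', 0, 100],
    ['foo', 'name2', 'speaker', 0, 200],
]


def add_column(names, rows, new_name, default=None):
    """Append a new attribute name and give every row the default value."""
    if new_name in names:
        raise ValueError('attribute already exists: ' + new_name)
    names.append(new_name)
    for row in rows:
        row.append(default)


def del_column(names, rows, name):
    """Remove an attribute name and the matching value from every row."""
    position = names.index(name)
    del names[position]
    for row in rows:
        del row[position]


add_column(attr_names, segments, 'gender', 'unk')
print(attr_names)
print(segments)
```

Keeping names and values in step in this way is what lets a value be found by name in every row.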
Extend or shorten segments¶
- Merge consecutive segments with the same label:
import copy
diar_pack = copy.deepcopy(diar)
diar_pack.pack()
print(diar_pack)
[
attribut definition : ['show', 'label', 'label_type', 'start', 'stop', 'gender']
row 0: ['foo', 'name1', 'speaker', 0, 100, 'unk']
row 1: ['foo', 'name2', 'speaker', 0, 200, 'unk']
row 2: ['foo', 'name1', 'speaker', 300, 320, 'unk']
row 3: ['foo', 'name1', 'speaker', 350, 450, 'unk']
row 4: ['foo', 'name1', 'speaker', 470, 500, 'unk']
]
- Remove small gaps (< epsilon) between consecutive segments and merge them:
diar_pack.pack(epsilon=20)
print(diar_pack)
[
attribut definition : ['show', 'label', 'label_type', 'start', 'stop', 'gender']
row 0: ['foo', 'name1', 'speaker', 0, 100, 'unk']
row 1: ['foo', 'name2', 'speaker', 0, 200, 'unk']
row 2: ['foo', 'name1', 'speaker', 300, 320, 'unk']
row 3: ['foo', 'name1', 'speaker', 350, 500, 'unk']
]
- Add epsilon to the start and stop of each segment:
diar_pad = copy.deepcopy(diar)
print(diar_pad)
diar_pad.pad(epsilon=20)
print(diar_pad)
[
attribut definition : ['show', 'label', 'label_type', 'start', 'stop', 'gender']
row 0: ['foo', 'name1', 'speaker', 0, 100, 'unk']
row 1: ['foo', 'name2', 'speaker', 0, 200, 'unk']
row 2: ['foo', 'name1', 'speaker', 300, 400, 'unk']
row 3: ['foo', 'name1', 'speaker', 350, 450, 'unk']
row 4: ['foo', 'name1', 'speaker', 310, 320, 'unk']
row 5: ['foo', 'name1', 'speaker', 470, 500, 'unk']
]
[
attribut definition : ['show', 'label', 'label_type', 'start', 'stop', 'gender']
row 0: ['foo', 'name1', 'speaker', 0, -10, 'unk']
row 1: ['foo', 'name2', 'speaker', -10, 220, 'unk']
row 2: ['foo', 'name1', 'speaker', 280, 300, 'unk']
row 3: ['foo', 'name1', 'speaker', 300, 340, 'unk']
row 4: ['foo', 'name1', 'speaker', 340, 460, 'unk']
row 5: ['foo', 'name1', 'speaker', 470, 500, 'unk']
]
- Apply a collar to each segment.
diar_col = copy.deepcopy(diar)
diar_col.pad(epsilon=20)
print(diar_col)
[
attribut definition : ['show', 'label', 'label_type', 'start', 'stop', 'gender']
row 0: ['foo', 'name1', 'speaker', 0, -10, 'unk']
row 1: ['foo', 'name2', 'speaker', -10, 220, 'unk']
row 2: ['foo', 'name1', 'speaker', 280, 300, 'unk']
row 3: ['foo', 'name1', 'speaker', 300, 340, 'unk']
row 4: ['foo', 'name1', 'speaker', 340, 460, 'unk']
row 5: ['foo', 'name1', 'speaker', 470, 500, 'unk']
]
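The merging idea behind pack can be sketched on bare (label, start, stop) tuples. This is a simplified illustration under stated assumptions, not S4D's implementation: the function name, the tuple layout, the choice to sort by start, and the rule of keeping the larger stop are all assumptions, and S4D may treat overlapping segments differently.

```python
def pack(intervals, epsilon=0):
    """Merge consecutive (label, start, stop) tuples that share a label
    and are separated by a gap of at most epsilon."""
    merged = []
    for label, start, stop in sorted(intervals, key=lambda t: (t[1], t[2])):
        if merged:
            last_label, last_start, last_stop = merged[-1]
            if label == last_label and start - last_stop <= epsilon:
                # Extend the previous interval instead of adding a new one.
                merged[-1] = (label, last_start, max(last_stop, stop))
                continue
        merged.append((label, start, stop))
    return merged


# Non-overlapping toy data, chosen to keep the example unambiguous.
segs = [
    ('name', 0, 100),
    ('name', 100, 200),
    ('name', 300, 400),
    ('name', 470, 500),
    ('other', 500, 600),
]

print(pack(segs))               # only touching intervals are merged
print(pack(segs, epsilon=100))  # gaps of up to 100 are bridged as well
```

With epsilon=0 only segments that touch are merged; a larger epsilon also bridges short silences between turns of the same speaker.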
Read and write¶
seg_diar = Diar.read_seg('data/ref/20041219_1300_1314_RTM_ELDA.seg') # LIUM format
mdtm_diar = Diar.read_mdtm('data/ref/20041219_1300_1314_RTM_ELDA.mdtm') # MDTM format
rttm_diar = Diar.read_rttm('data/ref/20041219_1300_1314_RTM_ELDA.rttm') # RTTM format
uem_diar = Diar.read_uem('data/ref/20041219_1300_1314_RTM_ELDA.uem') # UEM format
Diar.write_seg('data/out/20041223_1300_1318_RTM_ELDA.out.seg', seg_diar)
Diar.write_seg('data/out/20041223_1300_1318_RTM_ELDA.mdtm.seg', mdtm_diar)
Diar.write_seg('data/out/20041223_1300_1318_RTM_ELDA.rttm.seg', rttm_diar)
Diar.write_seg('data/out/20041223_1300_1318_RTM_ELDA.uem.seg', uem_diar)
Link with sidekit.FeatureServer¶
- Convert a segmentation into an id_map for sidekit.StatServer:
from sidekit.statserver import StatServer
id_map = seg_diar.id_map()
stat_ser = StatServer(id_map)
print(id_map)
<sidekit.bosaris.idmap.IdMap object at 0x105dcab70>
Data extraction¶
- Get the unique values of an attribute:
label_list = seg_diar.unique('label')
print(label_list)
['20041219_1300_1314_RTM_ELDA_speaker#1', 'Amal', 'Driss_Abbadi', 'Mustapha_Lakhsem', 'Najib_Kettani']
- Get a new Diar according to a comparison expression:
label_filter = seg_diar.filter('label', '==', 'Samira')
print(label_filter)
time_filter = seg_diar.filter('start', '>', 100000)
print(time_filter)
label_filter = seg_diar.filter('label', 'in', ['Samira', '20041223_1300_1318_RTM_ELDA_speaker#1_20041223-1300-1318'])
print(label_filter)
[
attribut definition : ['show', 'label', 'label_type', 'start', 'stop', 'gender', 'env', 'channel']
]
[
attribut definition : ['show', 'label', 'label_type', 'start', 'stop', 'gender', 'env', 'channel']
]
[
attribut definition : ['show', 'label', 'label_type', 'start', 'stop', 'gender', 'env', 'channel']
]
- Create a segment index. The index is an implementation of Perl's autovivification feature.
idx = seg_diar.make_index(['show', 'label'])
for show in idx:
    print(show)
    for label in idx[show]:
        ch = label+': '
        for seg in idx[show][label]:
            ch += ' '+str(seg['start'])
        print(ch)
20041219_1300_1314_RTM_ELDA
20041219_1300_1314_RTM_ELDA_speaker#1: 0 1237
Amal: 2224 7479 22955 35901 36886 53636 58317 58870 61858 63205 64099 65762 75879 76518 81080 87085
Driss_Abbadi: 16049
Mustapha_Lakhsem: 60910 62856 63472
Najib_Kettani: 30693
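The filtering and indexing shown in this section can be modelled in plain Python. The sketch below is hypothetical and does not reproduce S4D's code: rows are simple dictionaries, the operator table OPS is an assumed set of comparison strings, and the autovivifying index is built with a recursive defaultdict.

```python
import operator
from collections import defaultdict

# Hypothetical rows standing in for segments; not real S4D objects.
rows = [
    {'show': 'foo', 'label': 'spk1', 'start': 0, 'stop': 100},
    {'show': 'foo', 'label': 'spk2', 'start': 100, 'stop': 250},
    {'show': 'foo', 'label': 'spk1', 'start': 250, 'stop': 400},
    {'show': 'bar', 'label': 'spk1', 'start': 0, 'stop': 50},
]

# Assumed mapping from comparison strings to functions.
OPS = {
    '==': operator.eq, '!=': operator.ne,
    '<': operator.lt, '<=': operator.le,
    '>': operator.gt, '>=': operator.ge,
    'in': lambda value, container: value in container,
}


def filter_rows(rows, attr, op, value):
    """Keep the rows for which `row[attr] op value` holds."""
    return [row for row in rows if OPS[op](row[attr], value)]


def autoviv():
    """A dict whose missing keys spring into existence as nested dicts."""
    return defaultdict(autoviv)


def make_index(rows, keys):
    """Group rows in nested dicts, one level per attribute name in keys."""
    index = autoviv()
    for row in rows:
        node = index
        for key in keys[:-1]:
            node = node[row[key]]  # created on first access
        node.setdefault(row[keys[-1]], []).append(row)
    return index


late = filter_rows(rows, 'start', '>', 100)
idx = make_index(rows, ['show', 'label'])
for show in idx:
    for label in idx[show]:
        print(show, label, [seg['start'] for seg in idx[show][label]])
```

Autovivification removes the need to check whether idx[show] exists before writing idx[show][label], which keeps the grouping loop short.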
Class: Segment¶
Segment implements class methods: intersection, union, diff, gap.
print(diar)
seg0 = diar[0]
seg1 = diar[1]
print('intersection 0 and 1: ',Segment.intersection(seg0, seg1))
print('intersection 2 and 3: ', Segment.intersection(diar[2], diar[3]))
print('diff 0 and 1: ',Segment.diff(seg0, seg1))
print('diff 2 and 3: ', Segment.diff(diar[2], diar[3]))
print('union 0 and 1: ',Segment.union(seg0, seg1))
print('union 2 and 3: ', Segment.union(diar[2], diar[3]))
print('gap 0 and 1: ',Segment.gap(seg0, seg1))
print('gap 2 and 3: ', Segment.gap(diar[2], diar[3]))
[
attribut definition : ['show', 'label', 'label_type', 'start', 'stop', 'gender']
row 0: ['foo', 'name1', 'speaker', 0, 100, 'unk']
row 1: ['foo', 'name2', 'speaker', 0, 200, 'unk']
row 2: ['foo', 'name1', 'speaker', 300, 400, 'unk']
row 3: ['foo', 'name1', 'speaker', 350, 450, 'unk']
row 4: ['foo', 'name1', 'speaker', 310, 320, 'unk']
row 5: ['foo', 'name1', 'speaker', 470, 500, 'unk']
]
intersection 0 and 1: ['foo', 'name1 / name2', 'speaker', 0, 100, 'unk']
intersection 2 and 3: ['foo', 'name1 / name1', 'speaker', 350, 400, 'unk']
diff 0 and 1: ([['foo', 'name1', 'speaker', 100, 200, 'unk']], [2])
diff 2 and 3: ([['foo', 'name1', 'speaker', 300, 350, 'unk'], ['foo', 'name1', 'speaker', 400, 450, 'unk']], [1, 2])
union 0 and 1: ['foo', 'name1', 'speaker', 0, 200, 'unk']
union 2 and 3: ['foo', 'name1', 'speaker', 300, 450, 'unk']
gap 0 and 1: ['foo', 'name1', 'speaker', 100, 0, 'unk']
gap 2 and 3: ['foo', 'name1', 'speaker', 400, 350, 'unk']
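The start and stop values printed above are consistent with simple interval arithmetic. The sketch below works on bare (start, stop) pairs and is only an illustration: the functions are hypothetical, they ignore labels and other attributes, and diff assumes that the two intervals overlap.

```python
def intersection(a, b):
    """Overlap of two (start, stop) intervals, or None if they are disjoint."""
    start, stop = max(a[0], b[0]), min(a[1], b[1])
    return (start, stop) if start < stop else None


def union(a, b):
    """Smallest interval covering both a and b."""
    return (min(a[0], b[0]), max(a[1], b[1]))


def diff(a, b):
    """Parts covered by exactly one of two overlapping intervals."""
    low = sorted((a[0], b[0]))
    high = sorted((a[1], b[1]))
    parts = [(low[0], low[1]), (high[0], high[1])]
    return [part for part in parts if part[0] < part[1]]


def gap(a, b):
    """Interval from the end of a to the start of b.

    When a and b overlap, start is greater than stop."""
    return (a[1], b[0])


seg0, seg1 = (0, 100), (0, 200)
seg2, seg3 = (300, 400), (350, 450)

print(intersection(seg0, seg1), intersection(seg2, seg3))
print(diff(seg0, seg1), diff(seg2, seg3))
print(union(seg0, seg1), union(seg2, seg3))
print(gap(seg0, seg1), gap(seg2, seg3))
```

A gap whose start exceeds its stop, as in both examples here, signals that the two segments overlap rather than being separated.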