Vocapia Research SAS

Speaker Diarization System for Lecture Data

Time to read

23 mins

Language

English

Summary

This research article presents the LIMSI speaker diarization system for lecture data, developed for the Rich Transcription 2006 Spring (RT-06S) meeting recognition evaluation. The system builds on a baseline diarization system originally designed for broadcast news data, which combines agglomerative clustering based on the Bayesian information criterion (BIC) with speaker identification techniques. The article describes the difficulties of adapting this baseline to lecture data, in particular the high missed speech error rate it produced. To reduce that error, a new speech activity detection (SAD) approach based on the log-likelihood ratio was explored. The paper also outlines the methods used in the system, including feature extraction, initial segmentation, and clustering. In the reported experiments, the adapted system achieved a diarization error of 20.2% on the RT-06S Multiple Distant Microphone data, suggesting that the modifications made for lecture settings were effective.
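To make the BIC-based merging step mentioned in the summary more concrete, here is a minimal sketch of a delta-BIC merge test between two clusters of feature frames. It is not the LIMSI implementation: the function names `log_det_diag` and `delta_bic`, the `penalty_weight` parameter, and the choice of a single diagonal-covariance Gaussian per cluster are all assumptions made for illustration. Under this convention, a negative value favors merging the two clusters.

```python
import math


def log_det_diag(frames):
    """Log-determinant of the ML diagonal covariance of a list of frames.

    Each frame is a list of floats; a small variance floor avoids log(0).
    """
    n = len(frames)
    dim = len(frames[0])
    total = 0.0
    for d in range(dim):
        column = [frame[d] for frame in frames]
        mean = sum(column) / n
        var = sum((x - mean) ** 2 for x in column) / n
        total += math.log(max(var, 1e-10))
    return total


def delta_bic(a, b, penalty_weight=1.0):
    """Delta-BIC for merging clusters a and b (hypothetical helper).

    Assumes one diagonal-covariance Gaussian per cluster. A negative
    result means one shared model explains the data well enough to merge.
    """
    n_a, n_b = len(a), len(b)
    n = n_a + n_b
    dim = len(a[0])
    # Extra parameters needed by two models instead of one:
    # one mean and one variance per dimension.
    n_params = 2 * dim
    penalty = 0.5 * penalty_weight * n_params * math.log(n)
    # Likelihood gain from modeling the two clusters separately.
    gain = 0.5 * (
        n * log_det_diag(a + b)
        - n_a * log_det_diag(a)
        - n_b * log_det_diag(b)
    )
    return gain - penalty


# Toy one-dimensional "features": two similar clusters and one distant one.
same_1 = [[-1.0], [0.0], [1.0], [-1.0], [0.0], [1.0]]
same_2 = [[-1.0], [0.0], [1.0], [-1.0], [0.0], [1.0]]
other = [[9.0], [10.0], [11.0], [9.0], [10.0], [11.0]]

merge_similar = delta_bic(same_1, same_2) < 0
merge_distant = delta_bic(same_1, other) < 0
```

In an agglomerative loop, a test of this kind would typically be evaluated for every cluster pair, merging the pair with the lowest value and stopping once no pair scores below zero.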
