Supplementary audio examples for the speech separation and recognition challenge, accompanying "Joint Speaker Identification and Speech Separation for Single-Channel Speech Separation Challenge," in IEEE Trans.

P. Mowlaee, R. Saeidi, M. G. Christensen, Z.-H. Tan, T. Kinnunen, P. Fränti, S. H. Jensen

On this page you will find the separated wave files obtained with the proposed method and with other benchmark methods on the test data of the speech separation and recognition challenge [1]. The routines written for conducting the listening tests, covering the MUSHRA and speech intelligibility tests described in [7], are available at [Matlabcodes].

Table I. Labels of the methods used in the multi-stimulus test with hidden reference and anchors (MUSHRA) [5] and in the intelligibility tests [6].

Mixture No. | Speakers' identities | Target (Ref.) | Masker (Ref.) | Target [2] | Masker [2] | Target [3] | Masker [3] | Target [4] | Masker [4]
Mixture 1 (Same talker) | Spkr 33 + Spkr 33 | wave | wave | wave | wave | wave | wave | wave | wave
Mixture 2 (Same talker) | Spkr 5 + Spkr 5 | wave | wave | wave | wave | wave | wave | wave | wave
Mixture 3 (Different gender) | Spkr 14 + Spkr 22 | wave | wave | wave | wave | wave | wave | wave | wave
Mixture 4 (Same gender) | Spkr 6 + Spkr 30 | wave | wave | wave | wave | wave | wave | wave | wave

References

[1] M. Cooke, J.R. Hershey, and S.J. Rennie, “Monaural speech separation and recognition challenge,” Elsevier Computer Speech and Language, vol. 24, no. 1, pp. 1–15, 2010.
[2] J. R. Hershey, S. J. Rennie, P. A. Olsen, and T. T. Kristjansson, “Superhuman multi-talker speech recognition: A graphical modeling approach,” Elsevier Computer Speech and Language, vol. 24, no. 1, pp. 45–66, Jan 2010.
[3] R. J. Weiss and D. P. W. Ellis, “Speech separation using speaker-adapted eigenvoice speech models,” Elsevier Computer Speech and Language, vol. 24, no. 1, pp. 16–29, 2010.
[4] P. Mowlaee, R. Saeidi, Z.-H. Tan, M. G. Christensen, T. Kinnunen, P. Fränti, and S. H. Jensen, “A Joint Approach for single-channel Speaker Identification and Speech Separation,” IEEE Transactions on Audio, Speech and Language Processing, vol. 20, no. 9, pp. 2586–2601, 2012. [PDF]
[5] “Method for the subjective assessment of intermediate quality level of coding systems,” ITU-R BS.1534-1, 2003.
[6] J. Barker and M. Cooke, “Modelling speaker intelligibility in noise,” Speech Communication, vol. 49, no. 5, pp. 402–417, 2007.
[7] P. Mowlaee, R. Saeidi, M. G. Christensen, and R. Martin, “Subjective and Objective Quality Assessment of Single-channel Speech Separation Algorithms,” Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 69–72, March 2012.
