Unsupervised learning of temporal and spectral features of speech and music


Summary:

In this project we will explore algorithms for unsupervised learning of slowly varying features of speech and music signals. In particular, we will consider the estimation and tracking of the fundamental frequency of single and multiple speakers and of musical instruments, which is useful for signal classification and separation. Based on these features, the separation of acoustic sources will be investigated. Considerable effort will be spent on developing low-latency implementations of such algorithms.
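
As an illustration of the kind of feature meant here, the following is a minimal sketch of fundamental frequency estimation for a single frame of a single source. It uses a plain autocorrelation peak search, which is a common textbook baseline and not necessarily the method the project will adopt. The function name estimate_f0, the search range of 60 to 500 Hz, and the frame-based interface are assumptions made for this example only.

```python
import math


def estimate_f0(frame, sample_rate, f_min=60.0, f_max=500.0):
    """Estimate the fundamental frequency (Hz) of one frame by autocorrelation.

    Illustrative baseline only; f_min and f_max are assumed search limits.
    Returns None if the frame is silent or shows no positive correlation peak.
    """
    n = len(frame)
    mean = sum(frame) / n
    x = [s - mean for s in frame]  # remove the DC offset

    energy = sum(s * s for s in x)
    if energy == 0.0:
        return None  # silent frame, no pitch to report

    lag_min = max(1, int(sample_rate / f_max))      # shortest period searched
    lag_max = min(n - 1, int(sample_rate / f_min))  # longest period searched

    best_lag, best_score = None, 0.0
    for lag in range(lag_min, lag_max + 1):
        # Biased autocorrelation at this lag, normalised by the frame energy.
        r = sum(x[i] * x[i + lag] for i in range(n - lag))
        score = r / energy
        if score > best_score:
            best_lag, best_score = lag, score

    if best_lag is None:
        return None
    return sample_rate / best_lag
```

Calling estimate_f0 on successive short frames would give a crude pitch track. The frame length then bounds the latency, since a frame must span at least one period of the lowest frequency searched. Handling multiple simultaneous speakers or instruments requires more than this single-peak search.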

References:

Bell, A.J. and Sejnowski, T.J. (1995): "An information-maximization approach to blind separation and blind deconvolution". Neural Computation 7, 1129-1159.

Vapnik, V. (1995): The nature of statistical learning theory. Springer-Verlag.

Host Lab: RUB