Dear all,
On Friday, May 2nd at 2:00pm, Johanna Devaney and Michael Mandel, of
Ohio State University, will present two seminars back-to-back,
entitled "Analyzing recorded vocal performances" and "Strong models
for understanding sounds in mixtures", respectively, in ENG 2.09 (the
Engineering building) at Queen Mary University of London, Mile End
Road, London E1 4NS. Details of the talks follow.
Information on how to access the school can be found at
http://www.eecs.qmul.ac.uk/about/campus-map.php. If you are coming
from outside Queen Mary, please let me know, so that I can provide
detailed directions and make sure no-one is stuck outside the doors.
If you wish to be added to / removed from our mailing list as an
individual recipient, please send me an email and I'll be happy to do
so.
**
Speaker 1: Johanna Devaney
Title: Analyzing Recorded Vocal Performances
Abstract:
A musical performance can both convey the musicians' interpretation of
the written musical score and emphasize, or even manipulate,
the emotional content of the music through small variations in timing,
dynamics, tuning, and timbre. This talk presents my work on
score-guided automatic musical performance analysis, as well as my
investigations into vocal intonation practices. The score-audio
alignment algorithm I developed to estimate note locations makes use
of a hybrid DTW-HMM multi-pass approach that is able to capture onset
and offset asynchronies between simultaneously notated chords in
polyphonic music. My work on vocal intonation practices has examined
both solo and ensemble singing, with a particular focus on the effects of
musical training, the presence and/or type of accompaniment, and the
organization of musical materials on intonation.
Bio:
Johanna Devaney is an assistant professor of music theory and
cognition at The Ohio State University. Her research applies a range
of interdisciplinary approaches to the study of musical performance,
motivated by a desire to understand how performers mediate listeners'
experience of music. Her work on extracting and analyzing performance
data, with a particular focus on intonation in the singing voice,
integrates the fields of music theory, music perception and cognition,
signal processing, and machine learning. She has released a number of
the tools she has developed in the open-source Automatic Music
Performance Analysis and Comparison Toolkit (www.ampact.org). Johanna completed
her PhD at the Schulich School of Music of McGill University. She also
holds an M.Phil. degree from Columbia University, as well as an MA
from York University in Toronto. Before working at Ohio State, she was
a postdoctoral scholar at the Center for New Music and Audio
Technologies (CNMAT) at the University of California, Berkeley.
**
Speaker 2: Michael Mandel
Title: Strong models for understanding sounds in mixtures
Abstract:
Human abilities to understand sounds in mixtures, for example, speech
in noise, far outstrip current automatic approaches, despite recent
technological breakthroughs. This talk presents two projects that use
strong models of speech to begin to close this gap and discusses their
implications for musical applications. The first project investigates
the human ability to understand speech in noise using a new
data-driven paradigm. By formulating intelligibility prediction as a
classification problem, the model is able to learn the important
spectro-temporal features of speech utterances from the results of
listening tests using real speech. It is also able to successfully
generalize to new recordings of the same and similar words. The second
project aims to reconstruct damaged or obscured speech similarly to
the way humans might, by using a strong prior model. In this case, the
prior model is a full large-vocabulary continuous speech recognizer.
Posed as an optimization problem, this system finds the latent clean
speech features that minimize a combination of the distance to the
reliable regions of the noisy observation and the negative log
likelihood under the recognizer. It reduces both speech recognition
errors and the distance between the estimated speech and the original
clean speech.
Bio:
Michael I. Mandel earned his BSc in Computer Science from the
Massachusetts Institute of Technology in 2004 and his MS and PhD with
distinction in Electrical Engineering from Columbia University in 2006
and 2010 as a Fu Foundation School of Engineering and Applied Sciences
Presidential Scholar. From 2009 to 2010 he was an FQRNT Postdoctoral
Research Fellow in the Machine Learning laboratory at the Université
de Montréal. From 2010 to 2012 he was an Algorithm Developer at
Audience Inc, a company that has shipped over 350 million noise
suppression chips for cell phones. He is currently a Research
Scientist in Computer Science and Engineering at the Ohio State
University, where he recently received an Outstanding Undergraduate
Research Mentor award. His research applies signal processing and
machine learning to computational audition problems including source
separation, robust speech recognition, and music classification and
tagging.
Other upcoming C4DM Seminars:
Richard Foss (Rhodes University), Thursday 1 May 2014, 2:00pm ("The
delights and dilemmas associated with sending audio over networks")
Matt McVicar (AIST Japan), Monday 12 May 2014, 3:30pm ("Towards the
automatic transcription of lyrics from audio")
Paul Weir (Aardvark Swift Recruitment, Audio director of Soho
Productions), Wednesday 21 May 2014, 3:00pm