Based on
our discussions, here are the tentative projects. I will be happy
to discuss alternatives, details, etc.
Groups/Projects
- Real-time pitch detection for monophonic sounds possibly
implemented as a MIDI device or VST plugin - Tony and Lacey Antoniou
- World music classification for particular regions/styles
possibly incorporating singer identification information - Dale Lyons and Onat Yazir
- Visualization,
browsing and retrieval of drum loops and possibly music based on
rhythmic patterns using the e-drum as a control interface - David Sprague and Adam Tindale
- Error detection in music performance of computer-generated
music notation at different levels - Graham
Percival and Lucas Longley
- Event detection
and visualization of trumpet sounds using a psychoacoustically
motivated approach - Aaron Hilton,
Nathan McDonald
- Rotational sensors for music composition and analysis - Ryan Willoghby
- Identification of recording style/producer from audio
signals - Randy Jones
- Music Caricatures of audio using drums and chords - Keith
Chan and Terence Nathan
Other Ideas
The
following are some representative ideas for projects for this class. Feel free to
ask me questions/propose your own projects/modify the
descriptions etc. The ordering of the projects has no
significance. A
variety of existing source code, programs, toolboxes, and datasets will
help you get
started with each project. Some of them can be found in the software
section of this website. Each project has several unique
characteristics and they differ in the skills required, type of
programming, availability of existing work and many other factors.
Feel free to contact me via email or in person to clarify any questions
you might have regarding these projects. Groups should preferably
consist of 2 or 3 partners. Exceptions to this rule are possible but not
recommended. The expectations about each project will be adjusted
according to the number of students in the group. Most projects will
consist of the following stages:
1) literature review
2) data collection
3) ground-truth annotation
4) implementation
5) debugging
6) evaluation
7) written report
- Genre classification on
MIDI data
Genre classification has mostly been explored in the audio
domain for some time. More recently algorithms for genre classification
based on statistics/features over symbolic data such as MIDI files have
appeared in the literature. The goal of this project would be to
recreate some of the existing algorithms and investigate alternatives.
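As a starting point, a very common symbolic feature is a pitch-class histogram computed over the note events of a MIDI file. The sketch below assumes the notes have already been parsed into a list of MIDI note numbers (the note list shown is a made-up toy example, not real data):

```python
from collections import Counter

def pitch_class_histogram(midi_pitches):
    """Normalized 12-bin histogram of pitch classes (C=0 ... B=11)."""
    counts = Counter(p % 12 for p in midi_pitches)
    total = sum(counts.values())
    return [counts.get(pc, 0) / total for pc in range(12)]

# Hypothetical note list: a C-major arpeggio (MIDI numbers 60, 64, 67, 72).
notes = [60, 64, 67, 60, 64, 67, 72]
hist = pitch_class_histogram(notes)
```

A classifier (nearest-neighbor, decision tree, etc.) would then be trained on such feature vectors, one per MIDI file.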
- A framework for
evaluating similarity retrieval for music
A variety of similarity-based retrieval algorithms have been
proposed for music in audio format. The only way to reliably evaluate
content-based similarity retrieval is to conduct user studies. The goal
of this project is to build a framework (possibly web-based) that would
allow different algorithms for audio similarity to be used and
evaluated by users. The main challenge would be to design the framework
to be flexible in the way the algorithms are evaluated, the similarity
measure, the presentation mode etc.
- Sensor-based MIR
One
of the less explored areas in MIR is the interface of MIR systems to
the user. As more and more music is available in portable digital music
players of various forms and sizes we should envision how MIR can be
used on these devices. This project is going to explore how sensor
technology such as piezos, knobs, sliders can be used for browsing
music collections, specifying music queries (for example tapping a
query or playing a melody), and for annotation tasks such as onset
detection and beat locations.
- Automatic beat
detection
There are many
algorithms proposed for automatic beat detection in the MIR literature
and this research actually predates MIR. However there is little
detailed experimental comparison between different approaches. A recent
pleasant exception has been the tempo inference contest at the last
ISMIR in Barcelona, 2004. In this project, students will implement 2-3
of the most well-known algorithms published in the literature, collect
ground truth and do some experimental comparisons.
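One of the simplest baselines in this family is autocorrelation of an onset-strength envelope: the lag with the strongest self-correlation in a plausible tempo range gives the beat period. A minimal sketch, using a synthetic impulse-train envelope rather than real onset data:

```python
def autocorr_tempo(onset_env, frame_rate, min_bpm=60, max_bpm=180):
    """Estimate tempo by autocorrelating a mean-removed onset-strength envelope."""
    n = len(onset_env)
    mean = sum(onset_env) / n
    x = [v - mean for v in onset_env]
    min_lag = int(frame_rate * 60 / max_bpm)   # shortest beat period considered
    max_lag = int(frame_rate * 60 / min_bpm)   # longest beat period considered
    best_lag, best_r = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        r = sum(x[i] * x[i - lag] for i in range(lag, n))
        if r > best_r:
            best_r, best_lag = r, lag
    return 60.0 * frame_rate / best_lag

# Synthetic envelope: an impulse every 50 frames at 100 frames/sec -> 120 BPM.
env = [1.0 if i % 50 == 0 else 0.0 for i in range(1000)]
bpm = autocorr_tempo(env, frame_rate=100)
```

Real systems differ mainly in how the onset envelope is computed and in how octave errors (half/double tempo) are resolved.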
- Key finding in polyphonic audio
There has been some existing work on key finding on symbolic
scores. In addition, pitch-based representations such as Chroma vectors
or Pitch Histograms have been shown to be effective for alignment,
structural analysis and classification. This project will explore the
use of pitch-based representations in order to identify the key in
polyphonic audio recordings.
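A classic approach that transfers directly from the symbolic work is Krumhansl-Schmuckler key finding: correlate the chroma vector against all 24 rotations of the major and minor key profiles and pick the best match. A sketch using the published Krumhansl-Kessler profile values (the test chroma is constructed artificially, not extracted from audio):

```python
KS_MAJOR = [6.35, 2.23, 3.48, 2.33, 4.38, 4.09, 2.52, 5.19, 2.39, 3.66, 2.29, 2.88]
KS_MINOR = [6.33, 2.68, 3.52, 5.38, 2.60, 3.53, 2.54, 4.75, 3.98, 2.69, 3.34, 3.17]

def _corr(a, b):
    """Pearson correlation between two equal-length vectors."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    num = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    da = sum((x - ma) ** 2 for x in a) ** 0.5
    db = sum((y - mb) ** 2 for y in b) ** 0.5
    return num / (da * db)

def find_key(chroma):
    """Best (tonic pitch class, mode) by Krumhansl-Schmuckler correlation."""
    best, best_r = None, float("-inf")
    for profile, mode in ((KS_MAJOR, "major"), (KS_MINOR, "minor")):
        for tonic in range(12):
            rotated = [profile[(pc - tonic) % 12] for pc in range(12)]
            r = _corr(chroma, rotated)
            if r > best_r:
                best_r, best = r, (tonic, mode)
    return best

# Artificial chroma vector that matches the G-major profile exactly (tonic = 7).
chroma = [KS_MAJOR[(pc - 7) % 12] for pc in range(12)]
```

The interesting part of the project is how robust this stays when the chroma comes from noisy polyphonic audio instead of clean note counts.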
- Query-by-humming
front-end
The first stage in a QBH system is to convert a recording of
a human singing, humming or whistling into either a pitch contour or
note sequence that can then be used to search a database of musical
pieces for a match. A large variety of pitch detection algorithms have
been proposed in literature. This project will explore different pitch
detection algorithms as well as note segmentation strategies.
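The simplest pitch detector in this family is time-domain autocorrelation: the lag that maximizes self-similarity of the waveform gives the fundamental period. A minimal sketch on a synthetic sine tone (real singing would first be windowed into short frames):

```python
import math

def detect_pitch(samples, sr, fmin=80.0, fmax=800.0):
    """Pick the autocorrelation peak lag within the singing range, return Hz."""
    min_lag = int(sr / fmax)   # shortest period considered
    max_lag = int(sr / fmin)   # longest period considered
    best_lag, best_r = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        r = sum(samples[i] * samples[i - lag] for i in range(lag, len(samples)))
        if r > best_r:
            best_r, best_lag = r, lag
    return sr / best_lag

# Synthetic 220 Hz tone sampled at 8 kHz.
sr = 8000
tone = [math.sin(2 * math.pi * 220.0 * i / sr) for i in range(2000)]
f0 = detect_pitch(tone, sr)
```

The estimate is quantized to integer lags; refinements like parabolic interpolation around the peak, or the YIN difference function, address exactly this kind of error.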
- Query-by-humming back-end
Once either a pitch contour or a series of notes have been
extracted they can be converted to some representation that can then be
used to search a database of melodies for approximate matches. In this
project some of the major approaches that have been proposed for
representing melodies and searching melodic databases will be
implemented.
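One of the oldest back-end representations is the U/D/S contour string (up/down/same between successive notes), matched with edit distance. It is attractive because it is transposition invariant, as the sketch below shows on made-up note sequences:

```python
def contour(notes):
    """U/D/S string of pitch direction between successive MIDI notes."""
    return "".join("U" if b > a else "D" if b < a else "S"
                   for a, b in zip(notes, notes[1:]))

def edit_distance(s, t):
    """Standard Levenshtein distance via dynamic programming."""
    prev = list(range(len(t) + 1))
    for i, cs in enumerate(s, 1):
        cur = [i]
        for j, ct in enumerate(t, 1):
            cur.append(min(prev[j] + 1,            # deletion
                           cur[j - 1] + 1,         # insertion
                           prev[j - 1] + (cs != ct)))  # substitution
        prev = cur
    return prev[-1]

query  = contour([60, 62, 64, 62, 60])   # hummed query (toy data)
target = contour([67, 69, 71, 69, 67])   # same shape, transposed up a fifth
```

Richer representations (exact intervals, rhythm ratios) trade this invariance against discriminative power, which is a natural axis to explore in the project.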
- ThemeFinder
In
order to search for melodic fragments in polyphonic music it is
necessary to extract the most important "themes" of a polyphonic
recording. This can be done by incorporating knowledge from voice
leading, MIDI instrument labels, amount of repetition, melodic shape
and many other factors. The goal of this project is to implement a
theme finder using both techniques described in the literature as well
as exploring alternatives.
- Structural analysis
based on similarity matrix
The similarity matrix is a visual representation that shows
the internal structure of a piece of music (chorus-verse, measures,
beats). By analyzing this representation it is possible to reconstruct
the structural form of a piece of music such as AABA.
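The matrix itself is straightforward to compute: one similarity value (cosine similarity is typical) for every pair of feature frames, so that repeated sections show up as bright off-diagonal stripes. A minimal sketch on toy feature frames:

```python
def self_similarity(features):
    """Cosine similarity between every pair of feature frames."""
    def cos(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        na = sum(x * x for x in a) ** 0.5
        nb = sum(y * y for y in b) ** 0.5
        return dot / (na * nb) if na and nb else 0.0
    return [[cos(a, b) for b in features] for a in features]

# Toy frame sequence with AABA-like structure: two distinct "textures".
A, B = [1.0, 0.0], [0.0, 1.0]
S = self_similarity([A, A, B, A])
```

With real audio the frames would be chroma or timbral feature vectors, and the analysis step searches the matrix for stripe and block patterns to recover the AABA-style form.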
- Drum pattern
similarity retrieval
Drums are part of a large number of musical pieces. There are
many software packages that provide a wide variety of drum
loops/patterns that can be used to create music. Typically these large
drum loop collections can only be browsed/searched based on filename.
The aim of this project is to explore how the actual
sound/structural similarity between drum patterns can be
exploited for finding drum loops that are "similar". Accurate drum
pattern classification/similarity can potentially lead to significant
advances in audio MIR as most of recorded music today is characterized
by drums and their patterns.
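As a baseline, once loops are transcribed onto a quantized onset grid (e.g. sixteenth notes), a simple step-agreement score already captures some structural similarity. The grids below are hypothetical, hand-written patterns, not transcriptions:

```python
def pattern_similarity(p, q):
    """Fraction of grid steps on which two onset patterns agree."""
    assert len(p) == len(q)
    return sum(a == b for a, b in zip(p, q)) / len(p)

# Hypothetical 16-step onset grids (1 = hit, 0 = rest).
rock  = [1,0,0,0, 0,0,1,0, 1,0,0,0, 0,0,1,0]
rock2 = [1,0,0,0, 0,0,1,0, 1,0,1,0, 0,0,1,0]   # rock with one extra hit
waltz = [1,0,0,0, 1,0,0,0, 1,0,0,0, 1,0,0,0]
```

The project would go beyond this by handling tempo differences, rotations of the grid, and similarity of the actual drum sounds rather than just their positions.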
- Drum detection in
polyphonic audio
Recently researchers have started looking at the problem of
identifying individual drum sounds in polyphonic music recordings such
as hihat, bass drum etc. In this project, students will implement some
of these new algorithms and explore variations and alternative
approaches. A significant part of the project will consist of building
tools for obtaining ground truth annotations as well as evaluating the
developed algorithms.
- Content-based audio
analysis using plugins
Many of the existing software music players such as WinAmp or
iTunes provide an API for writing plugins. Although typically geared
toward spectrum visualization, these plugins could potentially be used
as a front-end for feature extraction, classification and similarity
retrieval. This project will explore this possibility.
- Chord-detection in
polyphonic audio
Even though polyphonic transcription of general audio is
still far from being solved a variety of pitch-based representations
such as chroma-vectors and pitch histograms have been proposed for
audio. There is some limited research on using such representations
potentially with some additional knowledge (such as likely chord
progression) to perform chord detection in polyphonic audio signals.
The goal of this project is to explore possibilities in that space.
Jazz standards or Beatles tunes might be a good starting point for
data.
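A common baseline is template matching: project the chroma vector onto binary major/minor triad templates for all twelve roots and keep the best match. A minimal sketch on an artificial chroma vector:

```python
def chord_label(chroma):
    """Match a 12-bin chroma vector against major/minor triad templates."""
    names = "C C# D D# E F F# G G# A A# B".split()
    best, best_score = None, float("-inf")
    for root in range(12):
        for kind, intervals in (("maj", (0, 4, 7)), ("min", (0, 3, 7))):
            template = [1.0 if (pc - root) % 12 in intervals else 0.0
                        for pc in range(12)]
            score = sum(c * t for c, t in zip(chroma, template))
            if score > best_score:
                best_score, best = score, names[root] + kind
    return best

# Artificial chroma with energy on G, B, D -> a G major triad.
chroma = [0.0] * 12
for pc in (7, 11, 2):
    chroma[pc] = 1.0
```

The knowledge-based extensions mentioned above would replace the frame-by-frame argmax with, for example, an HMM whose transitions encode likely chord progressions.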
- Polyphonic alignment of audio and MIDI
A symbolic score even in a "low" level format such as MIDI
contains a wealth of useful information that is not directly available
in the acoustic waveforms (beats/measures/chords etc). On the other
hand, most of the time we are interested in hearing actual music rather
than bad sounding MIDI files. In polyphonic audio alignment the idea is
to compute features on both the audio and MIDI data and try to align
the two sequences of features. This project will implement some of the
existing approaches to this problem and explore alternatives and
variations.
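The standard tool for aligning the two feature sequences is dynamic time warping (DTW), which finds the minimum-cost monotonic matching between them. A minimal sketch on toy one-dimensional features (real systems align chroma vectors):

```python
def dtw_cost(x, y, dist=lambda a, b: abs(a - b)):
    """Classic dynamic-time-warping alignment cost between two sequences."""
    INF = float("inf")
    n, m = len(x), len(y)
    D = [[INF] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            D[i][j] = dist(x[i - 1], y[j - 1]) + min(
                D[i - 1][j],        # x frame repeated
                D[i][j - 1],        # y frame repeated
                D[i - 1][j - 1])    # both advance
    return D[n][m]

# A MIDI-derived feature curve and a time-stretched "audio" version of it.
midi_feat  = [1, 2, 3, 2, 1]
audio_feat = [1, 1, 2, 2, 3, 3, 2, 2, 1, 1]   # same shape, twice as slow
```

Backtracking through the table D recovers the actual alignment path, which is what lets MIDI annotations (beats, measures, chords) be transferred onto the audio timeline.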
- Music Caricatures
Even though we are still a long way from full polyphonic
transcription, music information retrieval systems are increasingly
extracting more and more high-level information about audio signals. The idea
behind this project is to use this information to create musical
"caricatures" of the original audio using MIDI. The only constrain is
that the resulting "caricature" should somehow match possibly in
a funny way the original music.
- Comparison of
algorithms for audio-segmentation
Audio segmentation refers to the process of detecting when
there is a change of audio "texture" such as the change from singing to
instrumental background, the change from an orchestra to guitar solo,
etc. A variety of algorithms have been proposed for audio segmentation.
The goal of this project is to implement the main approaches and
explore alternatives and variants.
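The simplest of these approaches is a novelty curve: compute a distance between successive feature frames and treat peaks as texture boundaries. A minimal sketch on toy frames with one obvious change:

```python
def novelty(frames):
    """Euclidean distance between successive feature frames;
    peaks mark texture changes."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5
    return [dist(a, b) for a, b in zip(frames, frames[1:])]

# Two flat "textures" (e.g. singing vs. instrumental) with a change at frame 4.
frames = [[0.0, 1.0]] * 4 + [[1.0, 0.0]] * 4
curve = novelty(frames)
boundary = curve.index(max(curve)) + 1   # frame index where the change occurs
```

More sophisticated methods (checkerboard-kernel novelty on the similarity matrix, model-based BIC segmentation) mainly differ in how much context around each candidate boundary they consider.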
- Music Information
Retrieval using MPEG-7 low level
descriptors
The MPEG-7 standard was recently proposed for standardizing
some of the ways multimedia content is described. Part of it describes
some audio descriptors that can be used to characterize audio
signals. There has been little evaluation of those descriptors compared
to other feature front-ends proposed in the literature. The
goal of this project is to implement the MPEG-7 audio descriptors and
compare them with other features in a variety of tasks such as
similarity retrieval, classification and segmentation.
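To give a feel for this descriptor family: several of them reduce to weighted statistics of the magnitude spectrum. The sketch below computes a plain spectral centroid with a direct DFT; this is a simplified stand-in for illustration, not the normative MPEG-7 AudioSpectrumCentroid definition (which operates on log-frequency bands):

```python
import math

def spectral_centroid(samples, sr):
    """Magnitude-weighted mean frequency of the DFT, in Hz."""
    n = len(samples)
    num = den = 0.0
    for k in range(1, n // 2):                      # skip DC, up to Nyquist
        re = sum(samples[i] * math.cos(2 * math.pi * k * i / n) for i in range(n))
        im = -sum(samples[i] * math.sin(2 * math.pi * k * i / n) for i in range(n))
        mag = (re * re + im * im) ** 0.5
        num += (k * sr / n) * mag                   # bin frequency * magnitude
        den += mag
    return num / den

# A pure 1 kHz tone should have its centroid at 1 kHz.
sr, n = 8000, 256
tone = [math.sin(2 * math.pi * 1000.0 * i / sr) for i in range(n)]
sc = spectral_centroid(tone, sr)
```

A real implementation would use an FFT and windowed frames; the evaluation part of the project then compares such descriptors against features like MFCCs on the listed tasks.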
- Instrumentation-based
genre/style classification
The type of instruments used in a song can be a quite
reliable indicator of a particular musical genre. For example, the
significant presence of a saxophone probably implies a jazz tune. Even
though these rules always have exceptions, they will probably still work
for many cases. The goal of this project is to explore the use of
decision trees for automatically finding and using such
instrumentation-based rules. A significant part of the project will
consist of collecting instrumentation annotation data.
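To make the idea concrete, a learned decision tree over instrument-presence flags would end up encoding rules of roughly this shape (the rules and labels below are invented for illustration; in the project they would be induced from annotated data, not written by hand):

```python
def guess_genre(instruments):
    """Toy hand-built decision tree over a set of instrument-name strings."""
    if "saxophone" in instruments:
        return "jazz"
    if "distorted guitar" in instruments:
        return "rock"
    if "banjo" in instruments:
        return "country"
    return "unknown"
```

A decision-tree learner (e.g. C4.5-style) would pick the splits automatically from the collected instrumentation annotations, and the resulting tree is directly readable as rules like these.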
- Template-based
detection of instrumentation
The goal of this project is to detect what (and maybe when)
instruments are present in an audio recording. The goal is not source
separation or transcription but rather just a presence/absence
indicator for particular instruments. For example, the output of such a
system might be: from minute 1 to minute 2 a saxophone, piano and drums
are playing, after which a singer joins the ensemble. In
order to identify specific instruments, templates will be learned from a
large database of examples and then adapted to the particular
recording.
- Singing-voice
detection
Detecting the segments of a piece of music where there is
singing is the first step in singer identification. This is a classic
classification problem which is made difficult by the large variety of
singers and instrumental backgrounds. The goal of this project is to
explore various proposed algorithms and feature front-ends for this
task. Specifically, the use of phase vocoding techniques for enhancing
the prominent singing voice is a promising area of exploration.
- Singer Identification
The singer's identity is a major part of the way popular music is
characterized and identified. Most listeners who hear a piece they
haven't heard before cannot identify the group until the singer starts
singing. The goal of this project is to explore existing approaches to
singer identification and explore variations and alternatives.
- Male/Female singer
detection
Automatic male/female voice classification has been explored
in the context of the spoken voice. The goal of this project is to
first explore male/female singer detection in monophonic recordings of
singing and then expand this work to polyphonic recordings.
- Direct
manipulation music browsing
Although MIR for historical reasons has been mostly focused
on retrieval, a large part of music listening involves browsing and
exploration. The goal of this project is to explore various creative
ways of browsing large collections of music that are direct and provide
constant audio feedback about the user's actions.
- Hyperbolic trees
for music collection visualization
Hyperbolic trees are an impressive visualization technique
for representing trees/graphs of documents/images. The goal of this
project is to explore the potential of using this technique for
visualizing large music collections. Of specific interest is the
possibility of adjusting this technique to incorporate content-based
music similarity.
- Playlist
summarization using similarity graphs
Similarity graphs are constructed by using content-based
distances for edges and nodes that correspond to musical pieces. The
goal of this project is to explore how this model could be used to
generate summaries for music playlists, i.e., a short-duration
representation (3 seconds for each song in the playlist) that
summarizes a playlist.