Lecture 1-Spring 2016

CS 582
Intro to Speech Processing
Chuck Konopka CKonopka@mail.sdsu.edu

M W 2:00-3:15pm EBA 439
Lecture I: Administration/Organization, An Introduction to the Topic
Wed., 1.21.15
Grading Criteria
3 homework assignments: 15%

1 take home midterm exam: 20%
1 take home final exam: 30%
1 semester team project: 35%
Extra Credit Opportunities: Up to 10%
Major Topics
Modeling
Acoustic Theory of Speech Production and Perception
(How we model the speech and hearing processes)
Acoustic-Phonetics
(How we model acoustic units of speech)
Time-Frequency Analysis
(Techniques for converting and analyzing time-domain data in other domains)
Speech pre-processing
(An implementation of time-frequency analysis for speech data)
Supervised Learning
(How computers learn using examples)
Unsupervised Learning
(How computers can learn from data independently)
Speech Structure
Rule-based Grammar
Statistical Grammar
The Meaning in Speech
Semantic Nets, etc.
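As a small illustration of the Time-Frequency Analysis topic listed above, the sketch below converts a time-domain signal into the frequency domain with a naive discrete Fourier transform and finds its strongest frequency bin. It is not course material: the function names (dft, dominant_bin) and the 64-sample test sine are assumptions chosen for the example, and a real system would use an FFT library instead of this O(N^2) loop.

```python
import cmath
import math


def dft(samples):
    """Naive O(N^2) discrete Fourier transform of a sequence of samples."""
    n = len(samples)
    return [
        sum(samples[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def dominant_bin(samples):
    """Index of the strongest bin in the first half of the spectrum."""
    spectrum = dft(samples)
    half = len(samples) // 2
    return max(range(half), key=lambda k: abs(spectrum[k]))


# Hypothetical test signal: 64 samples of a sine wave completing 5 cycles.
N = 64
signal = [math.sin(2 * math.pi * 5 * t / N) for t in range(N)]
peak = dominant_bin(signal)
print("strongest frequency bin:", peak)
```

The time-domain samples say little at a glance, while the frequency-domain view makes the single 5-cycle component obvious. Speech pre-processing applies the same idea to short frames of audio.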
Syllabus
(Weeks 1-8)
Week
Subject
1
Course Introduction
The Really Big Picture: What is Modeling?
Demonstration/Lab: (Using the CSLU Toolkit to build a simple working speech recognition system)
Selection of Semester Project (1-2 pages)
2-3
The Big Picture
The physical model of speech recognition: Speech production and perception
Deriving a computational model of speech recognition from the physical model
4-6
Machine Learning
Supervised Learning
Unsupervised Learning
Machine Learning Lab (An application of Matlab or Java-based software to a learning problem)
Midterm review and exam
Take-home midterm exam
Midterm report on Semester Project progress (1-2 pages)
Syllabus
(Weeks 9-16)
Week
Subject
8-10
Hidden Markov Models
The famous 3 lectures
HMM Lab: Simple implementations of key HMM algorithms
11-14
Speech pre-processing
An introduction to Time-frequency analysis
The Fourier Transform (FT) and the Fast Fourier Transform (FFT)
The Wavelet transform (WT) and the Wavelet Packet Transform (WPT)
The Cepstral Transform (CT)
Mel-frequency Cepstral Coefficient Analysis (MFCC)
Signal Processing Lab: Implementation of a signal processing algorithm
14-15
Language Modeling
Rule-based grammar: CFG (Context Free Grammar)
Stochastic grammar: N-Gram models, Probabilistic Context Free Grammar
Semantics
16
Finals Week
Take-home final
Semester Project due at week's end
Speech Processing Applications

Examples of speech processing applications include:
Speech recognition
Speech synthesis
CSLU Toolkit
AT&T Natural Voices
Speech effects
Pitch bending
Chorus effects
Grammar modeling
Dragon Dictate
SAPI (Microsoft's Speech API)
CSLU Toolkit
Synthetic Shakespeare
Speaker recognition
Acoustic Biometrics
Accent recognition
Language training
Resources
(Things you will need)
Textbook:
Speech and Language Processing, 2nd Edition, Jurafsky & Martin, Prentice Hall, 2009
Matlab/Octave, Audacity, Java, C++, etc.
Various papers to be announced
The Semester Project
The goal of the Semester Project is to apply and generalize the presented concepts by developing a Big Idea in a team setting.
Big Ideas: Some examples of prior Semester Projects:
The Cocktail Party Effect
Concatenative Speech Synthesis
Prosody Detection & Synthesis
Accent Recognition
Harmony Generation
Emotion Recognition
Synthetic Beatles, Beethoven, etc.
Along the way, you'll:
Develop the Big Idea into something you can implement
Develop research and writing skills
Develop team building and coordination skills
In Brief
This course will introduce you to the fundamentals of speech processing and how these concepts can be applied to other problem domains.
The Big Idea

A perfect understanding of how we understand speech isn't required to build a system that can recognize and produce speech.
It is possible to use speech data itself to build a system that can recognize and produce speech.
The Big Idea is that it is possible to create a solution using the problem data itself.
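A minimal sketch of what "creating a solution using the problem data itself" can look like in code: instead of hand-writing rules for each sound, the program averages labeled examples and classifies new inputs by the nearest average. Everything here is an assumption made for illustration, including the function names (train_centroids, classify), the labels "ee" and "ah", and the two-number feature vectors, which are made-up values rather than measurements.

```python
from collections import defaultdict


def train_centroids(examples):
    """Average the feature vectors of each label.

    examples is a list of (features, label) pairs.
    """
    sums = {}
    counts = defaultdict(int)
    for features, label in examples:
        if label not in sums:
            sums[label] = [0.0] * len(features)
        for i, value in enumerate(features):
            sums[label][i] += value
        counts[label] += 1
    return {
        label: [s / counts[label] for s in total]
        for label, total in sums.items()
    }


def classify(centroids, features):
    """Return the label whose centroid is closest to the given features."""
    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(centroid, features))

    return min(centroids, key=lambda label: squared_distance(centroids[label]))


# Made-up training data: two features per example, two sound labels.
training = [
    ((300.0, 2300.0), "ee"),
    ((280.0, 2250.0), "ee"),
    ((700.0, 1200.0), "ah"),
    ((750.0, 1100.0), "ah"),
]
model = train_centroids(training)
guess = classify(model, (310.0, 2200.0))
print("predicted label:", guess)
```

No rule about what makes an "ee" or an "ah" is written anywhere in the program. The behavior comes entirely from the labeled examples, which is the supervised learning idea the course develops later.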
A Quick Overview
Nature's Model:
We'll begin with a definition of a model. We'll then take a look at the biological models of speech production and perception that serve as the basis for the computational models of speech.
A Computational Model:
Once we understand the Natural Model, we'll proceed to develop a computational model.
How:
We'll develop a hierarchy of the building blocks of speech and then build a system using these components.
These elements are:
The acoustic (audio) elements
The phonetic elements
The structure of speech (grammar)
The meaning in speech (i.e., semantics)
