FI:PV158 Speech signal processing

Faculty of Informatics
Autumn 2003

Extent and Intensity

2/1. 2 credit(s) (plus extra credits for completion). Recommended Type of Completion: zk (examination). Other types of completion: k (colloquium), z (credit).

Teacher(s)

prof. Dr. Ing. Jan Černocký (lecturer), doc. RNDr. Ivan Kopeček, CSc. (deputy)

Guaranteed by

prof. PhDr. Karel Pala, CSc.
Department of Machine Learning and Data Processing – Faculty of Informatics
Contact Person: doc. RNDr. Ivan Kopeček, CSc.

Timetable

Thu 8:00–9:50 B411, Thu 10:00–10:50 B117

Course Enrolment Limitations

The course is also offered to students of fields other than those it is directly associated with.

Fields of study / plans the course is directly associated with

There are 6 fields of study the course is directly associated with.

Course objectives

Applications of speech processing, digital processing of speech signals, production and perception of speech, introduction to phonetics, pre-processing and basic parameters of speech, linear-predictive model, cepstrum, fundamental frequency estimation, coding - time domain and vocoders, recognition - DTW and HMM

Syllabus

Information content of the written and spoken forms of speech.
Techniques of signal processing applied to speech: Fourier transform, z-transform, linear filtering.
Time-domain and frequency-domain behavior of linear systems.
Signal processing model of speech production.
Excitation and filter.
Determination of parameters using linear prediction.
LPC coefficients and derived parameters (PARCOR, LAR, ...).
Speech analysis using short-time Fourier transform (STFT): filter-bank interpretation, computation using fast Fourier transform (FFT).
Cepstral analysis.
Parameterization with perceptually warped frequency axis.
Fundamental frequency determination.
Features for speech processing, criteria of choice.
Measures of similarity between speech segments.
Speech coding: waveform and parametric vocoders.
Excitation modeling (CELP).
Phonetic vocoders.
Speech recognition: Hidden Markov Models (HMM).
HMM training and HMM decoding.
Extension of HMMs to continuous speech recognition.
Statistical language models.
The studied methods are experimentally exercised in computer laboratories (Matlab).

Literature

PSUTKA, Josef. Komunikace s počítačem mluvenou řečí. Praha: Academia, 1995, 287 pp. ISBN 8020002030.
RABINER, Lawrence R. and Biing-Hwang JUANG. Fundamentals of speech recognition. Englewood Cliffs: Prentice Hall PTR, 1993, xxxv, 507 pp. ISBN 0-13-015157-2.

Assessment methods

Teaching: a 2-hour lecture every week and a 2-hour Matlab computer lab once every 14 days.
Course completion requirements: credit: passing the test in the last computer lab; colloquium: passing the test in the last computer lab AND submitting and presenting a home project; examination: passing the test in the last computer lab AND submitting and presenting a home project AND a written examination.
Test in the computer lab: several simple exercises in Matlab; any notes, literature, and all previously written functions may be used. Max. 20 points toward the examination.
Home project: chosen from the topics at http://www.fit.vutbr.cz/~cernocky/speech/projekty.html; registration is open throughout the semester. A short written report (4 A4 pages), which may be handwritten, and an oral presentation (10 min.) at the last lecture. Max. 20 points toward the examination.
Written examination: all literature and computing equipment may be used. 2 hours, 5 theoretical questions and 5 numerical problems, 6 points each. Max. 60 points.
Grading when the course is completed by examination: 100 points in total: 0-39 points: 4, 40-59 points: 3, 60-79 points: 2, 80-100 points: 1.
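
As a rough illustration of the point scheme for completion by examination (lab test max. 20, home project max. 20, written examination max. 60; grade bands 0-39: 4, 40-59: 3, 60-79: 2, 80-100: 1), the following Python sketch adds the three components and maps the total to a grade. The function name `final_grade`, its parameters, and the assumption that points are whole numbers are illustrative and not part of the course materials.

```python
def final_grade(test_points: int, project_points: int, exam_points: int) -> int:
    """Return the grade (1 to 4) for completion by examination.

    Hypothetical helper: only the point limits and grade bands are taken
    from the course description; whole-number points are assumed.
    """
    # Maximum points per component, as listed in the assessment methods.
    limits = {"test": 20, "project": 20, "exam": 60}
    scores = {"test": test_points, "project": project_points, "exam": exam_points}

    for name, value in scores.items():
        if not 0 <= value <= limits[name]:
            raise ValueError(f"{name} points must be between 0 and {limits[name]}")

    total = sum(scores.values())  # at most 100 points

    # Grade bands: 80-100 -> 1, 60-79 -> 2, 40-59 -> 3, 0-39 -> 4.
    if total >= 80:
        return 1
    if total >= 60:
        return 2
    if total >= 40:
        return 3
    return 4
```

For example, `final_grade(15, 18, 40)` totals 73 points, which falls in the 60-79 band and gives grade 2.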

Language of instruction

Czech

Further Comments

The course is taught annually.

Teacher's information

http://www.fit.vutbr.cz/~cernocky/speech

The course is also listed under the following terms: Autumn 2002, Autumn 2004, Autumn 2005, Spring 2007, Spring 2008.

Permalink: https://is.muni.cz/course/fi/autumn2003/PV158
