FI:PV158 Speech signal processing

Faculty of Informatics
Autumn 2002

Extent and Intensity

2/1. 2 credit(s) (plus extra credits for completion). Recommended Type of Completion: zk (examination). Other types of completion: k (colloquium), z (credit).

Teacher(s)

prof. Dr. Ing. Jan Černocký (lecturer), doc. RNDr. Ivan Kopeček, CSc. (deputy)

Guaranteed by

prof. PhDr. Karel Pala, CSc.
Department of Machine Learning and Data Processing – Faculty of Informatics
Contact Person: doc. RNDr. Ivan Kopeček, CSc.

Timetable

Thu 10:00–11:50 B007 and each odd Thursday 12:00–13:50 B117

Course Enrolment Limitations

The course is also offered to students of fields other than those it is directly associated with.

Fields of study / plans the course is directly associated with

Applied Informatics (programme FI, B-AP)
Applied Informatics (programme FI, N-AP)
Informatics (programme FI, B-IN)
Informatics (programme FI, M-IN)
Informatics (programme FI, N-IN)
Information Technology (programme FI, B-IN)

Course objectives

Applications of speech processing; digital processing of speech signals; production and perception of speech; introduction to phonetics; pre-processing and basic parameters of speech; the linear-predictive model; cepstrum; fundamental frequency estimation; coding (time-domain methods and vocoders); recognition (dynamic time warping, DTW, and hidden Markov models, HMM).

Syllabus

Informational contents of written and spoken form of speech.
Techniques of signal processing applied to speech: Fourier transform, z-transform, linear filtering.
Time-domain and frequency-domain behavior of linear systems.
Signal processing model of speech production.
Excitation and filter.
Determination of parameters using linear prediction.
LPC coefficients and derived parameters (PARCOR, LAR, etc.).
Speech analysis using short-time Fourier transform (STFT): filter-bank interpretation, computation using fast Fourier transform (FFT).
Cepstral analysis.
Parameterization with perceptually warped frequency axis.
Fundamental frequency determination.
Features for speech processing, criteria of choice.
Measures of similarity between speech segments.
Speech coding: waveform and parametric vocoders.
Excitation modeling (CELP).
Phonetic vocoders.
Speech recognition: Hidden Markov Models (HMM).
HMM training and HMM decoding.
Extension of HMMs to continuous speech recognition.
Statistical language models.
The methods studied are practised experimentally in computer laboratory sessions (Matlab).

Literature

PSUTKA, Josef. Komunikace s počítačem mluvenou řečí. Praha: Academia, 1995, 287 pp. ISBN 8020002030.
RABINER, Lawrence R. and Biing-Hwang JUANG. Fundamentals of speech recognition. Englewood Cliffs: Prentice Hall PTR, 1993, xxxv, 507 pp. ISBN 0-13-015157-2.

Assessment methods

A 2-hour lecture every week. 2 hours of computer exercises once every 14 days. A small home project, presented at the last lecture. A test in the computer laboratories and a written examination.

Language of instruction

Czech

Further Comments

The course is taught annually.

Teacher's information

http://www.fee.vutbr.cz/~cernocky/Students.html

The course is also listed under the following terms: Autumn 2003, Autumn 2004, Autumn 2005, Spring 2007, Spring 2008.

Permalink: https://is.muni.cz/course/fi/autumn2002/PV158
