FI:P030 Textual Information Systems - Course Information
P030 Textual Information Systems
Faculty of Informatics, Spring 2001
- Extent and Intensity
- 2/1. 3 credit(s) (plus extra credits for completion). Recommended Type of Completion: zk (examination). Other types of completion: k (colloquium), z (credit).
- Teacher(s)
- doc. RNDr. Petr Sojka, Ph.D. (lecturer)
- Guaranteed by
- doc. Ing. Jan Staudek, CSc.
Department of Computer Systems and Communications – Faculty of Informatics
Contact Person: doc. RNDr. Petr Sojka, Ph.D.
- Timetable
- Mon 10:00–11:50 A107
- Timetable of Seminar Groups:
P030/02: Mon 14:00–15:50 B204, P. Sojka
- Prerequisites
- I005 Formal Languages and Automata I
Students are advised to have some basic knowledge of automata theory (I005) and natural language processing (I030, I047). Some knowledge of database basics (P002) is helpful as well.
- Course Enrolment Limitations
- The course is also offered to students of fields other than those it is directly associated with.
- fields of study / plans the course is directly associated with
- Informatics (programme FI, B-IN)
- Informatics (programme FI, M-IN)
- Upper Secondary School Teacher Training in Informatics (programme FI, M-IN)
- Upper Secondary School Teacher Training in Informatics (programme FI, M-SS)
- Information Technology (programme FI, B-IN)
- Syllabus
- Basic notions. TIS - text information system. Classification of information systems.
- Searching in TIS. Classification of searching and pattern-matching methods; data structures.
- Algorithms of Knuth-Morris-Pratt, Aho-Corasick, Boyer-Moore, Commentz-Walter, Buczilowski.
- Theory of automata for searching. Classification of searching problems.
- Indexes. Indexing methods. Data structures for searching and indexing.
- Google as an example of a search and indexing engine.
- Signature methods.
- Query languages and document models: boolean, vector, probabilistic, MMM, Paice.
- Data compression. Basic notions. Statistical methods.
- Dictionary-based compression methods. Neural nets for text compression.
- Syntactic methods. Context modelling.
- Spell checking. Filtering information channels. Document classification.
- Literature
- KORFHAGE, Robert R. Information storage and retrieval. New York: Wiley Computer Publishing, 1997, xiii, 349 pp. ISBN 0471143383.
- WITTEN, Ian H., Alistair MOFFAT and Timothy C. BELL. Managing gigabytes: compressing and indexing documents and images. New York: Van Nostrand Reinhold, 1994, xiv, 429 pp. ISBN 0-442-01863-0.
- Information retrieval: data structures & algorithms. Edited by William B. Frakes and Ricardo Baeza-Yates. Upper Saddle River: Prentice Hall, 1992, viii, 504 pp. ISBN 0-13-463837-9.
- Jaroslav Pokorný, Václav Snášel, Dušan Húsek: Dokumentografické informační systémy, lecture notes (skripta), MFF UK Praha, 1998.
- Assessment methods
- The course is taught in the traditional lecture format and concludes with a written test (sample tests from previous years are posted at the course URL). The seminars are used to practise the lecture material and to work on a team project.
- Language of instruction
- Czech
- Further Comments
- The course is taught annually.
- Teacher's information
- http://www.fi.muni.cz/~sojka/tis/
- Permalink: https://is.muni.cz/course/fi/spring2001/P030