FI:P030 Textual Information Systems - Course Information
Faculty of Informatics, Spring 2002
- Extent and Intensity
- 2/1. 3 credit(s) (plus extra credits for completion). Recommended Type of Completion: zk (examination). Other types of completion: k (colloquium), z (credit).
- Teacher(s)
- doc. RNDr. Petr Sojka, Ph.D. (lecturer)
- Guaranteed by
- doc. Ing. Jan Staudek, CSc.
Department of Computer Systems and Communications – Faculty of Informatics
Contact Person: doc. RNDr. Petr Sojka, Ph.D.
- Timetable
- Wed 14:00–15:50 D2
- Timetable of Seminar Groups:
P030/dva: Thu 14:00–14:50 B311
P030/tri: Thu 15:00–15:50 B311
P030/vnouzi: Thu 16:00–16:50 B311
- Prerequisites
- I005 Formal Languages and Automata I || I505 Formal Languages and Automata I
Students are advised to bring some basic knowledge of automata theory (I005) and natural language processing (I030, I047). Some knowledge of database basics (P002) is helpful as well.
- Course Enrolment Limitations
- The course is also offered to students of fields other than those it is directly associated with.
- fields of study / plans the course is directly associated with
- Informatics (programme FI, B-IN)
- Informatics (programme FI, M-IN)
- Upper Secondary School Teacher Training in Informatics (programme FI, M-IN)
- Upper Secondary School Teacher Training in Informatics (programme FI, M-SS)
- Information Technology (programme FI, B-IN)
- Syllabus
- Basic notions. TIS - text information system. Classification of information systems.
- Searching in TIS. Classification of searching and pattern matching. Data structures.
- Algorithms of Knuth-Morris-Pratt, Aho-Corasick. Boyer-Moore, Commentz-Walter, Buczilowski.
- Theory of automata for searching. Classification of searching problems.
- Indexes. Indexing methods. Data structures for searching and indexing.
- Google as an example of a search and indexing engine.
- Signature methods.
- Query languages and document models: boolean, vector, probabilistic, MMM, Paice.
- Data compression. Basic notions. Statistical methods.
- Dictionary-based compression methods. Neural nets for text compression.
- Syntactic methods. Context modelling.
- Spell checking. Filtering information channels. Document classification.
- Literature
- Jaroslav Pokorný, Václav Snášel, Dušan Húsek: Dokumentografické informační systémy, lecture notes, MFF UK Praha, 1998.
- Information retrieval: data structures & algorithms. Edited by William B. Frakes and Ricardo Baeza-Yates. Upper Saddle River: Prentice Hall, 1992. viii, 504 pp. ISBN 0-13-463837-9.
- Assessment methods
- Teaching follows the classical format and concludes with a written test (sample tests from previous years are posted at the course URL). Seminars are used to practise the lecture material and for brainstorming.
- Language of instruction
- Czech
- Further comments (probably available only in Czech)
- The course is taught annually.
- Teacher's information
- http://www.fi.muni.cz/~sojka/P030/
- Permalink: https://is.muni.cz/course/fi/spring2002/P030