CJBB75 Using a language corpus - elementary skills

Faculty of Arts
Spring 2022
Extent and Intensity
0/2/0. 3 credit(s). Recommended Type of Completion: k (colloquium). Other types of completion: zk (examination).
Teacher(s)
doc. PhDr. Klára Osolsobě, Dr. (lecturer)
Guaranteed by
doc. PhDr. Klára Osolsobě, Dr.
Department of Czech Language – Faculty of Arts
Contact Person: Jaroslava Vybíralová
Supplier department: Department of Czech Language – Faculty of Arts
Timetable
Mon 10:00–11:40 G13
Prerequisites
Prerequisite: CJA001 Úvod do studia českého jazyka (Introduction to the Study of the Czech Language) or another university course covering an introduction to linguistics and language studies
Course Enrolment Limitations
The course is also offered to students of fields other than those it is directly associated with.
The capacity limit for the course is 20 students.
Current registration and enrolment status: enrolled: 5/20, only registered: 0/20, only registered with preference (fields directly associated with the programme): 0/20
Course objectives
The seminar focuses on developing elementary skills in using language corpora.
Learning outcomes
Students will be able to use language corpora to address elementary linguistic problems, to write queries in CQL (Corpus Query Language), and to observe corpus data and draw generalizations about language from it.
Syllabus
  • 1. The KonText search engine; full access to the Czech National Corpus (CNC): user agreement and registration.
  • 2. Available corpora of the CNC; how to search in corpora.
  • 3. Morphological tagging (word, lemma, part of speech, ...).
  • 4. Processing search results (alphabetical sorting, etc.).
  • 5. Grammar and corpora: data observation and data mining.
  • 6. Corpus evidence versus grammatical rule.
  • 7. Problem solving and its presentation: how to search in an untagged corpus.
  • 8. Problem solving and its presentation: formal morphology.
  • 9. Problem solving and its presentation: word formation.
  • 10. Problem solving and its presentation: lexicon.
  • 11. Problem solving and its presentation: morphological variants.
  • 12. Problem solving and its presentation: drawing conclusions from corpus evidence.
Literature
  • ŠULC, Michal. Korpusová lingvistika: první vstup. 1st ed. Praha: Karolinum, 1999, 94 pp. ISBN 8071848476.
  • Jak využívat Český národní korpus. Edited by František Čermák and Renata Blatná. 1st ed. Praha: NLN, Nakladatelství Lidové noviny, 2005, 181 pp. ISBN 8071067369.
  • Český národní korpus: úvod a příručka uživatele. Edited by Jan Kocek, Marie Kopřivová and Karel Kučera. 1st ed. Praha: Filozofická fakulta UK, Ústav Českého národního korpusu, 2000, 156 pp. ISBN 80-85899-94-9.
Teaching methods
A practical introduction to corpus usage will be followed by a set of sample exercises, homework assignments, and follow-up class discussion.
Assessment methods
Final project: solving a problem and presenting the solution. During the course, every student will hand in 5 homework assignments (1–3 pages each). Test on CQL.
Language of instruction
Czech
Follow-Up Courses
Further comments
The course is taught annually.
General note: It is recommended to complete CJBB105 before taking CJBB75, or to take both courses concurrently.
Information on course enrolment limitations: The course is compulsory for students of Czech Language with a specialization in computational linguistics; these students are given priority in enrolment.
The course is also listed under the following terms: Spring 2004, Autumn 2004, Spring 2005, Spring 2006, Autumn 2006, Spring 2007, Autumn 2007, Spring 2008, Autumn 2008, Spring 2009, Autumn 2009, Spring 2010, Autumn 2010, Spring 2011, Autumn 2011, Spring 2012, Autumn 2012, Spring 2013, Spring 2014, Spring 2015, Spring 2016, Spring 2017, Spring 2018, Autumn 2018, Spring 2019, Spring 2020, Spring 2021, Spring 2023, Spring 2024, Spring 2025.
  • Enrolment Statistics (Spring 2022, recent)
  • Permalink: https://is.muni.cz/course/phil/spring2022/CJBB75