PB106 Corpus Linguistic Project I

Faculty of Informatics
Autumn 2022
Extent and Intensity
0/2. 2 credit(s) (plus extra credits for completion). Type of Completion: z (credit).
Teacher(s)
doc. Mgr. Pavel Rychlý, Ph.D. (lecturer)
Guaranteed by
doc. Mgr. Pavel Rychlý, Ph.D.
Department of Machine Learning and Data Processing – Faculty of Informatics
Supplier department: Department of Machine Learning and Data Processing – Faculty of Informatics
Timetable
Tue 10:00–11:50 B411
Course Enrolment Limitations
The course is also offered to students of fields other than those it is directly associated with.
Fields of study / plans the course is directly associated with
There are 66 fields of study the course is directly associated with.
Course objectives
The aim of the seminar is to give students deeper knowledge of a chosen area of corpus linguistics and to test that knowledge in practice through project work. Popularising corpus linguistics and other areas of language engineering is one of the main goals of the Natural Language Processing Centre at the Faculty of Informatics.
Fundamental information about the Natural Language Processing Centre and corpus linguistics in general can be found at http://nlp.fi.muni.cz/.
Learning outcomes
Students will be able to: create a text corpus from different sources; use automatic tools for corpus annotation or information extraction; evaluate the accuracy of automatic tools; present evaluation results.
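One of the outcomes above, evaluating the accuracy of an automatic tool, can be illustrated with a minimal Python sketch. It assumes a hypothetical hand-annotated gold standard and hypothetical tool output given as two parallel lists of tags; the function names and tag labels are illustrative only and are not taken from the course materials.

```python
from collections import Counter


def tagging_accuracy(gold, predicted):
    """Return the share of tokens whose predicted tag equals the gold tag."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    if not gold:
        raise ValueError("cannot evaluate an empty sample")
    correct = sum(g == p for g, p in zip(gold, predicted))
    return correct / len(gold)


def confusion_counts(gold, predicted):
    """Count (gold, predicted) tag pairs for mismatched tokens only."""
    return Counter((g, p) for g, p in zip(gold, predicted) if g != p)


# Hypothetical data: manual tags and tool output for a five-token sentence.
gold_tags = ["DET", "NOUN", "VERB", "DET", "NOUN"]
tool_tags = ["DET", "NOUN", "NOUN", "DET", "NOUN"]

accuracy = tagging_accuracy(gold_tags, tool_tags)
errors = confusion_counts(gold_tags, tool_tags)

print(f"accuracy: {accuracy:.2f}")
print(errors.most_common())
```

Token-level accuracy is only one possible measure; for information extraction tasks, precision and recall over extracted items may be more appropriate.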
Syllabus
  • introduction to the topic: text corpora, parallel corpora, annotation, statistics, user interfaces
  • project selection
  • work on a project
  • presentation of project results and discussion
Literature
  • EISENSTEIN, Jacob. Introduction to natural language processing. Cambridge, Massachusetts: MIT Press, 2019, xiv, 519. ISBN 9780262042840.
  • JURAFSKY, Dan and James H. MARTIN. Speech and language processing : an introduction to natural language processing, computational linguistics and speech recognition. 2nd ed. New Jersey: Pearson, 2009, 1024 pp. ISBN 9780135041963.
  • JACKSON, Peter and Isabelle MOULINIER. Natural language processing for online applications : text retrieval, extraction and categorization. Amsterdam: John Benjamins Publishing Company, 2002, x, 225. ISBN 902724989X.
  • MANNING, Christopher D. and Hinrich SCHÜTZE. Foundations of statistical natural language processing. Cambridge: MIT Press, 1999, xxxvii, 68. ISBN 0-262-13360-1.
  • Corpus processing for lexical acquisition. Edited by Bran Boguraev and James Pustejovsky. Cambridge: Bradford Book, 1996, xi, 245 pp. ISBN 0-262-02392-X.
Teaching methods
lectures, work on an individual project, personal consultations, presentation
Assessment methods
Project. Evaluation is based on the presentation of project results.
Language of instruction
Czech
Further Comments
The course is taught annually.
The course is also listed under the following terms: Autumn 2002, Autumn 2003, Autumn 2004, Autumn 2005, Autumn 2006, Autumn 2007, Autumn 2008, Autumn 2009, Autumn 2010, Autumn 2011, Autumn 2012, Autumn 2013, Autumn 2014, Autumn 2015, Autumn 2016, Autumn 2017, Autumn 2018, Autumn 2019, Autumn 2020, Autumn 2021, Autumn 2023, Autumn 2024.
  • Permalink: https://is.muni.cz/course/fi/autumn2022/PB106