FI:PA164 Learning and natural language - Course Information
PA164 Machine learning and natural language processing
Faculty of Informatics, Autumn 2023
- Extent and Intensity
- 2/1/0. 3 credit(s) (plus extra credits for completion). Recommended Type of Completion: zk (examination). Other types of completion: z (credit).
- Teacher(s)
- doc. Mgr. Bc. Vít Nováček, PhD (lecturer)
- Guaranteed by
- doc. Mgr. Bc. Vít Nováček, PhD
Department of Machine Learning and Data Processing – Faculty of Informatics
Supplier department: Department of Machine Learning and Data Processing – Faculty of Informatics
- Timetable
- Mon 14:00–15:50 A318
- Prerequisites
- Basic knowledge of machine learning (e.g. IB031), computational linguistics (e.g. PA153) and neural networks (e.g. PV021) is assumed. The course is given in English (or in Czech, depending on the audience). Task solutions can be submitted in English, Czech or Slovak (exceptionally in another language).
- Course Enrolment Limitations
- The course is also offered to students of fields other than those it is directly associated with.
- Fields of study / plans the course is directly associated with
- Image Processing and Analysis (programme FI, N-VIZ)
- Applied Informatics (programme FI, N-AP)
- Information Technology Security (eng.) (programme FI, N-IN)
- Information Technology Security (programme FI, N-IN)
- Bioinformatics and systems biology (programme FI, N-UIZD)
- Bioinformatics (programme FI, N-AP)
- Computer Games Development (programme FI, N-VIZ_A)
- Computer Graphics and Visualisation (programme FI, N-VIZ_A)
- Computer Networks and Communications (programme FI, N-PSKB_A)
- Cybersecurity Management (programme FI, N-RSSS_A)
- Czech Language with Orientation on Computational Linguistics (programme FF, B-FI)
- Discrete algorithms and models (programme FI, N-TEI)
- Formal analysis of computer systems (programme FI, N-TEI)
- Graphic design (programme FI, N-VIZ)
- Graphic Design (programme FI, N-VIZ_A)
- Hardware Systems (programme FI, N-PSKB_A)
- Hardware systems (programme FI, N-PSKB)
- Image Processing and Analysis (programme FI, N-VIZ_A)
- Information security (programme FI, N-PSKB)
- Information Systems (programme FI, N-IN)
- Informatics (eng.) (programme FI, D-IN4)
- Informatics (programme FI, D-IN4)
- Informatics (programme FI, N-IN)
- Information Security (programme FI, N-PSKB_A)
- Quantum and Other Nonclassical Computational Models (programme FI, N-TEI)
- Parallel and Distributed Systems (programme FI, N-IN)
- Computer graphics and visualisation (programme FI, N-VIZ)
- Computer Graphics (programme FI, N-IN)
- Computer Networks and Communication (programme FI, N-IN)
- Computer Networks and Communications (programme FI, N-PSKB)
- Computer Systems and Technologies (eng.) (programme FI, D-IN4)
- Computer Systems and Technologies (programme FI, D-IN4)
- Computer Systems (programme FI, N-IN)
- Principles of programming languages (programme FI, N-TEI)
- Embedded Systems (eng.) (programme FI, N-IN)
- Embedded Systems (programme FI, N-IN)
- Cybersecurity management (programme FI, N-RSSS)
- Services development management (programme FI, N-RSSS)
- Software Systems Development Management (programme FI, N-RSSS)
- Services Development Management (programme FI, N-RSSS_A)
- Service Science, Management and Engineering (eng.) (programme FI, N-AP)
- Service Science, Management and Engineering (programme FI, N-AP)
- Social Informatics (programme FI, B-AP)
- Software Systems Development Management (programme FI, N-RSSS_A)
- Software Systems (programme FI, N-PSKB_A)
- Software systems (programme FI, N-PSKB)
- Machine learning and artificial intelligence (programme FI, N-UIZD)
- Theoretical Informatics (programme FI, N-IN)
- Upper Secondary School Teacher Training in Informatics (programme FI, N-SS) (2)
- Artificial Intelligence and Natural Language Processing (programme FI, N-IN)
- Computer Games Development (programme FI, N-VIZ)
- Processing and analysis of large-scale data (programme FI, N-UIZD)
- Image Processing (programme FI, N-AP)
- Natural language processing (programme FI, N-UIZD)
- Course objectives
- Students will gain knowledge of methods and tools for text mining and natural language learning. At the end of the course, students should be able to build systems that analyse text using machine learning methods. They should also be able to understand, explain and make use of the content of scientific papers in this area.
- Learning outcomes
- A student will be able:
- to pre-process text data for text mining;
- to build a system for analysis of text by means of machine learning;
- to understand research papers from this area;
- to write a technical report.
- Syllabus
- Course overview, a sample text (pre)processing pipeline
- Quick and dirty intro to ML
- Distributional semantics, LSA, word embeddings
- Deep neural networks for NLP
- Language models and their applications
- AutoML for NLP
- Student poster session(s), including extensive feedback during the students' work and its presentation
- Application example: sentiment analysis
- Application example: knowledge extraction from text
- Guest lecture(s) from international experts on various ML applications in the NLP field
- Final project presentations
- Literature
- recommended literature
- Chang, Yupeng, et al. "A survey on evaluation of large language models." ACM Transactions on Intelligent Systems and Technology 15.3 (2024): 1-45.
- Zhao, Wayne Xin, et al. "A survey of large language models." arXiv preprint arXiv:2303.18223 (2023).
- Charu C. Aggarwal, Machine Learning for Text. Springer 2018
- Mining text data. Edited by Charu C. Aggarwal - ChengXiang Zhai. New York: Springer Science+Business Media, 2012, xi, 522. ISBN 9781461432227.
- MANNING, Christopher D. and Hinrich SCHÜTZE. Foundations of statistical natural language processing. Cambridge: MIT Press, 1999, xxxvii, 68. ISBN 0-262-13360-1.
- Teaching methods
- Lectures combined with lab demonstrations of selected techniques and independent work on them; work on a project.
- Assessment methods
- Oral examination with written preparation (optional). Project presentations are part of the examination.
- Language of instruction
- English
- Further Comments
The course is taught annually.
- Permalink: https://is.muni.cz/course/fi/autumn2023/PA164