PřF:M7DataSP Adv. Data Science Practicum - Course Information
M7DataSP Advanced Data Science Practicum
Faculty of Science
Autumn 2024
- Extent and Intensity
- 0/2/1. 3 credit(s) (příf plus uk k 1 zk 2 plus 1 > 4). Type of Completion: z (credit).
In-person direct teaching
- Teacher(s)
- Mgr. Eva Maršálková (lecturer)
Mgr. Petr Šimeček, MSc., Ph.D. (lecturer)
- Guaranteed by
- doc. PaedDr. RNDr. Stanislav Katina, Ph.D.
Department of Mathematics and Statistics, Faculty of Science
Supplier department: Department of Mathematics and Statistics, Faculty of Science
- Timetable of Seminar Groups
- M7DataSP/01: Mon 10:00–11:50 MP1,01014, P. Šimeček
- Prerequisites
- Students are expected to have some experience with a programming language suitable for data analysis, e.g. Python or R. Code examples will be given in Python.
- Course Enrolment Limitations
- The course is also offered to students of fields other than those it is directly associated with.
The capacity limit for the course is 30 student(s).
Current registration and enrolment status: enrolled: 30/30, only registered: 2/30, only registered with preference (fields directly associated with the programme): 0/30
- Fields of study / plans the course is directly associated with
- Applied Mathematics (programme PřF, N-APM) (2)
- Course objectives
- The main goal is to gain hands-on experience with data analysis and machine learning methods, and to deepen students' programming skills.
- Learning outcomes
- This course will enable students to
- predict a dependent variable using linear or logistic regression
- examine unknown data using Principal Component Analysis and/or clustering
- split data into training and testing sets and understand the bias-variance trade-off
- use classification and regression trees, forests, bagging and boosting (XGBoost, LightGBM, CatBoost)
- learn the basics of PyTorch, applying neural networks and fine-tuning to image and NLP data
- gain experience with large language models, both through an API and with the Hugging Face transformers package
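As a rough illustration of the first and third outcomes above (linear regression and a train/test split), here is a minimal pure-Python sketch. It is not taken from the course materials: the helper names `train_test_split`, `fit_line` and `mean_squared_error` are hypothetical, the data are synthetic, and real analyses would typically use a dedicated library instead.

```python
import random


def train_test_split(rows, test_fraction=0.25, seed=0):
    """Shuffle a copy of `rows` and return (train, test) lists."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def fit_line(points):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def mean_squared_error(points, intercept, slope):
    """Average squared residual of the fitted line on `points`."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in points) / len(points)


# Synthetic data, made up for illustration: y = 2x + 1 plus Gaussian noise.
noise = random.Random(42)
data = [(float(x), 2.0 * x + 1.0 + noise.gauss(0.0, 0.5)) for x in range(40)]

# Fit on the training part only, then compare the error on both parts.
train, test = train_test_split(data)
intercept, slope = fit_line(train)
print(f"slope={slope:.2f}, intercept={intercept:.2f}")
print(f"train MSE={mean_squared_error(train, intercept, slope):.3f}")
print(f"test MSE={mean_squared_error(test, intercept, slope):.3f}")
```

The model is fitted on the training rows only, so the test error estimates how well it generalises to unseen data. A test error much larger than the training error would suggest overfitting.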
As a side benefit, students will also practise
- data cleaning
- visualizations
- data transformation (group by, summary)
- working with Git and GitHub
- working on the command line
- reproducible analysis and documents (Jupyter notebooks, Markdown, Quarto)
- social skills, working in groups
- Syllabus
- Details can be found on GitHub (see also the materials from previous years): https://github.com/simecek/dspracticum2024
- Teaching methods
- Each lecture focuses on one dataset and problem, which we use to demonstrate a new data science skill. Students are expected to submit homework before each lecture.
- Assessment methods
- Group homework (in groups of 2-4 students), plus an optional individual final project worth an extra 30%. To pass, you must achieve at least 70% of the points.
- Language of instruction
- Czech
- Further Comments
- Study Materials
- Teacher's information
- https://github.com/simecek/dspracticum2024
- Permalink: https://is.muni.cz/course/sci/autumn2024/M7DataSP