PřF:M7DataSP Adv. Data Science Practicum - Course Information
M7DataSP Advanced Data Science Practicum
Faculty of Science, Autumn 2023
- Extent and Intensity
- 0/2/1. 3 credit(s) (příf plus uk k 1 zk 2 plus 1 > 4). Type of Completion: z (credit).
- Teacher(s)
- Mgr. Eva Maršálková (lecturer)
Mgr. Petr Šimeček, MSc., Ph.D. (lecturer)
Mgr. Denisa Šrámková (lecturer)
- Guaranteed by
- doc. PaedDr. RNDr. Stanislav Katina, Ph.D.
Department of Mathematics and Statistics – Departments – Faculty of Science
Supplier department: Department of Mathematics and Statistics – Departments – Faculty of Science
- Timetable of Seminar Groups
- M7DataSP/01: Mon 8:00–9:50 MP1,01014, P. Šimeček
- Prerequisites
- Students are expected to have some experience with a programming language suitable for data analysis, e.g. Python or R. The code examples will be given in Python.
- Course Enrolment Limitations
- The course is also open to students from fields other than those it is directly associated with.
The capacity limit for the course is 30 student(s).
Current registration and enrolment status: enrolled: 6/30, only registered: 0/30, only registered with preference (fields directly associated with the programme): 0/30
- Fields of study / plans the course is directly associated with
- Applied Mathematics (programme PřF, N-APM) (2)
- Course objectives
- The main goal is to give students hands-on experience with data analysis and machine learning methods, and to deepen their programming skills.
- Learning outcomes
- This course will enable students to
- predict a dependent variable with linear or logistic regression
- examine unknown data using Principal Component Analysis and/or clustering
- split data into training and testing sets and understand the bias-variance trade-off
- use classification and regression trees, forests, bagging and boosting (XGBoost, LightGBM, CatBoost)
- learn the basics of TensorFlow 2.0 and Keras, applying neural networks and fine-tuning to image and NLP data
- work with large language models
- use recommendation algorithms (collaborative filtering)
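To make the first and third outcomes concrete, here is a minimal sketch of a train/test split followed by a simple linear regression, written in plain Python with no external libraries. The data are synthetic and the helper names (`train_test_split`, `fit_simple_linear_regression`, `mean_squared_error`) are hypothetical, chosen only for this illustration. It is not taken from the course materials, which may well rely on dedicated libraries instead.

```python
import random


def train_test_split(rows, test_fraction=0.25, seed=0):
    """Shuffle a copy of rows and split it into (train, test) lists."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def fit_simple_linear_regression(rows):
    """Least-squares fit of y = intercept + slope * x for (x, y) pairs."""
    n = len(rows)
    mean_x = sum(x for x, _ in rows) / n
    mean_y = sum(y for _, y in rows) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in rows)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in rows)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def mean_squared_error(rows, intercept, slope):
    """Average squared difference between observed and predicted y."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in rows) / len(rows)


# Synthetic data (made up for illustration): y = 2x + 1 plus Gaussian noise.
noise = random.Random(42)
data = [(float(x), 2.0 * x + 1.0 + noise.gauss(0.0, 0.5)) for x in range(40)]

# Fit on the training part only, then evaluate on the held-out test part.
train, test = train_test_split(data, test_fraction=0.25, seed=0)
intercept, slope = fit_simple_linear_regression(train)
test_mse = mean_squared_error(test, intercept, slope)
print(f"slope={slope:.2f} intercept={intercept:.2f} test MSE={test_mse:.3f}")
```

Evaluating the error on rows the model never saw during fitting is what makes the estimate an honest measure of predictive performance; computing it on the training rows would tend to be too optimistic.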
As a by-product, students taking this course will also practice
- data cleaning
- visualizations
- data transformation (group by, summary)
- working with Git and GitHub
- reproducible analysis and documents (Jupyter notebook, Markdown, Quarto)
- social skills, working in groups
- Syllabus
- Details can be found on GitHub: https://github.com/simecek/dspracticum2023
- Teaching methods
- Each lecture focuses on one dataset and problem, which we use to demonstrate a new data science skill. Students are expected to submit homework before each lecture.
- Assessment methods
- Group homework (in groups of 2-4 students), plus an optional individual final project worth an extra 30%. To pass, you must earn at least 70% of the points.
- Language of instruction
- Czech
- Further Comments
- Study Materials
- Teacher's information
- https://github.com/simecek/dspracticum2023
- Permalink: https://is.muni.cz/course/sci/autumn2023/M7DataSP