M8DM1 Data mining I

Faculty of Science
Spring 2019
Extent and Intensity
2/2/0. 4 credit(s) (fasci plus compl plus > 4). Type of Completion: zk (examination).
Teacher(s)
RNDr. Radim Navrátil, Ph.D. (lecturer)
Mgr. Jan Böhm (seminar tutor)
Guaranteed by
doc. PaedDr. RNDr. Stanislav Katina, Ph.D.
Department of Mathematics and Statistics – Departments – Faculty of Science
Contact Person: RNDr. Radim Navrátil, Ph.D.
Supplier department: Department of Mathematics and Statistics – Departments – Faculty of Science
Timetable
Mon 18. 2. to Fri 17. 5. Thu 14:00–15:50 M1,01017
  • Timetable of Seminar Groups:
M8DM1/01: Mon 18. 2. to Fri 17. 5. Thu 8:00–9:50 MP1,01014, R. Navrátil
M8DM1/02: Mon 18. 2. to Fri 17. 5. Thu 10:00–11:50 MP1,01014, R. Navrátil
M8DM1/03: Mon 18. 2. to Fri 17. 5. Thu 18:00–19:50 MP1,01014, J. Böhm
Prerequisites
Basics of linear algebra and matrix calculus.
Basic knowledge of mathematical modelling.
Basic knowledge of mathematical statistics.
Knowledge of linear regression models.
Course Enrolment Limitations
The course is also offered to the students of the fields other than those the course is directly associated with.
Course objectives
Data mining is a proven way to extract useful knowledge from data for decision making. The course introduces data mining: it defines the basic concepts and presents the methods and techniques used in practice. Students will gain a basic knowledge of these methods. In computer exercises they will learn to work with the statistical software SAS and apply the presented methods to real data.
Learning outcomes
After completing this course, students will be able to perform: (1) data preprocessing; (2) exploratory data analysis and data visualization; (3) descriptive data modeling; (4) predictive data modeling.
Syllabus
  • History of data mining, basic concepts, software.
  • Data organization.
  • Data preparation.
  • Exploratory analysis, visualization, contingency tables.
  • Dimension reduction - principal components, factor analysis, multidimensional scaling.
  • Market basket analysis.
  • Cluster analysis.
  • Linear regression, assumptions violation, robustification.
  • Logistic regression. Model evaluation - LC (ROC), Gini, KS, Lift.
  • Decision trees.
Literature
  • GIUDICI, Paolo. Applied data mining : statistical methods for business and industry. Chichester: Wiley, 2003, xii, 364. ISBN 0470846798.
  • HAN, Jiawei and Micheline KAMBER. Data mining : concepts and techniques. 2nd ed. San Francisco, CA: Morgan Kaufmann, 2006, xxviii, 77. ISBN 1558609016.
  • HAND, D. J., Heikki MANNILA and Padhraic SMYTH. Principles of data mining. Cambridge, Mass.: MIT Press, 2001, xxxii, 546. ISBN 026208290X.
  • Business modeling and data mining. Edited by Dorian Pyle. Boston: Morgan Kaufmann Publishers, 2003, xxvi, 693. ISBN 155860653X.
  • Data mining and knowledge discovery handbook. Edited by Oded Z. Maimon - Lior Rokach. New York: Springer, 2005, xxxv, 1383. ISBN 0387244352.
Teaching methods
Lectures - gaining knowledge of data mining techniques. Exercises - practising data mining techniques with the statistical software SAS.
Assessment methods
Computer test on exercises - at least 50% of the points are needed to pass. Oral exam - at least 50% of correct answers and a correctly solved project are needed to pass.
Language of instruction
Czech
Further Comments
The course is taught annually.
The course is also listed under the following terms: Spring 2011 - only for the accreditation, Spring 2011, Spring 2012, Spring 2012 - accreditation, Spring 2013, Spring 2014, Spring 2015, Spring 2016, Spring 2017, Spring 2018, Spring 2020, Spring 2021, Spring 2022, Spring 2023, Spring 2024, Spring 2025.
  • Permalink: https://is.muni.cz/course/sci/spring2019/M8DM1