IB031: Introduction to Machine Learning
Tomáš Brázdil, Luboš Popelínský, Jaroslav Čechák & his boys

Who, with whom, about what, why

Outline
► Supervised learning
► Learning decision trees
► Evaluation
► Pre-processing
► Clustering; Lazy learning
► Anomaly detection
► On machine learning theory
► Probabilistic classifiers
► Linear models
► Kernel methods
► Neural nets
Teaching materials: ISMU

Organization
► lectures
► seminars, 2 h
► project
► in-semester exam
► written exam

Final grading
► in-semester exam: 25 points
► project: 30 points (min. 15 points)
► final exam: 45 points (min. 15 points)
► <50 F, <60 E, <70 D, <80 C, <90 B, >=90 A
► 40: zápočet (course credit)

What is machine learning
Herbert Simon (1960s): "Learning is any process by which a system improves performance from experience."
Tom Mitchell (1990s, paraphrased): "Learning aims to improve task, T, with respect to performance metric, P, based on experience, E."

Examples
T: Playing checkers
P: Percentage of games won against an arbitrary opponent
E: Playing practice games against itself

T: Recognizing hand-written words
P: Percentage of words correctly classified
E: Database of human-labeled images of handwritten words

T: Driving on four-lane highways using vision sensors
P: Average distance traveled before a human-judged error
E: A sequence of images and steering commands recorded while observing a human driver.

T: Categorize email messages as spam or legitimate.
P: Percentage of email messages correctly classified.
E: Database of emails, some with human-given labels

Further examples?
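Mitchell's T / P / E framing can be made concrete with a toy program. The sketch below is only an illustration, not part of the course materials. The task T (deciding whether a number lies above a hidden cut-off), the learner (a midpoint threshold), and all names such as `hidden_rule`, `learn_threshold`, and `EXPERIENCE` are assumptions invented for this example.

```python
from typing import List, Tuple

Example = Tuple[float, int]  # (feature value, label 0 or 1)


def hidden_rule(x: float) -> int:
    """Hypothetical ground truth for the toy task T: is x at least 50?"""
    return 1 if x >= 50 else 0


def learn_threshold(experience: List[Example]) -> float:
    """Learn from E: midpoint between the largest negative and the
    smallest positive example seen so far (needs both classes present)."""
    largest_negative = max(x for x, label in experience if label == 0)
    smallest_positive = min(x for x, label in experience if label == 1)
    return (largest_negative + smallest_positive) / 2


def performance(threshold: float, test_inputs: List[float]) -> float:
    """P: fraction of test inputs classified correctly."""
    correct = sum(
        (1 if x >= threshold else 0) == hidden_rule(x) for x in test_inputs
    )
    return correct / len(test_inputs)


# E: a hand-picked, made-up sequence of labeled examples.
EXPERIENCE: List[Example] = [
    (10, 0), (60, 1), (30, 0), (80, 1),
    (44, 0), (52, 1), (49, 0), (50, 1),
]
TEST_INPUTS: List[float] = list(range(100))


def learning_curve() -> List[Tuple[int, float]]:
    """Measure P after the learner has seen the first 2, 4, 6, 8 examples."""
    return [
        (n, performance(learn_threshold(EXPERIENCE[:n]), TEST_INPUTS))
        for n in range(2, len(EXPERIENCE) + 1, 2)
    ]


if __name__ == "__main__":
    for n, p in learning_curve():
        print(f"E = {n} labeled examples -> P = {p:.2f}")
```

On this hand-picked data the measured accuracy rises as more labeled examples are used. That is the sense in which the system "improves performance from experience". With other data the curve need not rise at every step.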

Task classes
► clustering
► classification and prediction
► association mining
► anomaly detection

History (see Ray Mooney's ML course slides)
► 1950s: Alan Turing and NP-hard problems; Samuel's checker player
► 1960s: Neural networks: Perceptron; Pattern recognition; Learning in the limit theory; Minsky and Papert prove limitations of Perceptron
► 1970s: Symbolic concept induction; Winston's arch learner; Expert systems and the knowledge acquisition bottleneck; Scientific discovery with BACON and AM (math); Quinlan's ID3; Michalski's AQ
► 1980s: Advanced decision tree and rule learning; Learning and planning and problem solving; Resurgence of neural networks (connectionism, backpropagation); Valiant's PAC Learning Theory; Focus on experimental methodology
► 1990s: Data mining; Text learning; Reinforcement learning (RL); Inductive Logic Programming (ILP); Ensembles: Bagging, Boosting, and Stacking; Bayes Net learning; Web mining; Weka
► 2000s: Support vector machines, Kernel methods; Statistical relational learning; Graph and sequence mining, link learning; Privacy-preserving data mining; Security (intrusion, virus, and worm detection); Recommender systems; Personalized assistants that I; Visual data mining; Stream mining; RapidMiner; R for machine learning
► 2006: Deep learning
► 2010s: KNIME; Big data, Big data, Big data ...; Outlier detection and explanation; Automated machine learning; Deep learning in practice