Lecturer(s)
|
|
Course content
|
Lectures:
1. The concept of data mining
2. Statistical methods used in data mining
3. Applying statistical methods to examples
4. Linear and nonlinear regression
5. Multidimensional models, categorical variables, interaction terms
6. Supervised learning, decision trees
7. Naive Bayes classifier
8. The k-nearest neighbours method
9. Association rules (market basket analysis), decision rules
10. Cluster analysis: different methods and approaches
11. Simple neural networks
12. Convolutional neural networks, classification and prediction from images
13. Solving simple data mining tasks using different methods
14. Combining models, presentation of seminar papers
|
Learning activities and teaching methods
|
Monologic (reading, lecture, briefing), Demonstration, E-learning
- Class attendance
- 42 hours per semester
- Preparation for classes
- 46 hours per semester
- Preparation for exam
- 45 hours per semester
- Preparation for credit
- 35 hours per semester
|
Learning outcomes
|
The course aims to familiarize students with the different types of enterprise data resources and with the use of collected data in different models. Students gain experience with the main areas of data mining, including its basic terminology, and an overview of machine learning methods. The methods are explained on simple examples and demonstrated through calculations in Excel and through simple Python programs, where existing libraries can be used for modelling and validation.
Students acquire the skills needed to use data mining tools. They will understand the theoretical foundations of data mining and know how to apply them in practice.
|
Prerequisites
|
Prerequisites: KMI/DBS1 Database Systems; KMI/TPS2 or KMI/TPS2A Theory of Probability and Statistics 2
|
Assessment methods and criteria
|
Combined exam, Seminar work
Requirements for students: Assessment is based on a single test (the exam test).
|
Recommended literature
|
-
Anděl, J. Statistické metody. 3rd ed. Praha: Matfyzpress, 2003. ISBN 80-86732-08-8.
-
Berka, P. Dobývání znalostí z databází. Praha: Academia, 2003. ISBN 80-200-1062-9.
-
Hendl, J. Přehled statistických metod zpracování dat. Praha: Portál, 2006. ISBN 80-7367-123-9.
-
Humphries, M., Hawkins, M. W., Dy, M. C. Data warehousing: návrh a implementace. Computer Press, 2002. ISBN 8072265601.
-
Lacko, M. Databáze: datové sklady, OLAP a dolování dat. Computer Press, 2003. ISBN 80-7226-969-0.
-
Venables, W. N., Ripley, B. D. Modern Applied Statistics with S. 4th ed. New York: Springer, 2002. ISBN 0-387-95457-0.
-
Weka 3. Data Mining Software in Java [online]. 1998.
|