Lecturer(s)
|
-
Bukovský Ivo, doc. Ing. Ph.D.
-
Hrubý Filip, Mgr. M.Sc.
-
Skrbek Miroslav, Ing. Ph.D.
|
Course content
|
1. Introduction, goals of data mining, knowledge mining process
2. Data sources, data types, methods and formats for data storage
3. Statistics: mean, variance, median, correlation, normal distribution
4. Advanced data mining tool, basic principles, simple project creation
5. Preprocessing: data normalization, feature extraction from data, text documents, web pages and images
6. Dimensionality reduction, Principal Component Analysis, feature ranking and feature selection
7. Similarity measures and cluster analysis
8. Simple models of data: linear and logistic regression
9. Data modeling: decision trees, association rules
10. Classifiers: k-NN, Naive Bayes classifier
11. Model evaluation and testing
12. Advanced modeling methods
13. Result interpretation and reporting
A set of practical tasks covering the lecture topics.
|
Learning activities and teaching methods
|
Monologic (reading, lecture, briefing), Work with multi-media resources (texts, internet, IT technologies), Project-based learning, Practical training, Case studies
- Preparation for classes: 20 hours per semester
- Semester paper: 24 hours per semester
- Class attendance: 56 hours per semester
- Preparation for exam: 25 hours per semester
|
Learning outcomes
|
The aim of the course is to teach students the basics of data mining with a focus on bioinformatics. The course covers the complete data mining process: data acquisition, data pre-processing, data analysis, knowledge extraction, data visualization and reporting. Students will learn the most commonly used principles and algorithms. In exercises, students will acquire practical data mining skills using simple table-type tools and a sophisticated data mining tool.
Students gain the ability to work with data on a PC, to apply data mining and mathematical analysis, and to solve problems using programming tools.
|
Prerequisites
|
unspecified
|
Assessment methods and criteria
|
Test, Seminar work
Complete the semester tasks, pass the exam test, and earn at least 50% of all available points.
|
Recommended literature
|
-
Berka, P. Dobývání znalostí z databází. Academia, 2003. ISBN 80-200-1062-9.
-
Ethem Alpaydin. Introduction to Machine Learning. The MIT Press; fourth edition, March 24, 2020. ISBN 978-0262043793.
-
Kris Jamsa. Introduction to Data Mining and Analytics. Jones & Bartlett Learning, February 17, 2020. ISBN 978-1284180909.
-
Pang-Ning Tan, Michael Steinbach, Anuj Karpatne, Vipin Kumar. Introduction to Data Mining (2nd edition). 2018. ISBN 978-0133128901.
|