Course: Advanced data storages and analyses

Course title Advanced data storages and analyses
Course code UAI/504
Organizational form of instruction Lecture + Lesson
Level of course Master
Year of study not specified
Frequency of the course In each academic year, in the winter semester.
Semester Winter
Number of ECTS credits 6
Language of instruction English
Status of course Compulsory
Form of instruction Face-to-face
Work placements This is not an internship
Recommended optional programme components None
Course availability The course is available to visiting students
Lecturer(s)
  • Geyer Jakub, Mgr.
  • Bukovský Ivo, doc. Ing. Ph.D.
  • Prokýšek Miloš, PhDr. Ph.D.
  • Budík Ondřej, Ing.
Course content
1. Relational and NoSQL data storages
2. Data warehouse
    a. Star, Snowflake and Data Vault patterns
    b. ETL, OLAP, OLTP
3. Distributed database systems
    a. CAP theorem
    b. Master-slave, mirroring, sharding
4. NoSQL database systems
    a. Key-value
    b. Column-oriented
    c. Document databases
    d. Graph databases
    e. Time-series databases
5. Large datasets
    a. Velocity, variability, volume
    b. Unstructured data
    c. ELT processing, curated data
6. Stream data processing
    a. Buffering
    b. Distribution
    c. Storing
    d. Real-time processing
7. Data mining
    a. Data sources and datatypes
    b. Data matrix
    c. Data storages
8. Similarity measurement, methods of cluster analysis
9. Basic data models
    a. Linear and log-linear regression
10. Data modelling
    a. Decision trees, association rules
11. Classification
    a. k-NN
    b. Naive Bayes classifier
12. Data lakes
    a. Distributed filesystems
    b. Hadoop-family solutions

Learning activities and teaching methods
Monologic (reading, lecture, briefing)
  • Class attendance - 56 hours per semester
  • Preparation for classes - 56 hours per semester
  • Semestral paper - 20 hours per semester
  • Preparation for exam - 20 hours per semester
Learning outcomes
The aim of the course is to deepen students' knowledge of data storage and data processing techniques. The course focuses on big data processing techniques, data storage in non-relational databases, and data analysis and mining.
Knowledge of advanced architectures and methods for data processing.
Prerequisites
Knowledge of relational databases and basic knowledge of query and programming languages.

Assessment methods and criteria
Oral examination

Semestral test: practical test (data processing and analyses) at the end of the semester (credit week), offered on 2 dates (i.e., max. 2 attempts). Exam: oral examination with two theoretical topics. The student must answer each question at least satisfactorily.
Recommended literature
  • A. GORELIK. The Enterprise Big Data Lake: Delivering the Promise of Big Data and Data Science, 1st Edition, O'Reilly Media 2019, ISBN: 978-1491931554.
  • C. CHURCHER. Beginning Database Design: From Novice to Professional. 1st Corrected ed., Apress 2007. ISBN: 978-1590597699.
  • J. GRUS. Data Science from Scratch: First Principles with Python, 2nd Edition, O'Reilly Media 2019, ISBN: 978-1492041139.
  • P.-N. TAN, M. STEINBACH, A. KARPATNE, V. KUMAR. Introduction to Data Mining (2nd edition), 2018. ISBN 978-0133128901.

