Data Mining and Big Data Applications
BASIC DATA
course listing
A - main register
course code
EVM0520
course title in Estonian
Andmekaeve ja suurandmete rakendused
course title in English
Data Mining and Big Data Applications
course volume CP
-
ECTS credits
6.00
to be declared
yes
fully online course
no
assessment form
Graded assessment
teaching semester
autumn - spring
language of instruction
Estonian
English
Study programmes that contain the course
code of the study programme version
RATM24/25
course compulsory
no
Structural units teaching the course
EV - Virumaa College
VERSION SPECIFIC DATA
course aims in Estonian
Õppeaine eesmärk on:
- luua eeldused tööks suurte andmehulkadega;
- anda ülevaade kesksete masinõppe algoritmide tööpõhimõtetest;
- valmistada ette masinõppe mudelite rakendamiseks reaalses tööstuses;
course aims in English
The aim of this course is to:
- lay the foundations for working with large data sets;
- provide an overview of how the core machine learning algorithms work;
- prepare students to apply machine learning models in real industrial settings.
learning outcomes in the course in Est.
Õppeaine läbinud üliõpilane:
- lahendab tööstuse kompleksseid probleeme masinõppe ja andmekaeve meetoditega ning demonstreerib erinevate faktorite mõju pakutud lahendusele;
- rakendab tööstuse ja muid andmeid uudse ressursina, mille abil optimeerida tootmisprotsesse või ressursside juhtimist;
- hindab enda ja teiste arenguvajadusi tööstusandmete kasutamisel ning toetab teiste õppimist õpetades, juhendades ja/või muul viisil.
learning outcomes in the course in Eng.
After completing this course the student:
- solves complex industrial problems using machine learning and data mining methods and demonstrates the impact of various factors on the proposed solution;
- uses industrial and other data as a new resource for optimizing production processes or resource management;
- assesses their own and others' development needs in the use of industrial data and supports others' learning by teaching, mentoring and/or other means.
brief description of the course in Estonian
Kursus keskendub erinevate masinõppe ja andmekaeve algoritmide teoreetilisele tutvustusele ning praktilisele rakendamisele.
Käsitletakse:
- CRISP-DM: CRoss Industry Standard Process for Data Mining
- ennustamist ajaseeriate abil, ajaseeriate andmete ettevalmistamisest kuni mudelite genereerimiseni ja testimiseni;
- pildilise informatsiooni ettevalmistamist masinõppeks ja ennustamiseks, sealhulgas materjalide tuvastamisel levinud pildi filtrite kasutamist kui ka objektide tuvastamist;
- soovituste süsteemide loomist ja kasutaja profileerimise tehnikaid;
- suurte keelemudelite tööpõhimõtet ja sõnavektorite treenimist;
- erinevate masinõppe algoritmide optimeerimise viise;
- ansambelõpet;
- levinud probleeme andmete ettevalmistamisel ja ennustustulemuste tõlgendamisel.
brief description of the course in English
The course focuses on the theoretical introduction and practical application of various machine learning and data mining algorithms.
It covers:
- CRISP-DM (Cross-Industry Standard Process for Data Mining);
- forecasting with time series, from preparing time series data to building and testing models;
- preparing image data for machine learning and prediction, including image filters commonly used in material recognition, as well as object detection;
- building recommender systems and user profiling techniques;
- how large language models work and how word vectors are trained;
- ways of optimizing various machine learning algorithms;
- ensemble learning;
- common problems in data preparation and interpreting forecast results.
type of assessment in Estonian
Kursusel tehakse praktilised harjutused peamiste teemade kohta, mis annavad kokku kuni 50% lõpphindest ning lõpuprojekt (kuni 50%) ühel vabalt valitud teemal, mis sisaldab enda valitud või kogutud andmete eeltöötlust, treenimist, valideerimist ja rakendust ning selle protsessi raportit teadusartikli formaadis ning ettekannet viimasel kohtumisel toimuval lõpuseminaris.
type of assessment in English
The course includes practical exercises on the main topics, which together account for up to 50% of the final grade, and a final project (up to 50%) on a freely chosen topic. The project covers preprocessing of data that the student has selected or collected, model training, validation and application; a report on this process in the format of a scientific article; and a presentation at the final seminar held during the last meeting.
independent study in Estonian
-
independent study in English
-
study literature
Raschka, Sebastian. Python Machine Learning. Birmingham, UK: Packt Publishing, 2015.
Hauck, T. Scikit-learn Cookbook. Birmingham, UK: Packt Publishing, 2014.
Witten, Ian H.; Frank, Eibe; Hall, Mark A. Data Mining: Practical Machine Learning Tools and Techniques (3rd ed.). Elsevier, 2011.
Additional material will be published in Moodle during the course.
study forms and load
daytime study: weekly hours
4.0
lectures
2.0
practices
2.0
exercises
0.0
session-based study work load (in a semester):
lectures
16.0
practices
16.0
exercises
0.0
lecturer in charge
-
LECTURER SYLLABUS INFO
semester of studies
2025/2026 autumn
teaching lecturer / unit
Avar Pentel, EV - Virumaa College
language of instruction
Estonian
Extended syllabus
Course description in Estonian
Course description in English