course aims in Estonian
Õppeaine eesmärgiks on õpetada andmekaeve problemaatikat ja efektiivseid viise suurandmete analüüsiks.
course aims in English
To teach students about data mining and effective techniques for analyzing big data.
learning outcomes in the course in Estonian
Kursuse lõpetanu:
Oskab määratleda ja sõnastada andmekaeve probleeme.
Tunneb klassifikatsiooni, regressioonianalüüsi, klasterdamise ja dimensionaalsuse vähendamise meetodeid.
Oskab valida probleemi jaoks sobiva meetodi.
Oskab hinnata andmekaeve mudeli (tulemi) kvaliteeti.
Oskab teisendada andmeid analüüsiks sobivale kujule.
Oskab kasutada andmekaeve tarkvara.
Tunneb erinevaid andmekaeve näiteprobleeme äri- ja füüsiliste süsteemide vallast.
Tunneb suurte andmemahtude haldamise problemaatikat ja tarkvara.
learning outcomes in the course in English
After completing the course, the student:
Can define and formulate data mining problems.
Knows the methods of classification, regression, clustering and dimensionality reduction.
Can choose an appropriate method for a problem.
Can evaluate the quality of a data mining model (result).
Can transform data into a form suitable for analysis.
Can use data mining software.
Is familiar with various example data mining problems from the fields of business and physical systems.
Is familiar with the problems of managing large volumes of data and with the software used for it.
brief description of the course in Estonian
Klassifitseerivad mudelid, mis ennustavad objekti klassi. Regressiooni mudelid, mis ennustavad pidevat väärtust. Sarnaste objektide klasterdamine. Dimensionaalsuse vähendamine andmetes. Mudelite valimine ja hindamine. Andmete eeltöötlus. Näiteprobleemid äri- ja füüsiliste süsteemide vallast. Suurte andmemahtude haldamine. Teadusliku Pythoni (SciPy stack) kasutamine andmekaeves.
brief description of the course in English
Classification models that predict the class of an object. Regression models that predict a continuous value. Clustering of similar objects. Dimensionality reduction for multivariate data. Choosing and evaluating models. Data pre-processing. Example problems from the fields of business and physical systems. Managing big data. Using the scientific Python (SciPy) stack for data mining.
type of assessment in Estonian
Hinne pannakse semestri töö (50%) ja iseseisva andmekaeve projekti (50%) alusel.
type of assessment in English
The final grade is based on work during the semester (50%) and an independent data mining project (50%).
independent study in Estonian
Iseseisev töö hõlmab omavalitud andmestiku analüüsi. Tulemused tuleb esitada IPythoni märkmikuna (notebook).
independent study in English
The independent project is an analysis of a dataset chosen by the student. Results must be submitted as an IPython notebook.
study literature
Course homepage: https://moodle.taltech.ee/course/view.php?id=30058
Raschka, Sebastian. Python Machine Learning. Birmingham, UK: Packt Publishing, 2015.
Hauck, T. Scikit-learn Cookbook. Birmingham, UK: Packt Publishing, 2014.
Witten, Ian H.; Frank, Eibe; Hall, Mark A. Data Mining: Practical Machine Learning Tools and Techniques (3rd ed.). Elsevier, 2011.
study forms and load
daytime study: weekly hours
4.0
session-based study work load (in a semester):