Course Code
dmmlr
Duration
14 hours (usually 2 days including breaks)
Requirements
This course is part of the Data Scientist skill set (Domain: Analytical Techniques and Methods)
Overview
R is a free, open-source programming language for statistical computing, data analysis, and graphics. It is used by a growing number of managers and data analysts in business and academia. R has a wide range of packages for data mining.
Course Outline
Introduction to Data Mining and Machine Learning
- Statistical learning vs. Machine learning
- Iteration and evaluation
- Bias-Variance trade-off
Regression
- Linear regression
- Generalizations and Nonlinearity
- Exercises
Classification
- Bayesian refresher
- Naive Bayes
- Discriminant analysis
- Logistic regression
- K-Nearest neighbors
- Support Vector Machines
- Neural networks
- Decision trees
- Exercises
Cross-validation and Resampling
- Cross-validation approaches
- Bootstrap
- Exercises
Unsupervised Learning
- K-means clustering
- Examples
- Challenges of unsupervised learning and beyond K-means
Advanced topics
- Ensemble models
- Mixed models
- Boosting
- Examples
Dimensionality reduction
- Factor Analysis
- Principal Component Analysis
- Examples