Course notes for STA 325 -- Data Mining and Machine Learning
The course notes for STA 325 -- Data Mining and Machine Learning can be found below, along with the corresponding reading for each topic. Homework will be assigned in class.
- Intro to Machine Learning and Review of R. (Read Ch 1 of ISLR).
- Introduction to Statistical Computing in R. (This is supplemental to ISLR.) We will review R programming and also cover functional programming and mining textual data.
- Introduction to Statistical Machine Learning and Notation. (Chapters 1--2 of ISLR).
- Information Retrieval (Read Chapters 1--3 of Mining of Massive Datasets).
- Advanced Information Retrieval -- Locality Sensitive Hashing (Read Chapters 1--3 of Mining of Massive Datasets).
- Unsupervised Learning (Chapter 10 of ISLR). PCA, k-means clustering, hierarchical clustering, and evaluation metrics.
- Intro to the Book (Chapters 1–2 of ISLR).
- Linear regression, multivariate regression, and k-nearest neighbors regression. (Chapter 3 of ISLR).
- Classification: Logistic regression, LDA, and QDA. (Chapter 4 of ISLR).
- Resampling: Bootstrapping and Cross validation. (Chapter 5 of ISLR).
- Classification and Regression Trees (CART). (Chapter 8 of ISLR).
- Bagging, Boosting, and Random Forests (Chapter 8 of ISLR).
Class structure: Class content will follow the lecture notes/slides. Classes are held on Tuesdays and Thursdays. The Monday lab is an optional weekly review session taught by the class TA; any missed classes will be made up during this time. Please see the syllabus for full details regarding the class schedule, policies, exams, office hours, and other required components of the course.