Statistical Learning - A Handwritten Note


I have simplified and written out by hand the ten major algorithms from the statistical learning methods book (although I think the book itself is already quite concise). For now the notes cover only the algorithm procedures themselves; if I come to any new understanding in the future, I will supplement them. The handwriting is ugly, so ugly that even I can't bear to look at it, so I'm posting this purely as a backup.

Abstract

[handwritten notes: i0HLjK.jpg, i0Hb1x.jpg, i0HoN9.jpg]

Perceptron

[handwritten notes: i0HH91.jpg, i0HThR.jpg]

k-Nearest Neighbors

[handwritten notes: i0Hqc6.jpg, i0HXnO.jpg]

Naive Bayes

[handwritten note: i0HjBD.jpg]

Decision Tree

  • GBDT is written up under the boosting methods section below. It is also worth branching out to random forests, a bagging method that builds an ensemble of decision trees on bootstrap samples (a minimal sketch follows the images below).
[handwritten notes: i0HzAH.jpg, i0HvHe.jpg, i0bP3t.jpg, i0bSNd.jpg]
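
As a minimal illustration of that bagging idea (bootstrap resamples of the training set, one decision tree per resample, majority vote to aggregate), here is a sketch; the scikit-learn dataset and every parameter choice are my own assumptions, not from the notes:

```python
# Minimal sketch of bagging decision trees (the core of a random forest):
# train each tree on a bootstrap resample, then aggregate by majority vote.
# Dataset and parameters are arbitrary, chosen only for illustration.
import numpy as np
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
rng = np.random.default_rng(0)

trees = []
for _ in range(25):
    idx = rng.integers(0, len(X), size=len(X))   # bootstrap: sample with replacement
    # max_features="sqrt" adds the random-forest twist of random feature subsets.
    tree = DecisionTreeClassifier(max_features="sqrt", random_state=0)
    trees.append(tree.fit(X[idx], y[idx]))

# Majority vote across the ensemble.
votes = np.stack([t.predict(X) for t in trees])          # (n_trees, n_samples)
pred = np.apply_along_axis(lambda v: np.bincount(v).argmax(), 0, votes)
print("training accuracy:", (pred == y).mean())
```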

Logistic Regression and Maximum Entropy

  • The mutual derivation between maximum entropy and logistic regression is still to be supplemented; one direction is sketched after the images below.
[handwritten notes: i0bigP.jpg, i0bp4A.jpg]
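
Pending that write-up, a minimal sketch of one direction (maximum entropy implies the logistic form), in standard notation of my own rather than anything from the notes:

```latex
% Maximize conditional entropy over p(y|x) subject to feature constraints:
%   \max_p H(p) = -\sum_{x,y} \tilde{p}(x)\, p(y \mid x) \log p(y \mid x)
%   \text{s.t.}  \mathbb{E}_p[f_i] = \mathbb{E}_{\tilde{p}}[f_i] \text{ for all } i.
% Solving the Lagrangian yields the exponential (log-linear) form:
p_w(y \mid x) = \frac{\exp\left(\sum_i w_i f_i(x, y)\right)}
                     {\sum_{y'} \exp\left(\sum_i w_i f_i(x, y')\right)}
% For binary y \in \{0, 1\} with features f_i(x, y) = x_i \cdot \mathbb{1}[y = 1]:
%   p_w(y = 1 \mid x) = \frac{e^{w \cdot x}}{1 + e^{w \cdot x}} = \sigma(w \cdot x),
% which is exactly the logistic regression model.
```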

Support Vector Machine

[handwritten notes: i0bFjf.jpg, i0bAu8.jpg, i0bEDS.jpg, i0beEQ.jpg, i0bVHg.jpg, i0bmNj.jpg]

Boosting Methods

  • XGBoost
[handwritten notes: i0bn4s.jpg, i0bKCn.jpg]

Expectation-Maximization (EM) algorithm

[handwritten notes: i0bM3q.jpg, i0bQg0.jpg]
  • When fitting a Gaussian mixture model with the EM algorithm, the parameters to be inferred are the mean, variance, and mixing coefficient of each of the k Gaussian components. The latent variable is the probability that the j-th observation belongs to the k-th component, called the responsibility. \(n_k\) is the sum of the responsibilities of the k-th component over all samples; \(n_k / N\) gives the updated mixing coefficient, and \(n_k\) also normalizes the mean update. The mean is updated as the responsibility-weighted average of the samples, and the variance is updated in the same responsibility-weighted way.
  • After updating the parameters (the M-step), recompute the responsibilities with the new parameters (the E-step), and then run the next M-step, completing one iteration; a sketch of this loop follows.
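
A minimal 1-D sketch of that loop, assuming nothing beyond NumPy; the function and variable names are mine, not from the notes:

```python
# EM for a 1-D Gaussian mixture, following the update rules above.
import numpy as np

def gmm_em(x, k, n_iter=100, seed=0):
    rng = np.random.default_rng(seed)
    n = len(x)
    # Initialize: random sample means, unit variances, uniform mixing.
    mu = rng.choice(x, size=k, replace=False)
    var = np.ones(k)
    pi = np.full(k, 1.0 / k)
    for _ in range(n_iter):
        # E-step: responsibility r[j, c] = P(component c | sample j).
        dens = np.exp(-0.5 * (x[:, None] - mu) ** 2 / var) / np.sqrt(2 * np.pi * var)
        r = pi * dens
        r /= r.sum(axis=1, keepdims=True)
        # M-step: n_k is the total responsibility of each component.
        nk = r.sum(axis=0)
        pi = nk / n                                          # mixing coefficients
        mu = (r * x[:, None]).sum(axis=0) / nk               # weighted means
        var = (r * (x[:, None] - mu) ** 2).sum(axis=0) / nk  # weighted variances
    return pi, mu, var

# Usage: two well-separated clusters.
x = np.concatenate([np.random.normal(-2, 1, 300), np.random.normal(3, 0.5, 200)])
print(gmm_em(x, k=2))
```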

Hidden Markov Model

[handwritten notes: i0b3uT.jpg, i0blvV.jpg, i0b8DU.jpg, i0bGbF.jpg]

Conditional Random Field

  • The notes address the solutions for the three classic types of problems (probability computation, learning, and prediction); since the conditional random field is the conditional extension of the hidden Markov model, the algorithms are similar (a decoding sketch follows below).
[handwritten note: i0bYE4.jpg]
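
For the prediction (decoding) problem, a minimal Viterbi sketch for an HMM; CRF decoding adapts the same dynamic program with potentials in place of probabilities. The names and the toy model are illustrative, not from the notes:

```python
# Viterbi decoding: most likely hidden state path for an HMM.
import numpy as np

def viterbi(pi, A, B, obs):
    """pi: initial probs (S,), A: transitions (S, S), B: emissions (S, O),
    obs: observation indices (T,). Returns the most likely state path."""
    T, S = len(obs), len(pi)
    delta = np.log(pi) + np.log(B[:, obs[0]])   # best log-prob ending in each state
    psi = np.zeros((T, S), dtype=int)           # backpointers
    for t in range(1, T):
        scores = delta[:, None] + np.log(A)     # scores[i, j]: come from i, go to j
        psi[t] = scores.argmax(axis=0)
        delta = scores.max(axis=0) + np.log(B[:, obs[t]])
    # Backtrack from the best final state.
    path = [int(delta.argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(psi[t][path[-1]]))
    return path[::-1]

# Toy usage: 2 states, 2 observation symbols.
pi = np.array([0.6, 0.4])
A = np.array([[0.7, 0.3], [0.4, 0.6]])
B = np.array([[0.9, 0.1], [0.2, 0.8]])
print(viterbi(pi, A, B, obs=[0, 1, 1, 0]))
```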