Lecture27

.pdf
STAT 154/254: Online Learning, Bandits and Reinforcement Learning Instructor: Nikita Zhivotovskiy Graduate Student Instructor: Saptarshi Chakraborty These slides are work in progress. Please, report your instructors about typos and errors. Zhivotovskiy STAT 154/254: Online Learning, Bandits and Reinforcement Learning 1 / 1
Online Learning Online learning is a type of machine learning where the model is updated incrementally with each new data point Useful in situations where data is arriving sequentially and cannot be processed in a batch fashion Examples of online algorithms we already know: Perceptron algorithm. SGD (Stochastic Gradient Descent) Zhivotovskiy STAT 154/254: Online Learning, Bandits and Reinforcement Learning 2 / 1
Halving Algorithm Binary classification without noise. The halving algorithm is an online learning algorithm for binary classification problems Assumes there is a finite class F and true labeling y t = f ( x t ) , where f ∈ F and is unknown (and unique). The algorithm proceeds as follows: 1 Initialize the set of candidate functions C = F 2 For each new data point ( x t , y t ) : 1 Make a prediction using majority vote of the classifiers in C 2 Update C by removing classifier that make incorrect predictions on ( x t , y t ) 3 Continue until C contains only one classifier The halving algorithm has a mistake bound of log 2 |F| . Zhivotovskiy STAT 154/254: Online Learning, Bandits and Reinforcement Learning 3 / 1
Page1of 15
Uploaded by ChiefCrownGrouse38 on coursehero.com