OCO
$$\mathrm{Regret}_T(\mathcal{A}) = \sup\left[\sum_{t=1}^{T} f_t(x_t^{\mathcal{A}}) - \min_{x} \sum_{t=1}^{T} f_t(x)\right]$$
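To make the bracketed quantity concrete, here is a minimal Python sketch that evaluates it for one fixed loss sequence. It does not compute the supremum over loss sequences, and it approximates the minimum over x by a finite list of candidate points. The names `empirical_regret`, `losses`, `plays`, and `candidates` are illustrative assumptions, not notation from the text.

```python
def empirical_regret(losses, plays, candidates):
    """Regret of a played sequence against the best fixed point in hindsight.

    losses     -- list of callables f_t
    plays      -- list of points x_t chosen by the algorithm, one per round
    candidates -- finite list of fixed points standing in for the min over x
                  (an assumption; the text does not restrict x to a finite set)
    """
    # Total loss actually incurred by the algorithm.
    incurred = sum(f(x) for f, x in zip(losses, plays))
    # Total loss of the best single candidate, chosen with hindsight.
    best_fixed = min(sum(f(x) for f in losses) for x in candidates)
    return incurred - best_fixed


# Two quadratic losses with minimizers at +1 and -1.
losses = [lambda x: (x - 1) ** 2, lambda x: (x + 1) ** 2]
candidates = [-1.0, 0.0, 1.0]
regret_centered = empirical_regret(losses, [0.0, 0.0], candidates)
regret_wrong = empirical_regret(losses, [-1.0, 1.0], candidates)
```

Playing 0 in both rounds matches the best fixed candidate, so the regret is zero; playing the wrong extreme each round costs strictly more.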
Theorem 1.2 Let ϵ∈(0,0.5). Suppose that the best expert makes L mistakes. Then:
$$a_t = \begin{cases} A, & W_t(A) \ge W_t(B) \\ B, & \text{otherwise} \end{cases}$$
$$W_{t+1}(i) = \begin{cases} W_t(i), & \text{if expert } i \text{ was correct} \\ W_t(i)(1-\epsilon), & \text{if expert } i \text{ was wrong} \end{cases}$$
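The decision rule and the multiplicative weight update can be sketched together in Python. This is a minimal illustration under stated assumptions: W_t(A) and W_t(B) are taken to be the total weight of the experts advising A and B respectively, all weights start at 1, and the function and argument names are hypothetical.

```python
def weighted_majority(advice, outcomes, eps):
    """Run the weighted-majority rule over a sequence of binary rounds.

    advice   -- advice[t][i] is expert i's prediction ("A" or "B") in round t
    outcomes -- outcomes[t] is the correct answer ("A" or "B") in round t
    eps      -- penalty parameter, assumed to lie in (0, 0.5)
    Returns (number of mistakes made, final expert weights).
    """
    weights = [1.0] * len(advice[0])  # assumed initialization W_1(i) = 1
    mistakes = 0
    for preds, truth in zip(advice, outcomes):
        # Total weight behind each option (assumed meaning of W_t(A), W_t(B)).
        w_a = sum(w for w, p in zip(weights, preds) if p == "A")
        w_b = sum(w for w, p in zip(weights, preds) if p == "B")
        choice = "A" if w_a >= w_b else "B"
        if choice != truth:
            mistakes += 1
        # Experts that were wrong are scaled by (1 - eps); the rest are unchanged.
        weights = [w if p == truth else w * (1 - eps)
                   for w, p in zip(weights, preds)]
    return mistakes, weights


mistakes, final_weights = weighted_majority(
    advice=[["A", "B"], ["A", "B"], ["B", "B"]],
    outcomes=["A", "A", "B"],
    eps=0.25,
)
```

In this run the first expert is never wrong and keeps weight 1, while the second expert is wrong twice and ends with weight (1 − 0.25)² = 0.5625.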
$$W_{t+1}(i) = W_t(i)\, e^{-\epsilon \ell_t(i)}$$
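The exponential update generalizes the correct/wrong penalty to real-valued losses. The Python sketch below applies it for one round. Normalizing the weights into a probability distribution over experts is an added assumption about how the weights are used, and the function names are hypothetical.

```python
import math


def exponential_update(weights, losses, eps):
    """Apply W_{t+1}(i) = W_t(i) * exp(-eps * loss_t(i)) to every expert."""
    return [w * math.exp(-eps * loss) for w, loss in zip(weights, losses)]


def to_distribution(weights):
    """Normalize weights so they sum to 1 (assumed use: sampling an expert)."""
    total = sum(weights)
    return [w / total for w in weights]


# One round with two experts: the first incurs no loss, the second loss 1.
updated = exponential_update([1.0, 1.0], [0.0, 1.0], eps=0.5)
probs = to_distribution(updated)
```

An expert with zero loss keeps its weight, and an expert with loss 1 is scaled by e^(−ϵ), which is close to the (1 − ϵ) factor when ϵ is small.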