Blog - Exploring regression on the El Niño data


Regression using an $l_{1/2}$ prior

We can write the regression cost functions, with $l_2$, $l_1$ and $l_{1/2}$ priors respectively, using explicit summations as

cost_2 = \sum_{e=1}^{E} \left( \sum_{i=1}^{P} M^{(e)}_{i} x_{i} - y^{(e)} \right)^2 + \lambda \sum_{i=1}^{P} |x_i|^2
cost_1 = \sum_{e=1}^{E} \left( \sum_{i=1}^{P} M^{(e)}_{i} x_{i} - y^{(e)} \right)^2 + \lambda \sum_{i=1}^{P} |x_i|
cost_{1/2} = \sum_{e=1}^{E} \left( \sum_{i=1}^{P} M^{(e)}_{i} x_{i} - y^{(e)} \right)^2 + \lambda \sum_{i=1}^{P} \sqrt{|x_i|}
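
For concreteness, here is a minimal sketch of these three costs in Python/NumPy; the function name `regression_costs` and the array shapes are our choices for illustration, not from the post.

```python
import numpy as np

def regression_costs(M, x, y, lam):
    """Evaluate cost_2, cost_1 and cost_{1/2} for coefficients x.

    M   : (E, P) array with M[e, i] = M^{(e)}_i
    x   : (P,)  coefficient vector
    y   : (E,)  target vector
    lam : regularisation weight lambda
    """
    fit = np.sum((M @ x - y) ** 2)                       # shared least-squares term
    cost_2 = fit + lam * np.sum(np.abs(x) ** 2)          # l_2 (ridge) prior
    cost_1 = fit + lam * np.sum(np.abs(x))               # l_1 (lasso) prior
    cost_half = fit + lam * np.sum(np.sqrt(np.abs(x)))   # l_{1/2} prior
    return cost_2, cost_1, cost_half
```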

Regression regularization paths

If we restrict $cost_{1/2}$ to one variable from these coefficient vectors and denote this variable by $x$, we get

\frac{\partial\, cost}{\partial x} = A x + B + \frac{\lambda\, \mathrm{sgn}(x)}{2\sqrt{|x|}} = 0

where $A$ and $B$ don't depend on $x$. If we denote $\lambda\, \mathrm{sgn}(x)/2$ by $C$, we can multiply through by $y = \sqrt{|x|}$ to find that the minimum (along this co-ordinate) satisfies

\pm A y^3 + B y + C = 0

where the $\pm$ comes from substituting $x = \pm y^2$ according to whether $x$ is positive or negative, and any solutions in $y$ must be checked for consistency with the original equation. Since this is a cubic equation, its roots have a simple closed form, so we can solve the original equation along each co-ordinate efficiently.
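
To make the update concrete, here is a minimal sketch in NumPy of minimising the one-dimensional cost along a single co-ordinate by solving the two cubics, one per sign of $x$. The function name `coordinate_minimum` and the normalisation $A = 2\sum_e (M^{(e)}_i)^2$, $B = 2\sum_e M^{(e)}_i \big(\sum_{j \ne i} M^{(e)}_j x_j - y^{(e)}\big)$ are our assumptions about what $A$ and $B$ collect, not from the post.

```python
import numpy as np

def coordinate_minimum(A, B, lam):
    """Minimise f(x) = (A/2) x^2 + B x + lam * sqrt(|x|) over a single x.

    A and B are assumed normalised so that
    f'(x) = A x + B + lam * sgn(x) / (2 sqrt(|x|)), with A > 0.
    """
    def f(x):
        return 0.5 * A * x ** 2 + B * x + lam * np.sqrt(abs(x))

    # x = 0 is always a candidate: the sqrt prior is non-differentiable there.
    candidates = [0.0]

    # Positive branch: x = y^2 with y > 0 turns the stationarity condition
    #   A x + B + lam/(2 sqrt(x)) = 0  into the cubic  A y^3 + B y + lam/2 = 0.
    # Negative branch: x = -y^2 with y > 0 gives  -A y^3 + B y - lam/2 = 0.
    for sign in (+1.0, -1.0):
        for root in np.roots([sign * A, 0.0, B, sign * lam / 2.0]):
            # Keep only real, positive y: these are the solutions consistent
            # with the original equation on this branch.
            if abs(root.imag) < 1e-12 and root.real > 0.0:
                candidates.append(sign * root.real ** 2)

    # The minimum along this co-ordinate is the best of the candidates.
    return min(candidates, key=f)
```

Evaluating the candidates on the one-dimensional cost, rather than trusting the stationarity condition alone, implements the consistency check mentioned above and also handles the candidate $x = 0$. Cycling this update over all $P$ co-ordinates, recomputing $A$ and $B$ from the current residuals at each step, would give a coordinate-descent minimiser for $cost_{1/2}$.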