Have some function \(J(\theta_0,\theta_1)\)
Want \(\min_{\theta_0,\theta_1} J(\theta_0,\theta_1)\)
Outline:
- Start with some \(\theta_0, \theta_1\)
- Keep changing \(\theta_0, \theta_1\) to reduce \(J(\theta_0,\theta_1)\) until we hopefully end up at a minimum
Gradient descent algorithm
repeat until convergence {
\(\theta_j := \theta_j - \alpha\frac{\partial}{\partial \theta_j}J(\theta_0,\theta_1)\) (for \(j = 0\) and \(j = 1\))
}
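A minimal Python sketch of the loop above, under stated assumptions: the quadratic `J` and its gradient `grad_J` are hypothetical examples chosen only because their minimum is known, `alpha` plays the role of \(\alpha\) (the step size), and "convergence" is taken to mean that both parameters change by less than a small tolerance `tol`. None of these choices come from the notes.

```python
def gradient_descent(grad, theta0, theta1, alpha=0.1, tol=1e-8, max_iters=10_000):
    """Repeat the update rule until both parameters stop changing (assumed criterion)."""
    for _ in range(max_iters):
        # Evaluate both partial derivatives at the current (theta0, theta1).
        g0, g1 = grad(theta0, theta1)
        new0 = theta0 - alpha * g0
        new1 = theta1 - alpha * g1
        converged = abs(new0 - theta0) < tol and abs(new1 - theta1) < tol
        theta0, theta1 = new0, new1
        if converged:
            break
    return theta0, theta1


# Hypothetical example function with its minimum at (1, -2).
def J(theta0, theta1):
    return (theta0 - 1.0) ** 2 + (theta1 + 2.0) ** 2


def grad_J(theta0, theta1):
    # Partial derivatives of J with respect to theta0 and theta1.
    return 2.0 * (theta0 - 1.0), 2.0 * (theta1 + 2.0)


theta0, theta1 = gradient_descent(grad_J, theta0=0.0, theta1=0.0)
print(theta0, theta1)  # close to 1 and -2
```

The `max_iters` cap is a safeguard added for the sketch: with a step size that is too large the iterates may never settle, and the loop would otherwise run forever.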
Correct: Simultaneous update
temp0 \(:= \theta_0 - \alpha\frac{\partial}{\partial \theta_0}J(\theta_0,\theta_1)\)
temp1 \(:= \theta_1 - \alpha\frac{\partial}{\partial \theta_1}J(\theta_0,\theta_1)\)
\(\theta_0 :=\) temp0
\(\theta_1 :=\) temp1
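Incorrect: Sequential update (\(\theta_0\) is overwritten before the derivative for \(\theta_1\) is computed, so that derivative is evaluated at the new \(\theta_0\))
temp0 \(:= \theta_0 - \alpha\frac{\partial}{\partial \theta_0}J(\theta_0,\theta_1)\)
\(\theta_0 :=\) temp0
temp1 \(:= \theta_1 - \alpha\frac{\partial}{\partial \theta_1}J(\theta_0,\theta_1)\)
\(\theta_1 :=\) temp1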
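To see that the two orderings really give different results, here is a small Python sketch that takes one step each way. The function \(J(\theta_0,\theta_1) = \theta_0^2 + \theta_0\theta_1 + \theta_1^2\) is a hypothetical example, picked because each partial derivative depends on both parameters; the helper names `d_theta0`, `d_theta1`, `simultaneous_step`, and `sequential_step` are illustrative, not from the notes.

```python
# Partial derivatives of the assumed example J = theta0^2 + theta0*theta1 + theta1^2.
def d_theta0(theta0, theta1):
    return 2.0 * theta0 + theta1


def d_theta1(theta0, theta1):
    return theta0 + 2.0 * theta1


def simultaneous_step(theta0, theta1, alpha):
    # Correct: both temporaries are computed from the old (theta0, theta1).
    temp0 = theta0 - alpha * d_theta0(theta0, theta1)
    temp1 = theta1 - alpha * d_theta1(theta0, theta1)
    return temp0, temp1


def sequential_step(theta0, theta1, alpha):
    # Incorrect: theta0 is overwritten first, so d_theta1 sees the new theta0.
    temp0 = theta0 - alpha * d_theta0(theta0, theta1)
    theta0 = temp0
    temp1 = theta1 - alpha * d_theta1(theta0, theta1)
    theta1 = temp1
    return theta0, theta1


sim = simultaneous_step(1.0, 1.0, alpha=0.1)
seq = sequential_step(1.0, 1.0, alpha=0.1)
print(sim)  # roughly (0.7, 0.7)
print(seq)  # roughly (0.7, 0.73): theta1 differs from the simultaneous result
```

Starting from \((1, 1)\) with \(\alpha = 0.1\), both versions agree on \(\theta_0\), but the sequential version computes \(\theta_1\) from a derivative of about 2.7 instead of 3, because \(\theta_0\) has already moved to about 0.7.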