Cost Function
Model Representation
m = Number of training examples
x's = "input" variable / features
y's = "output" variable / "target" variable
h stands for "hypothesis"; h represents a function that maps from x's to y's
How do we represent h?
Cost Function
- Goal: choose $\theta_0, \theta_1$ to minimize $J(\theta_0, \theta_1)$
- The cost function (also called the squared error function or squared error cost function) is the best approach for solving regression problems
Hypothesis: $h_\theta(x) = \theta_0 + \theta_1 x$
Cost Function: $J(\theta_0, \theta_1) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2$
Goal: $\min_{\theta_0, \theta_1} J(\theta_0, \theta_1)$
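To make the definitions concrete, here is a minimal Python/NumPy sketch (not from the original notes; the data values are made up) that evaluates the squared error cost on a toy training set:

```python
import numpy as np

def compute_cost(x, y, theta0, theta1):
    """Squared error cost J(theta0, theta1) for univariate linear regression."""
    m = len(x)                          # m = number of training examples
    predictions = theta0 + theta1 * x   # h_theta(x^(i)) for every example
    return np.sum((predictions - y) ** 2) / (2 * m)

# Toy training set (made-up values, for illustration only)
x = np.array([1.0, 2.0, 3.0])
y = np.array([1.0, 2.0, 3.0])

print(compute_cost(x, y, 0.0, 1.0))  # 0.0: the line y = x fits perfectly
print(compute_cost(x, y, 0.0, 0.5))  # ~0.583: a worse fit costs more
```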
Cost Function Intuition
Simplified: fix $\theta_0 = 0$, so $h_\theta(x) = \theta_1 x$ and the cost becomes a function of a single parameter, $J(\theta_1)$
For linear regression:
- With one parameter, the cost function $J(\theta_1)$ is a bow-shaped curve
- With two parameters, the 3D surface of $J(\theta_0, \theta_1)$ is bow-shaped as well, just like the 1D case
- Contour plot (contour figure): a 2D view of $J(\theta_0, \theta_1)$ in which every curve connects parameter pairs with equal cost
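The contour figure can be reproduced by evaluating $J$ on a grid of parameter values; a sketch, assuming the same toy data as above and matplotlib for plotting (the grid ranges are arbitrary):

```python
import numpy as np
import matplotlib.pyplot as plt

# Same toy training set as above
x = np.array([1.0, 2.0, 3.0])
y = np.array([1.0, 2.0, 3.0])
m = len(x)

# Grid of parameter values (ranges chosen arbitrarily for illustration)
theta0_vals = np.linspace(-2.0, 2.0, 100)
theta1_vals = np.linspace(-1.0, 3.0, 100)
T0, T1 = np.meshgrid(theta0_vals, theta1_vals)

# Evaluate J(theta0, theta1) at every grid point
J = sum((T0 + T1 * xi - yi) ** 2 for xi, yi in zip(x, y)) / (2 * m)

# Each contour line connects (theta0, theta1) pairs with equal cost
plt.contour(theta0_vals, theta1_vals, J, levels=20)
plt.xlabel("theta0")
plt.ylabel("theta1")
plt.show()
```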
Gradient descent
Gradient descent can be applied to more general cost functions, not only those with two parameters.
Outline:
- Start with some $\theta_0, \theta_1$ (say $\theta_0 = 0$, $\theta_1 = 0$)
- Keep changing $\theta_0, \theta_1$ to reduce $J(\theta_0, \theta_1)$
- Until we hopefully end up at a minimum
Gradient descent algorithm:
repeat until convergence { $\theta_j := \theta_j - \alpha \frac{\partial}{\partial \theta_j} J(\theta_0, \theta_1)$ (for $j = 0$ and $j = 1$) }
$\alpha$ is a number called the learning rate, which controls the size of our steps in gradient descent.
Correct: Simultaneous update
temp0 := $\theta_0 - \alpha \frac{\partial}{\partial \theta_0} J(\theta_0, \theta_1)$
temp1 := $\theta_1 - \alpha \frac{\partial}{\partial \theta_1} J(\theta_0, \theta_1)$
$\theta_0$ := temp0
$\theta_1$ := temp1
Incorrect:
temp0 := $\theta_0 - \alpha \frac{\partial}{\partial \theta_0} J(\theta_0, \theta_1)$
$\theta_0$ := temp0
temp1 := $\theta_1 - \alpha \frac{\partial}{\partial \theta_1} J(\theta_0, \theta_1)$
$\theta_1$ := temp1
Update simultaneously: in the incorrect version, temp1 is computed with the already-updated $\theta_0$, so the two parameters are not updated from the same point.
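A minimal sketch of one simultaneous update in Python; `dJ_dtheta0` and `dJ_dtheta1` are hypothetical callables standing in for the partial derivatives:

```python
def gradient_descent_step(theta0, theta1, alpha, dJ_dtheta0, dJ_dtheta1):
    """One simultaneous gradient descent update.

    dJ_dtheta0 / dJ_dtheta1 are hypothetical callables returning the
    partial derivatives of J at (theta0, theta1).
    """
    # Evaluate BOTH partial derivatives at the old (theta0, theta1)...
    temp0 = theta0 - alpha * dJ_dtheta0(theta0, theta1)
    temp1 = theta1 - alpha * dJ_dtheta1(theta0, theta1)
    # ...and only then overwrite the parameters
    return temp0, temp1
```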
Gradient descent's characteristics
- If $\alpha$ is too small, gradient descent can be slow.
- If $\alpha$ is too large, gradient descent can overshoot the minimum. It may fail to converge, or even diverge.
- As we approach a local minimum, gradient descent automatically takes smaller steps, because the derivative term shrinks toward zero. So there is no need to decrease $\alpha$ over time.
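A small numeric illustration of these three behaviors, using the toy function $J(\theta) = \theta^2$ (derivative $2\theta$) rather than the regression cost; the learning rates are arbitrary choices:

```python
def minimize(alpha, theta=1.0, steps=20):
    """Gradient descent on J(theta) = theta**2, whose derivative is 2*theta."""
    for _ in range(steps):
        theta = theta - alpha * 2 * theta
    return theta

print(minimize(alpha=0.01))  # too small: ~0.67, still far from the minimum at 0
print(minimize(alpha=0.3))   # reasonable: ~1e-8, essentially at the minimum
print(minimize(alpha=1.1))   # too large: |theta| has grown to ~38 -- divergence
```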
Gradient Descent for Linear Regression
Plugging the squared error cost into the update rule gives, repeated until convergence:
$\theta_0 := \theta_0 - \alpha \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)$
$\theta_1 := \theta_1 - \alpha \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x^{(i)}$
The squared error cost for linear regression is a convex function (bowl-shaped): it has a single global minimum and no other local optima, so gradient descent always converges to the global minimum (given a suitable learning rate $\alpha$).
"Batch" Gradient Descent: Each step of gradient descent uses all the training examples.