Logistic regression: predicting whether a student will be admitted to a university.
You want to determine each applicant’s chance of admission based on their results on two exams. You have historical data from previous applicants that you can use as a training set for logistic regression. For each training example, you have the applicant’s scores on two exams and the admissions decision.
ex2data1.txt (each row: a student's scores on two exams, followed by the admission decision)
34.62365962451697,78.0246928153624,0
30.28671076822607,43.89499752400101,0
35.84740876993872,72.90219802708364,0
60.18259938620976,86.30855209546826,1
79.0327360507101,75.3443764369103,1
45.08327747668339,56.3163717815305,0
……
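plotData(X, y) in Part 1 assumes X and y are already in the workspace; the main exercise script presumably loads them from ex2data1.txt before this point, roughly as in the following sketch (the column split is inferred from the file format above):
% Load training data: columns 1-2 are the exam scores, column 3 is the admission label
data = load('ex2data1.txt');
X = data(:, [1, 2]);   % m x 2 matrix of exam scores
y = data(:, 3);        % m x 1 vector of 0/1 admission decisions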
Part 1: Plotting data
%% ==================== Part 1: Plotting data ====================
fprintf(['Plotting data with + indicating (y = 1) examples and o ' ...
'indicating (y = 0) examples.\n']);
plotData(X, y);
fprintf('\nProgram paused. Press enter to continue.\n');
pause;
The plotData function
function plotData(X, y)
% Create New Figure
figure;
hold on;
% Find Indices of Positive and Negative Examples
pos = find(y == 1); % returns a column vector of the row indices where y = 1
neg = find(y == 0);
% Plot Examples
plot(X(pos, 1), X(pos, 2), 'k+','LineWidth', 2, 'MarkerSize', 7); % positive examples as points in the 2-D plane
plot(X(neg, 1), X(neg, 2), 'ko', 'MarkerFaceColor', 'y','MarkerSize', 7);
% Labels and Legend
xlabel('Exam 1 score')
ylabel('Exam 2 score')
legend('Admitted', 'Not admitted') % legend for the two classes
hold off;
end
Part 2: Compute Cost and Gradient
%% ============ Part 2: Compute Cost and Gradient ============
% In this part of the exercise, you will implement the cost and gradient
% for logistic regression. You need to complete the code in
% costFunction.m
[m, n] = size(X);
% Add intercept term to X
X = [ones(m, 1) X];
% Initialize fitting parameters
initial_theta = zeros(n + 1, 1); % theta is an (n+1) x 1 column vector
% Compute and display initial cost and gradient
[cost, grad] = costFunction(initial_theta, X, y);
fprintf('Cost at initial theta (zeros): %f\n', cost);
fprintf('Gradient at initial theta (zeros): \n');
fprintf(' %f \n', grad);
fprintf('\nProgram paused. Press enter to continue.\n');
pause;
Cost at initial theta (zeros): 0.693147
Gradient at initial theta (zeros):
-0.100000
-12.009217
-11.262842
Program paused. Press enter to continue.
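The initial cost is easy to verify by hand: with θ = 0 the hypothesis is h_θ(x) = g(0) = 0.5 for every example, so each training example contributes -log(0.5) to the cost regardless of its label:
$J(\vec{0}) = -\frac{1}{m}\sum_{i=1}^{m}\left[ y^{(i)}\log 0.5 + (1-y^{(i)})\log 0.5 \right] = -\log 0.5 = \log 2 \approx 0.693147$
Likewise, the first gradient component is $\frac{1}{m}\sum_{i=1}^{m}\left( h_\theta(x^{(i)}) - y^{(i)} \right) = 0.5 - \bar{y}$; the value -0.1 above corresponds to 60% of the training examples having y = 1.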
Definition of the sigmoid function
function g = sigmoid(z)
g = 1 ./ ( 1 + exp(-z) ); % z may be a vector or matrix, so use the element-wise ./
end
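A quick sanity check of the vectorized behaviour (example calls, not part of the exercise script):
sigmoid(0)            % 0.5000
sigmoid([-10 0 10])   % approximately [0.0000 0.5000 1.0000], applied element-wise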
costFunction: computing the cost J and the gradient grad
function [J, grad] = costFunction(theta, X, y)
% Initialize some useful values
m = length(y);
% h(x) = g(θ'x) = sigmoid(X*theta), where g is the sigmoid function
h = sigmoid(X*theta); % predicted probability of y = 1 (m x 1); 1-h is the predicted probability of y = 0
% J = -1/m * ∑ ( y*log(h(x)) + (1-y)*log(1-h(x)) )
J = -1/m * sum(y .* log(h) + (1-y) .* log(1-h)); % .* is element-wise multiplication; both operands are m x 1
grad = (X' * (h - y)) / m; % vectorized gradient formula
end
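Equivalently, the summation over the m examples can be written with inner products instead of sum (a sketch of the same formula, not the required implementation):
J = ( -y' * log(h) - (1 - y)' * log(1 - h) ) / m;   % scalar; identical to the sum-based version above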
Hypothesis function: $h_\theta(x) = g(\theta^T x) = \dfrac{1}{1 + e^{-\theta^T x}}$
Cost function: $J(\theta) = -\dfrac{1}{m}\sum_{i=1}^{m}\left[\, y^{(i)}\log h_\theta(x^{(i)}) + (1 - y^{(i)})\log\left(1 - h_\theta(x^{(i)})\right) \right]$
Gradient (by the chain rule; step 2 uses the derivative of the sigmoid function): $\dfrac{\partial J(\theta)}{\partial \theta_j} = \dfrac{1}{m}\sum_{i=1}^{m}\left( h_\theta(x^{(i)}) - y^{(i)} \right) x_j^{(i)}$
Derivative of the sigmoid function: $g'(z) = g(z)\left(1 - g(z)\right)$
Part 3: Optimizing using fminunc
%% ============= Part 3: Optimizing using fminunc =============
% In this exercise, you will use a built-in function (fminunc) to find the
% optimal parameters theta.
% Set options for fminunc
% 'GradObj', 'on': tells fminunc that costFunction also returns the gradient grad, so it can use it when minimizing
% 'MaxIter', '400': run at most 400 iterations
options = optimset('GradObj', 'on', 'MaxIter', 400);
% Run fminunc to obtain the optimal theta
% @(t)(costFunction(t, X, y)) is the function to be minimized
% starting from initial_theta, fminunc returns the optimal theta
[theta, cost] = fminunc(@(t)(costFunction(t, X, y)), initial_theta, options);
% Print theta to screen
fprintf('Cost at theta found by fminunc: %f\n', cost);
fprintf('theta: \n');
fprintf(' %f \n', theta);
% Plot the decision boundary (a straight line for this two-feature model)
plotDecisionBoundary(theta, X, y);
% Put some labels
hold on;
% Labels and Legend
xlabel('Exam 1 score')
ylabel('Exam 2 score')
% Specified in plot order
legend('Admitted', 'Not admitted')
hold off;
fprintf('\nProgram paused. Press enter to continue.\n');
pause;
- 'GradObj', 'on': tells fminunc that our function returns both the cost and the gradient. This allows fminunc to use the gradient when minimizing the function.
- @(t)(costFunction(t, X, y)) specifies the function to be minimized. This creates an anonymous function, with argument t, that calls your costFunction. This allows us to wrap the costFunction for use with fminunc.
If you have completed the costFunction correctly, fminunc will converge on the right optimization parameters and return the final values of the cost and θ.
Notice that by using fminunc, you did not have to write any loops yourself or set a learning rate as you did for gradient descent. This is all done by fminunc; you only needed to provide a function that computes the cost and the gradient.
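The trained model predicts y = 1 whenever θ'x ≥ 0 (equivalently, h_θ(x) ≥ 0.5), so the decision boundary is the straight line θ₀ + θ₁x₁ + θ₂x₂ = 0. plotDecisionBoundary.m is supplied with the exercise; for this two-feature case it essentially draws that line, roughly as in the following sketch (an illustration, not the provided file):
% X here contains the intercept column, so X(:,2) is the Exam 1 score
plot_x = [min(X(:,2))-2, max(X(:,2))+2];                 % two x1 endpoints
plot_y = (-1/theta(3)) .* (theta(2).*plot_x + theta(1)); % solve theta(1)+theta(2)*x1+theta(3)*x2 = 0 for x2
plot(plot_x, plot_y, '-')                                % boundary line on top of the scatter plot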