Explanation: Newton's method, R, log-likelihood

Homework #6

1. Let $f(x)$ be a function from $\mathbb{R}^n$ to $\mathbb{R}$. Suppose we would like to maximize $f(x)$. Show that if $Hf(x)$ is negative definite, then the Newton's method direction at $x$, $-[Hf(x)]^{-1} \nabla f(x)$, is an ascent direction. What does this imply for Newton's method with backtracking? (We did this in class, except that we considered the minimization case and $Hf(x)$ as positive definite; here I want you to go through the argument yourself for this slightly altered case.)

2. Consider the log-likelihood for the single-covariate (i.e. each $x_i \in \mathbb{R}$) logistic regression:

$$\log L(\alpha) = \sum_{i=1}^{N} \Big[ (1 - y_i)(-\alpha_0 - \alpha_1 x_i) - \log\big(1 + \exp(-\alpha_0 - \alpha_1 x_i)\big) \Big] \tag{1}$$

(a) Let $g(\alpha) = \log(1 + \exp(-\alpha_0 - \alpha_1 x_i))$. To make the notation simpler, set $x_i$ as follows,

$$x_i \leftarrow \begin{pmatrix} 1 \\ x_i \end{pmatrix} \tag{2}$$

and explain why we can rewrite $g(\alpha)$ in the more compact form $g(\alpha) = \log(1 + \exp(-\alpha \cdot x_i))$. (This compact form makes it easier to take derivatives.)

(b) Show the following:

$$\nabla g(\alpha) = -\frac{\exp(-\alpha \cdot x_i)}{1 + \exp(-\alpha \cdot x_i)} \, x_i \tag{3}$$

$$Hg(\alpha) = \frac{\exp(-\alpha \cdot x_i)}{\big(1 + \exp(-\alpha \cdot x_i)\big)^2} \, x_i x_i^T \tag{4}$$

(You computed the gradient and Hessian of $g(\alpha)$ in previous homeworks, but here I want you to see the form above so the next subproblem is easier.)

(c) Let $v \in \mathbb{R}^n$. Thinking of $v$ as a column vector, define the matrix $A = vv^T$. Show that $A$ is positive semidefinite, meaning that $x^T A x \ge 0$ for all $x \in \mathbb{R}^n$. (Hint: Consider $(x^T v)(v^T x)$.) Use this fact to show that the function $g(\alpha)$ is convex.

(d) Show the following facts. You can prove them from the definition or just explain the intuition through a graph.
   i. If two functions $f(x)$ and $h(x)$ are convex, then so is their sum $f(x) + h(x)$.
   ii. If a function $f(x)$ is convex, then $-f(x)$ is concave.
Then, show that $\log L(\alpha)$ is a concave function.

(e) Generate a plot of $\log L(\alpha)$ over some line in $\mathbb{R}^2$ that contains the maximum of $\log L(\alpha)$ (you computed this point in a previous homework). Explain why the graph you produce is concave.

3. The MNIST dataset is a popular dataset for practicing machine learning algorithms. Read about the dataset here: https://en.wikipedia.org/wiki/MNIST_database

Attached you will find two files, mnist_train.csv and mnist_test.csv. Each file contains a matrix. Each row of the matrix corresponds to an image of a handwritten digit. The first entry in the row is the digit in the image (e.g. 7 if the digit image is a seven); the rest of the values, of which there are 784 (from a 28 × 28 pixel image), are the pixel values. See the script mnist_intro.R for an example.

In this problem, you will build a classifier that identifies when a handwritten digit equals 3. To build the classifier you will fit a logistic regression to the data. The response variable $y \in \{0, 1\}$ will be 1 if the number is 3 and 0 otherwise. The covariates $x \in \mathbb{R}^{784}$ are the pixel values. Set

$$\alpha = (\alpha_0, \alpha_1, \alpha_2, \ldots, \alpha_{784}) \tag{5}$$

The logistic-regression model assumes

$$P(y = 1 \mid x, \alpha) = \frac{1}{1 + \exp[-\alpha \cdot x]} \tag{6}$$

where $x$ is defined as in problem 2 (i.e. we just add a 1 to the beginning of the $x$ vector).

(a) Show that the log-likelihood is given by

$$\log L(\alpha) = \sum_{i=1}^{N} \Big[ (1 - y_i)(-\alpha \cdot x_i) - \log\big(1 + \exp(-\alpha \cdot x_i)\big) \Big] \tag{7}$$

(b) Set $g(\alpha) = \log(1 + \exp(-\alpha \cdot x_i))$ and show that $\nabla g(\alpha)$ and $Hg(\alpha)$ have the same form given in problem 2.

(c) Write a damped Newton's method algorithm to compute the optimal $\alpha$ by maximizing the log-likelihood on the training dataset. How do you know Newton's method will converge to a maximum? (NOTE: You may run into difficulties with non-invertible Hessians; we will address that issue in coming classes. If that happens, try other starting points.)

(d) Recall that a classifier is a function

$$C(x) : \mathbb{R}^{784} \to \{0, 1\}. \tag{8}$$

Once you compute an $\alpha$ in (c), you can build a classifier as follows:

$$C(x) = \begin{cases} 1 & \text{if } P(y = 1 \mid x, \alpha) = \dfrac{1}{1 + \exp[-\alpha \cdot x]} \ge p \\ 0 & \text{otherwise,} \end{cases} \tag{9}$$

where $p$ is some cutoff probability. At or above $p$, you call the image a 3; below $p$, you call it not a 3. Select a $p$ and test the accuracy of your classifier using the test dataset.
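As a rough illustration of parts (c) and (d), here is a minimal sketch of a damped Newton iteration followed by a cutoff classifier. It is written in Python rather than the course's R, and it uses the single-covariate model from problem 2 on a tiny made-up dataset instead of MNIST. The function names (`damped_newton`, `grad_hess`, `classify`), the data (`xs`, `ys`, `xs_test`, `ys_test`) and the cutoff 0.5 are all hypothetical. The backtracking rule shown (halve the step until $\log L$ increases) is one simple choice and may differ from the rule used in class.

```python
import math

def log1pexp(z):
    """Numerically stable log(1 + exp(z))."""
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))

def prob(alpha, x):
    """P(y = 1 | x, alpha) = 1 / (1 + exp(-alpha . (1, x)))."""
    z = alpha[0] + alpha[1] * x
    return math.exp(-log1pexp(-z))

def log_lik(alpha, xs, ys):
    """Sum over i of (1 - y_i)(-alpha . x_i) - log(1 + exp(-alpha . x_i))."""
    total = 0.0
    for x, y in zip(xs, ys):
        z = alpha[0] + alpha[1] * x
        total += (1 - y) * (-z) - log1pexp(-z)
    return total

def grad_hess(alpha, xs, ys):
    """Gradient and Hessian of log L; the Hessian is -sum p(1-p) x x^T."""
    g0 = g1 = h00 = h01 = h11 = 0.0
    for x, y in zip(xs, ys):
        p = prob(alpha, x)
        w = p * (1 - p)
        g0 += y - p
        g1 += (y - p) * x
        h00 -= w
        h01 -= w * x
        h11 -= w * x * x
    return (g0, g1), (h00, h01, h11)

def damped_newton(xs, ys, alpha=(0.0, 0.0), tol=1e-8, max_iter=50):
    for _ in range(max_iter):
        (g0, g1), (h00, h01, h11) = grad_hess(alpha, xs, ys)
        if math.hypot(g0, g1) < tol:
            break
        # Newton direction d = -H^{-1} grad, using the explicit 2x2 inverse.
        det = h00 * h11 - h01 * h01
        d0 = -(h11 * g0 - h01 * g1) / det
        d1 = -(h00 * g1 - h01 * g0) / det
        # Backtracking: halve the step until the log-likelihood increases.
        step, current = 1.0, log_lik(alpha, xs, ys)
        while step > 1e-12:
            trial = (alpha[0] + step * d0, alpha[1] + step * d1)
            if log_lik(trial, xs, ys) > current:
                alpha = trial
                break
            step *= 0.5
        else:
            break  # no improving step found; stop iterating
    return alpha

def classify(alpha, x, p_cut=0.5):
    """C(x) = 1 if P(y = 1 | x, alpha) >= p_cut, else 0."""
    return 1 if prob(alpha, x) >= p_cut else 0

# Made-up, non-separable training data (hypothetical, not MNIST).
xs = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
ys = [0, 0, 1, 0, 1, 0, 1, 1]
alpha_hat = damped_newton(xs, ys)

# Made-up held-out data standing in for the test file.
xs_test = [0.75, 1.75, 2.75, 3.75]
ys_test = [0, 0, 1, 1]
hits = sum(classify(alpha_hat, x) == y for x, y in zip(xs_test, ys_test))
acc = hits / len(xs_test)
print("alpha_hat =", alpha_hat, "test accuracy =", acc)
```

For the 785-parameter MNIST problem, the explicit 2 × 2 inverse would have to be replaced by a general linear solve, and the non-invertible Hessians mentioned in the note to (c) would need separate handling.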
