Walkthrough: Economics 4P05, Statistical learning, Python, Java, C/C++ Proce

Statistical learning
Department of Economics
Brock University

1 Assignment 2

1.1 Conceptual questions

1. Suppose that we wish to predict whether a given stock will issue a dividend this year (“Yes” or “No”) based on X, last year’s percent profit. We examine a large number of companies and discover that the mean value of X for companies that issued a dividend was 10, while the mean for those that didn’t was 0. In addition, the variance of X for these two sets of companies was 36. Finally, 80% of companies issued dividends. Assuming that X follows a normal distribution, predict the probability that a company will issue a dividend this year given that its percentage profit was X = 4 last year. Use equation (1) from your notes on classification.

• This problem has to do with odds. On average, what fraction of people with an odds of 0.37 of defaulting on their credit card payment will in fact default?

1.2 Classification methods

This question should be answered using the Weekly data set, which is part of the ISLR package. This data is similar in nature to the Smarket data except that it contains 1,089 weekly returns for 21 years, from the beginning of 1990 to the end of 2010.

1. Produce some numerical and graphical summaries of the Weekly data. Do there appear to be any patterns? For the numerical summaries focus on the means of the returns (today and all lags) as well as on the correlation between today’s returns and the lags. For the graphical summaries create a plot of today’s return versus its first lag and discuss.

2. Use the full data set to perform a logistic regression with Direction as the response and the five lag variables plus Volume as predictors. Use the summary function to print the results. Do any of the predictors appear to be statistically significant? If so, which ones? Compute the predicted probabilities and obtain the following features: min, max, mean. Discuss those features.

3. Compute the confusion matrix and overall fraction of incorrect predictions. Explain what the confusion matrix is telling you about the types of mistakes made by the logistic regression.

4. Use the full data set to perform a linear probability model (LPM) regression with Direction as the response and the five lag variables plus Volume as predictors. Use the summary function to print the results. Do any of the predictors appear to be statistically significant? If so, which ones? Compute the predicted probabilities and obtain the following features: min, max, mean. Discuss those features. Are the LPM probabilities sensible? Are they similar to those of the logistic regression? Do you expect the confusion matrix to be similar to that of the logistic regression?

5. Compute the confusion matrix and overall fraction of incorrect predictions for this LPM. Is the matrix similar to the one obtained with the logistic regression?

6. Now fit the logistic regression model using a training data period from 1990 to 2008, with Lag2 as the only predictor. Compute the confusion matrix and the overall fraction of incorrect predictions for the held out data (that is, the data from 2009 and 2010).

7. Repeat (6) using LDA.

8. Repeat (6) using KNN with K = 1.

9. Which of these methods (logistic, LDA or KNN) appears to provide the best results on this data? Why?

1.3 Cross-validation

In this question you will use the glm() and predict() functions, and a for loop to compute the LOOCV error for a simple logistic regression model on the Weekly data set.

1. Fit a logistic regression model that predicts Direction using Lag1 and Lag2.

2. Fit a logistic regression model that predicts Direction using Lag1 and Lag2 using all but the first observation.

3. Use the model from (2) to predict the direction of the first observation. You can do this by predicting that the first observation will go up if P(Direction = “Up” | Lag1, Lag2) > 0.5. Was this observation correctly classified?

4. Write a for loop from i = 1 to i = n, where n is the number of observations in the data set, that performs each of the following steps:
   i. Fit a logistic regression model using all but the ith observation to predict Direction using Lag1 and Lag2.
   ii. Compute the posterior probability of the market moving up for the ith observation.
   iii. Use the posterior probability for the ith observation in order to predict whether or not the market moves up.
   iv. Determine whether or not an error was made in predicting the direction for the ith observation. If an error was made, then indicate this as a 1, and otherwise indicate it as a 0.

5. Take the average of the n numbers obtained in (4)iv in order to obtain the LOOCV estimate for the test error. Comment on the results.

Notes:

• Have a look at the Course Outline (on Sakai) for more info on how to create tables.
• The report must be typed.
• The report should have a title page, be single-spaced and typed using a font of size 12.
• Your computer code and output should be included in the appendix.
• Pay attention to your graphs.
• Descriptive statistics, when applicable, should be reported in a table.
• Regression results should also be presented in a table. The first column of your table would contain the list of independent variables (starting with the constant). The remaining columns would contain the results for the different models. The last few rows of the table should contain: the sample size, and 2 measures of goodness of fit.
• When using a test statistic, report the null being tested, the formula for the test statistic and how it was computed (e.g. using a regression and, if so, which regression). Make sure to report a conclusion for that test (e.g. I reject the null because XXXX and this implies that XXXX).

Source: http://www.daixie0.com/contents/3/4413.html
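The two conceptual questions in Section 1.1 come down to short arithmetic, sketched here in Python. "Equation (1)" from the course notes is not reproduced in this document, so the sketch assumes it is Bayes' theorem with one normal density per class and a shared variance. The function names are illustrative, not taken from the assignment, and the assignment itself works in R.

```python
import math


def normal_pdf(x: float, mean: float, variance: float) -> float:
    """Density of a normal distribution with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2.0 * variance)) / math.sqrt(
        2.0 * math.pi * variance
    )


def posterior_yes(x, prior_yes, mean_yes, mean_no, variance):
    """P(Yes | X = x) via Bayes' theorem (assumed form of 'equation (1)')."""
    weighted_yes = prior_yes * normal_pdf(x, mean_yes, variance)
    weighted_no = (1.0 - prior_yes) * normal_pdf(x, mean_no, variance)
    return weighted_yes / (weighted_yes + weighted_no)


def odds_to_probability(odds: float) -> float:
    """Convert odds p / (1 - p) back to the probability p."""
    return odds / (1.0 + odds)


# Values stated in question 1: means 10 and 0, variance 36, prior 0.8, X = 4.
p_dividend = posterior_yes(x=4.0, prior_yes=0.8, mean_yes=10.0, mean_no=0.0, variance=36.0)
# Value stated in the odds question: odds of 0.37.
p_default = odds_to_probability(0.37)

print(round(p_dividend, 3))  # 0.752
print(round(p_default, 3))   # 0.27
```

Under these assumptions, a company with X = 4 has roughly a 75% chance of issuing a dividend, and odds of 0.37 correspond to about 27% of people defaulting.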
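Several items in Section 1.2 ask for a confusion matrix and the overall fraction of incorrect predictions. The sketch below shows one way to compute both from predicted probabilities, using a 0.5 threshold as in Section 1.3. The probabilities and actual directions are made-up illustration values, not results from the Weekly data, and the helper names are hypothetical.

```python
from collections import Counter


def classify(probabilities, threshold=0.5):
    """Predict 'Up' when the probability exceeds the threshold, else 'Down'."""
    return ["Up" if p > threshold else "Down" for p in probabilities]


def confusion_matrix(predicted, actual, labels=("Down", "Up")):
    """Nested dict: matrix[predicted_label][actual_label] -> count."""
    counts = Counter(zip(predicted, actual))
    return {p: {a: counts[(p, a)] for a in labels} for p in labels}


def error_rate(predicted, actual):
    """Overall fraction of incorrect predictions."""
    wrong = sum(p != a for p, a in zip(predicted, actual))
    return wrong / len(actual)


# Made-up values for illustration only (not from the Weekly data set).
example_probs = [0.62, 0.48, 0.55, 0.51, 0.43, 0.58]
example_actual = ["Up", "Down", "Down", "Up", "Up", "Down"]

example_pred = classify(example_probs)
matrix = confusion_matrix(example_pred, example_actual)
rate = error_rate(example_pred, example_actual)

print(matrix)
print(rate)  # 0.5
```

The off-diagonal cells separate the two kinds of mistake: matrix["Up"]["Down"] counts weeks predicted to go up that went down, and matrix["Down"]["Up"] counts the reverse.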
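The LOOCV loop in Section 1.3, item 4, can be sketched as follows. This is a stand-in under stated assumptions, not the assignment's method: the assignment calls for R's glm() and predict() on the Weekly data, whereas this sketch uses a small hand-written gradient-ascent logistic fit (the hypothetical fit_logistic) and synthetic data, so that it runs with the Python standard library alone.

```python
import math
import random


def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_probability(coef, x):
    """P(label = 1 | x) for coefficients [intercept, b1, b2, ...]."""
    return sigmoid(coef[0] + sum(c * v for c, v in zip(coef[1:], x)))


def fit_logistic(rows, labels, steps=300, learning_rate=0.1):
    """Gradient ascent on the average log-likelihood (a stand-in for glm)."""
    coef = [0.0] * (len(rows[0]) + 1)
    for _ in range(steps):
        gradient = [0.0] * len(coef)
        for x, y in zip(rows, labels):
            residual = y - predict_probability(coef, x)
            gradient[0] += residual
            for j, value in enumerate(x):
                gradient[j + 1] += residual * value
        coef = [c + learning_rate * g / len(rows) for c, g in zip(coef, gradient)]
    return coef


def loocv_error(rows, labels):
    """Return (LOOCV error estimate, list of 0/1 error indicators)."""
    errors = []
    for i in range(len(rows)):
        # Step i: fit on all observations except the ith.
        coef = fit_logistic(rows[:i] + rows[i + 1:], labels[:i] + labels[i + 1:])
        # Step ii: posterior probability of "up" for the held-out observation.
        p_up = predict_probability(coef, rows[i])
        # Step iii: predict "up" (1) when the probability exceeds 0.5.
        predicted = 1 if p_up > 0.5 else 0
        # Step iv: record 1 for an error, 0 otherwise.
        errors.append(1 if predicted != labels[i] else 0)
    # Item 5: the average of the indicators is the LOOCV estimate.
    return sum(errors) / len(errors), errors


# Synthetic stand-ins for (Lag1, Lag2) and Direction; not the Weekly data.
random.seed(0)
rows = [[random.gauss(0, 1), random.gauss(0, 1)] for _ in range(40)]
labels = [1 if x[0] + x[1] + random.gauss(0, 0.5) > 0 else 0 for x in rows]

loocv_rate, loocv_errors = loocv_error(rows, labels)
print(loocv_rate)
```

Each pass refits the model from scratch, so the cost grows with n fits of n - 1 observations. On the full Weekly data that means 1,089 separate fits.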
