Explanation: R

IMPORTANT INSTRUCTIONS

All homework pages (except the top sheet) must be stapled (before you come to class).

The first page ('top sheet') should contain ONLY your name, student ID, discussion section, and homework number. Use the format shown below. Do NOT staple it to the rest of your homework.

The second page (which means a new sheet of paper, so not the back side of the first page) should ALSO contain your name, student ID, discussion section, and homework number. Use the format shown below.

After I collect homeworks I put all of the top sheets into a folder before passing the homeworks to the grader. If at any point during the quarter there is a homework that you know you turned in, but it does not show on Canvas, contact me. I will look to see if I have your top sheet from the homework. If I have your top sheet then I will give you credit for the homework.

It is your responsibility to make sure every homework assignment you submit has a top sheet with your correct discussion section number. If you tell me you turned in a homework, but there is no Canvas grade and no top sheet in my folder, then you will get a 0.

You will not lose any points for not making a top sheet. But if your homework goes missing you will have no way to prove you turned it in.

On both the first page ('top sheet') and second page write your name and student ID on the top left, homework number on the top center, and section on the top right. For example, for homework 6, if your name is John Smith, your student ID is 123456789, and you are in section A01, then the top of your top sheet and first stapled page should look like this

    John Smith              Homework 6              A01
    123456789

    Rule                                                        Points lost if you don't follow the rule
    Correct format for name, ID, homework and section number    1
    Staple all pages EXCEPT the top sheet                       1
    If your homework is on paper pulled out of a notebook,      1
    cut off all of the fringes (from the torn horizontal
    threads that attached the paper to the notebook)

Please do not turn in your R code with your homework.

Be kind to the grader.
- make sure you write your name clearly (so it is easy to read)
- write neatly

We will use the same data from homework 5. (The next part is exactly what is in homework 5, but here I add a third and fourth model.)

The data are from a hypothetical experimental study similar to homework 3 which examines the relationship between 5 doses of a cholesterol lowering drug and reduction in serum cholesterol, but with an additional categorical predictor variable, exercise. There are three different types of exercise: walk, bike, and jog.

predictor variables
1. drug dose: (50, 55, 60, 65, and 70)
2. exercise: (1=walk, 2=bike, 3=jog)

outcome: cholesterol reduction
(There are some negative y values, which means some subjects had an increase in cholesterol.)

This is from a randomized experiment with two replications for each combination of dose and exercise. This means the predictor variables are balanced, which means that dose and exercise are independent.

Define notation
d = dose
a = exercise (1=walk, 2=bike, 3=jog) (a for "activity", since $e_i$ is already used for the residuals)
$\operatorname{var}(Y \mid d) = \sigma^2_{Y \mid d}$ = the variance of Y conditional on dose d
$\operatorname{var}(Y \mid a) = \sigma^2_{Y \mid a}$ = the variance of Y conditional on exercise a
$\operatorname{var}(Y \mid d, a) = \sigma^2_{Y \mid d,a}$ = the variance of Y conditional on dose d and exercise a

model 1: predictor: dose (no exercise effect)
$$Y_i = \beta_0 + \beta_1 x_{i1} + \varepsilon_i \tag{1}$$
$x_{i1}$ = dose for subject i
$\varepsilon_i \sim N(0, \sigma^2_{Y \mid d})$

model 2: dose + exercise effects with walk as the baseline group
$$Y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \beta_3 x_{i3} + \varepsilon_i \tag{2}$$
$x_{i1}$ = dose for subject i
$x_{i2}$ = 1 if subject i is in the biking group, 0 otherwise
$x_{i3}$ = 1 if subject i is in the jogging group, 0 otherwise
$\varepsilon_i \sim N(0, \sigma^2_{Y \mid d,a})$

model 3: dose + exercise effects with jog as the baseline group
$$Y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \beta_3 x_{i3} + \varepsilon_i \tag{3}$$
$x_{i1}$ = dose for subject i
$x_{i2}$ = 1 if subject i is in the walking group, 0 otherwise
$x_{i3}$ = 1 if subject i is in the biking group, 0 otherwise
$\varepsilon_i \sim N(0, \sigma^2_{Y \mid d,a})$

model 4: dose + exercise effects with means parameterization
$$Y_i = \beta_1 x_{i1} + \beta_2 x_{i2} + \beta_3 x_{i3} + \beta_4 x_{i4} + \varepsilon_i \tag{4}$$
$x_{i1}$ = dose for subject i
$x_{i2}$ = 1 if subject i is in the walking group, 0 otherwise
$x_{i3}$ = 1 if subject i is in the biking group, 0 otherwise
$x_{i4}$ = 1 if subject i is in the jogging group, 0 otherwise
$\varepsilon_i \sim N(0, \sigma^2_{Y \mid d,a})$

The interpretations of the parameters in model 4 are
$\beta_1$ = dose slope
$\beta_2 = E(Y \mid \text{walk and dose} = 0)$
$\beta_3 = E(Y \mid \text{bike and dose} = 0)$
$\beta_4 = E(Y \mid \text{jog and dose} = 0)$

You can run model 4 in R using the following commands

    a1=as.numeric(exercise==1)
    a2=as.numeric(exercise==2)
    a3=as.numeric(exercise==3)
    summary(lm(y ~ -1+dose+a1+a2+a3))

The -1 in the command takes the 'intercept' out of the model.

Models 2, 3, and 4 are different parameterizations of the same model. We refer to two different models as being the same model if they have the same expected values for every combination of the predictor variables, which also means they have the same predicted values for every observation in the dataset.

In homework 5 you estimated the parameters in model 2, which uses walking as the baseline group. The results were

    > summary(lm(y~dose+a2+a3))
    Coefficients:
                Estimate Std. Error t value Pr(>|t|)
    (Intercept) -67.8000    33.4009  -2.030  0.05272 .
    dose          1.2033     0.5454   2.206  0.03641 *
    a2           29.6000     9.4472   3.133  0.00425 **
    a3           34.3000     9.4472   3.631  0.00122 **
    ---
    Residual standard error: 21.12 on 26 degrees of freedom
    Multiple R-squared: 0.4392, Adjusted R-squared: 0.3745
    F-statistic: 6.788 on 3 and 26 DF, p-value: 0.00157

Notes
1. The "Residual standard error" is the square root of the MSE.
2. The "F-statistic: 6.788 on 3 and 26 DF, p-value: 0.00157" is a model comparison F test where the full model is model 2 and the reduced model is a null model with no predictor variables. So the reduced model is
$$Y_i = \beta_0 + \varepsilon_i, \qquad \varepsilon_i \sim N(0, \sigma^2_Y)$$
Note that $\operatorname{var}(\varepsilon_i) = \operatorname{var}(Y)$, which is a marginal variance (meaning it is not a conditional variance because it is not conditional on anything). I usually call this the null model because it is the most reduced model possible.
The $F = 6.788$ is simultaneously testing if dose and/or exercise are significantly related to cholesterol reduction.
And if we had additional predictor variables in the model, this would be testing if any of the predictors were significantly related to cholesterol reduction. Once we have more than a few predictors in a regression model it will usually be the case that at least some of the predictors are significant. And we usually want to examine each predictor variable separately. So the F statistic you get in the printout from R is a model comparison that is often not of interest. However, if there is only one predictor variable in the model then this F statistic is testing if the one predictor variable is a significant predictor of cholesterol.

Beginning of questions

1. Create three indicator variables (same as you did in homework 5) using the following R code.

    a1=as.numeric(exercise==1)
    a2=as.numeric(exercise==2)
    a3=as.numeric(exercise==3)

No answer is required for this question.

2. What is the interpretation of the following parameters in model 3? Give the interpretation both in words (e.g., it is the mean or intercept for walking, or the difference between walking and biking, etc.) and in terms of the conditional expected values; that is, as some function of $E(Y \mid d = 50, a = 1), \ldots, E(Y \mid d = 70, a = 3)$.
(a) $\beta_0$
(b) $\beta_1$
(c) $\beta_2$
(d) $\beta_3$

3. Use the values of the least squares estimates of the parameters in model 2 (given above) to find the values of the least squares estimates of the parameters in model 3. You can verify your answers are correct if you like by using R to get the estimates of the parameters for model 3.

4. Models 2, 3, and 4 are three different parameterizations of the same model. You can see that models 2, 3, and 4 are the same by checking that they give the same predicted (also called "fitted") values.

    m1=lm(y~dose+a2+a3)      # model 2: walk is the baseline
    m2=lm(y~dose+a1+a2)      # model 3: jog is the baseline
    m3=lm(y~dose-1+a1+a2+a3) # model 4: means parameterization
    cbind(fitted(m1),fitted(m2),fitted(m3))

No answer is required for this question.

5. Using your parameter estimates from either model 2 or model 3, provide estimates of the following.
(a) $E(Y \mid \text{walk}, \text{dose} = 0)$
(b) $E(Y \mid \text{bike}, \text{dose} = 0)$
(c) $E(Y \mid \text{jog}, \text{dose} = 0)$

6. Check your answers to the previous question by running model 4 (with means parameterization).

    summary(lm(y~dose-1+a1+a2+a3))

No answer is required for this question.

7. Using your parameter estimates from either model 2 or model 3, provide estimates of the following.
(a) $E(Y \mid \text{walk}, \text{dose} = 55)$
(b) $E(Y \mid \text{bike}, \text{dose} = 55)$
(c) $E(Y \mid \text{jog}, \text{dose} = 55)$

8. You can check your answers to the previous question by using a model parameterization where the parameters in the model are equal to the expected values at dose 55.
$$Y_i = \beta_1 (x_{i1} - 55) + \beta_2 x_{i2} + \beta_3 x_{i3} + \beta_4 x_{i4} + \varepsilon_i$$
$x_{i1}$ = dose for subject i
$x_{i2}$ = 1 if subject i is in the walking group, 0 otherwise
$x_{i3}$ = 1 if subject i is in the biking group, 0 otherwise
$x_{i4}$ = 1 if subject i is in the jogging group, 0 otherwise
$\varepsilon_i \sim N(0, \sigma^2_{Y \mid d,a})$

Run using R code

    doseminus55=dose-55
    summary(lm(y~doseminus55-1+a1+a2+a3))

By replacing $x_{i1}$ in model 4 with $(x_{i1} - 55)$, $\beta_2$, $\beta_3$, and $\beta_4$ become the expected values at dose 55. What we are doing is replacing the variable dose with a new variable called doseminus55. Then $\beta_2$, $\beta_3$, and $\beta_4$ are the expected values when doseminus55=0, which corresponds to dose=55.
No answer is required for this question.

9. Suppose we reparameterize model 4, replacing the dose variable with dose $- c$ where $c$ is some constant. So the model is
$$Y_i = \beta_1 (x_{i1} - c) + \beta_2 x_{i2} + \beta_3 x_{i3} + \beta_4 x_{i4} + \varepsilon_i$$
$x_{i1}$ = dose for subject i
$x_{i2}$ = 1 if subject i is in the walking group, 0 otherwise
$x_{i3}$ = 1 if subject i is in the biking group, 0 otherwise
$x_{i4}$ = 1 if subject i is in the jogging group, 0 otherwise
$\varepsilon_i \sim N(0, \sigma^2_{Y \mid d,a})$

What value of $c$ will result in the smallest standard errors of the estimates of $\beta_2$, $\beta_3$, and $\beta_4$? Hints: You can answer this question by thinking about what expected values are being estimated by $b_2$, $b_3$, and $b_4$ and how that depends on the value $c$.
Recall, from simple linear regression with only one continuous x variable, how the variance of the predicted value (the estimate of the expected value) depended on the value of x.

10. Will you get the same or different values for the MSE for models 2, 3, and 4? You can answer this question by actually calculating the MSE for each model. However, since you might see this type of question on an exam, I suggest you first try to answer the question without doing any calculations.

11. Conduct a hypothesis test using an F statistic to compare models 1 and 2. This is equivalent to testing if there is an exercise effect. The null and alternative hypotheses can be written in several different ways, including
$$H_0 : E(Y \mid d, a = 1) = E(Y \mid d, a = 2) = E(Y \mid d, a = 3)$$
$$H_a : H_0 \text{ not true}$$
Note that it is not necessary to specify a value for d because we are assuming there is no interaction, which means that the differences between $E(Y \mid d, a = 1)$, $E(Y \mid d, a = 2)$, and $E(Y \mid d, a = 3)$ do not depend on the value of d.
(a) Write the null hypothesis $H_0$ as a statement about the parameters $\beta_0, \beta_1, \beta_2, \beta_3$ in model 2.
(b) What is the value of the test statistic F?
(c) What is the p-value?
(d) Does the data provide evidence for $H_a$? How did you make your decision?

12. Suppose we conduct a model comparison F test to compare models 1 and 4. This is the same null hypothesis as in question 11. Write the null hypothesis as a statement about the parameters $\beta_1, \beta_2, \beta_3, \beta_4$ in model 4.

13. Suppose we want to test the null hypothesis of no difference between biking and jogging.
$$H_0 : E(Y \mid d, a = 2) = E(Y \mid d, a = 3)$$
$$H_a : E(Y \mid d, a = 2) \neq E(Y \mid d, a = 3)$$
Note $H_0$ and $H_a$ can also be written as
$$H_0 : E(Y \mid d, a = 2) - E(Y \mid d, a = 3) = 0$$
$$H_a : E(Y \mid d, a = 2) - E(Y \mid d, a = 3) \neq 0$$
Again, we don't specify a value for d in $H_0$ and $H_a$ because we are assuming the no interaction model is correct, which means $E(Y \mid d, a = 2) - E(Y \mid d, a = 3)$ does not depend on the value of d.
(a) Use the estimated regression function from model 2, 3, or 4 to estimate the difference between biking and jogging.
Specifically, give an estimate of $E(Y \mid d, a = 2) - E(Y \mid d, a = 3)$. Note that because there is no interaction in the model, this estimate does not depend on the value of d. Therefore, you can plug in any value of d (using the same value for estimating both expected values). Because the data is balanced, the estimate you get from the model is exactly the same as if you simply subtract the mean for jogging from the mean for biking. R code for this:

    mean(y[exercise==2])-mean(y[exercise==3])

(b) Using the parameters from model 2, where the baseline group is walking, write the null and alternative hypotheses as statements about the parameters ($\beta_0, \beta_1, \beta_2, \beta_3$).
(c) Using the parameters from model 3, where the baseline group is jogging, write the null and alternative hypotheses as statements about the parameters ($\beta_0, \beta_1, \beta_2, \beta_3$).
(d) Using the parameters from model 4 (means parameterization), write the null and alternative hypotheses as statements about the parameters ($\beta_1, \beta_2, \beta_3, \beta_4$).
(e) Give the reduced model for the null hypothesis. Note that there are several different ways to parameterize the model.
(f) Conduct a model comparison F test for this null hypothesis. Give the value of the test statistic F, the p-value, and your conclusion if the significance level is $\alpha = 0.05$.
(g) Reparameterize so that there is a single parameter that is equal to $E(Y \mid d, a = 2) - E(Y \mid d, a = 3)$.
  i. Write out this model, making sure to clearly define all of your indicator variables.
  ii. Run a regression using this model and compare the t and p-value for the parameter in your model that equals $E(Y \mid d, a = 2) - E(Y \mid d, a = 3)$ with the F and p-value from the model comparison test. Check that you get the same p-value you gave in part (f). Now wasn't that easier to just reparameterize the model and get the estimate you wanted and the p-value straight from the R output? No answer is required for this question.

14. Suppose we took dose out of the model.
$$Y_i = \beta_3 + \beta_1 x_{i1} + \beta_2 x_{i2} + \varepsilon_i \tag{5}$$
$x_{i1}$ = I(subject i walks)
$x_{i2}$ = I(subject i bikes)
where I(statement) is an indicator variable: i.e., it is 1 if 'statement' is true and 0 otherwise.

Note that the answers to questions (a), (b), and (c) depend on whether or not the data is balanced, and this dataset is balanced. You can verify the data is balanced by calculating the covariance between dose and exercise to check that it is 0. As a consequence, the covariances between dose and each of the three indicator variables are also 0.

    cov(dose,exercise)
    cov(dose,a1)
    cov(dose,a2)
    cov(dose,a3)

(a) If you use the model given by equation (5) to estimate $E(Y \mid a = 2) - E(Y \mid a = 3)$, would you get the same value as you got in question 13(a)?
(b) If you use the model given by equation (5) to calculate a p-value to test the null hypothesis of no exercise effect, would you get the same p-value as you did in question 13? If not, would the p-value be larger or smaller?
(c) If you are only interested in testing for an exercise effect (and do not care about the dose effect), should you use a model with or a model without dose? Why?

Reposted from: http://ass.3daixie.com/2018052158549125.html
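Note 2 describes the F statistic in the R printout as a model comparison between model 2 and the null model. The sketch below illustrates that idea on simulated stand-in data, since the homework dataset is not reproduced here. The design (5 doses, 3 exercise types, 2 replications) follows the text, but the coefficients used to generate y, the seed, and names such as `f_manual` are assumptions made only for illustration.

```r
# Simulated stand-in data (NOT the homework data): same balanced design,
# 5 doses x 3 exercise types x 2 replications = 30 subjects.
set.seed(1)
design <- expand.grid(dose = c(50, 55, 60, 65, 70), exercise = 1:3, rep = 1:2)
dose <- design$dose
exercise <- design$exercise
a2 <- as.numeric(exercise == 2)
a3 <- as.numeric(exercise == 3)
# Hypothetical true coefficients, chosen only for illustration.
y <- -60 + 1.2 * dose + 30 * a2 + 35 * a3 + rnorm(30, sd = 20)

full <- lm(y ~ dose + a2 + a3)  # model 2 (walk is the baseline)
null <- lm(y ~ 1)               # null model: intercept only

# Model comparison F statistic built from the two error sums of squares.
sse_full <- sum(resid(full)^2)
sse_null <- sum(resid(null)^2)
df_full <- df.residual(full)    # 30 - 4 = 26
df_null <- df.residual(null)    # 30 - 1 = 29
f_manual <- ((sse_null - sse_full) / (df_null - df_full)) / (sse_full / df_full)
p_manual <- pf(f_manual, df_null - df_full, df_full, lower.tail = FALSE)

# The same F statistic as reported by summary() and by anova().
f_summary <- unname(summary(full)$fstatistic["value"])
f_anova <- anova(null, full)$F[2]

# Note 1: the residual standard error is the square root of the MSE.
mse <- sse_full / df_full
rse <- summary(full)$sigma
```

Replacing `null` with any other reduced model nested in `full` gives the corresponding model comparison test in the same way.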
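Question 8 explains that shifting dose by a constant turns the group coefficients of the means parameterization into expected values at that dose. The sketch below checks this on simulated stand-in data by comparing those coefficients with predictions from the walk-baseline parameterization. The generating coefficients, the seed, and the names `c0`, `dosec`, `centered`, and `baseline` are assumptions for illustration and do not come from the assignment.

```r
# Simulated stand-in data (NOT the homework data), same balanced design.
set.seed(2)
design <- expand.grid(dose = c(50, 55, 60, 65, 70), exercise = 1:3, rep = 1:2)
dose <- design$dose
exercise <- design$exercise
a1 <- as.numeric(exercise == 1)
a2 <- as.numeric(exercise == 2)
a3 <- as.numeric(exercise == 3)
# Hypothetical true coefficients, chosen only for illustration.
y <- -60 + 1.2 * dose + 30 * a2 + 35 * a3 + rnorm(30, sd = 20)

c0 <- 55                 # the dose at which we want the group means
dosec <- dose - c0       # shifted dose variable

# Means parameterization with shifted dose: a1, a2, a3 coefficients are the
# estimated expected values for walk, bike, jog when dosec = 0 (dose = c0).
centered <- lm(y ~ dosec - 1 + a1 + a2 + a3)
group_means_at_c0 <- unname(coef(centered)[c("a1", "a2", "a3")])

# The same three quantities from the walk-baseline parameterization.
baseline <- lm(y ~ dose + a2 + a3)
newdata <- data.frame(dose = c0, a2 = c(0, 1, 0), a3 = c(0, 0, 1))
pred_at_c0 <- unname(predict(baseline, newdata))

# Both parameterizations give the same fitted values for every observation.
same_fit <- isTRUE(all.equal(unname(fitted(centered)), unname(fitted(baseline))))
```

Changing `c0` changes which expected values the group coefficients estimate, while the fitted values stay the same.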
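Question 13(g) relies on the fact that a model comparison F test with one numerator degree of freedom agrees with the t test for the single coefficient being dropped. The sketch below shows that agreement for the dose coefficient rather than for the biking versus jogging contrast, so it illustrates the principle without working the question. The data are simulated stand-ins, and the generating coefficients, seed, and variable names are assumptions.

```r
# Simulated stand-in data (NOT the homework data), same balanced design.
set.seed(3)
design <- expand.grid(dose = c(50, 55, 60, 65, 70), exercise = 1:3, rep = 1:2)
dose <- design$dose
exercise <- design$exercise
a2 <- as.numeric(exercise == 2)
a3 <- as.numeric(exercise == 3)
# Hypothetical true coefficients, chosen only for illustration.
y <- -60 + 1.2 * dose + 30 * a2 + 35 * a3 + rnorm(30, sd = 20)

full <- lm(y ~ dose + a2 + a3)  # model with dose and exercise
reduced <- lm(y ~ a2 + a3)      # same model with the dose slope set to 0

# Model comparison F test: one numerator degree of freedom.
cmp <- anova(reduced, full)
f_stat <- cmp$F[2]
p_f <- cmp$`Pr(>F)`[2]

# t test for the dose coefficient, read straight from the summary table.
coef_table <- coef(summary(full))
t_stat <- coef_table["dose", "t value"]
p_t <- coef_table["dose", "Pr(>|t|)"]

# With one numerator degree of freedom, t^2 equals F and the p-values match.
c(t_squared = t_stat^2, F = f_stat, p_t = p_t, p_F = p_f)
```

The same comparison applies to any hypothesis that can be written as a single coefficient equal to zero after a suitable reparameterization.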
