Econ 325 (004)
Winter Session, Term 1, 2019
M. Vaney

Lab 2 - Demonstration of the Central Limit Theorem
Due: Monday November 25. Submit your work online.

Purpose
In this lab R is used to demonstrate the Central Limit Theorem, a theorem that provides a theoretical basis for estimation and inference even for underlying populations that are not normally distributed. The lab reinforces the use of script files as an efficient way to execute a series of commands and the use of loops to automate repetitive tasks. The lab also introduces a few additional R commands.

Central Limit Theorem
Given a random sample of size $n$ from an underlying distribution $f(x)$ with $-\infty < \mu < \infty$ (finite mean) and $0 < \sigma^2 < \infty$ (finite variance), the sample mean $\bar{X}_n$ is approximately normal with $\bar{X}_n \sim N(\mu, \sigma^2/n)$. This can also be expressed as $\sqrt{n}\,(\bar{X}_n - \mu)/\sigma \to N(0, 1)$ in distribution as $n \to \infty$.

One implication of this for estimation is that even if the underlying distribution is not normal, by appealing to the Central Limit Theorem we may treat the sample mean, $\bar{X}_n$, as an approximately normally distributed random variable. The following figure shows the underlying distribution of a random variable $X$ as a solid line. Clearly $X$ is not normally distributed: $X$ has realizations only over the interval $[0, 3]$ rather than $(-\infty, \infty)$, $X$ is not symmetric, and $X$ is not uni-modal. However, taking random samples of size $n$ and computing the sample mean for each different random sample, we see that the distribution of the sample mean (red dashed line) has many of the features characteristic of a normally distributed random variable (uni-modal, symmetric, bell-shaped).

How closely the sample mean conforms to a normal distribution will depend on features of the underlying distribution and the sample size. The larger the sample size, the more closely the distribution of the sample mean will resemble that of a normally distributed random variable.

Data and Methodology
A number of 'populations' are provided.
In order to demonstrate the CLT it will be necessary to describe the distribution of the sample mean for each of the populations.

Data
The file lab2-variables.csv contains N = 700 observations for each of 5 random variables (called x1, ..., x5). Each of these can be thought of as a different population with a given underlying distribution f(x1), g(x2), ..., k(x5).

Methods
Use R to carry out the following tasks:
1. (a) Generate summary statistics and create histograms for each of the 5 variables.
   (b) Draw 1000 random samples of size n = 4, 25 and 144 for each of the random variables (without replacement). Compute the sample mean for each random sample and construct a histogram of the sample means.

R commands
This lab will make use of some commands that are found in two additional packages available in R: dplyr and ggplot2. Both of these packages must be loaded in R. You can check to see which packages are loaded by selecting the Packages tab in the lower right corner of the screen. If a package has not been installed, the following command can be entered in the console:

    install.packages("ggplot2")

The ggplot2 package will then be installed (this may take a minute or two).

In order to make use of the additional commands available in a package, your script file must refer to the package through a library command. It is best to start the script by specifying the required packages:

    library(ggplot2)
    library(dplyr)

The dplyr package has a number of commands that are useful for re-organizing data. The command that we will use in this lab is sample_n(data, sample size).

The ggplot2 package is used for making various graphs and figures. A very useful resource for creating histograms in ggplot2 can be found at the link provided in the Lab folder on Canvas.

The sample_n() command will draw a single random sample (of rows of a dataset) of a specific size, n.
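As an illustration of a single draw, the following is a minimal sketch rather than the required solution. It assumes a stand-in data frame called pop with one simulated column x1, because the lab data are not reproduced here; the name pop, the seed and the exponential distribution are assumptions made only for this example.

```r
library(dplyr)

set.seed(325)  # fixed seed so the draw is reproducible

# Hypothetical stand-in for lab2-variables.csv: 700 rows, one skewed column.
# In the lab itself the data would instead be read with
# read.csv("lab2-variables.csv").
pop <- data.frame(x1 = rexp(700, rate = 1))

# Draw one random sample of 25 rows.
# sample_n() samples without replacement by default.
one_sample <- sample_n(pop, 25)

# Sample mean of x1 for this single draw
xbar <- mean(one_sample$x1)
xbar
```

Each call to sample_n() returns a new data frame with n rows, so running the last three statements again gives a different sample and a different sample mean.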
To generate 1000 random samples, the sample_n() command, along with a command to take the mean, can be embedded in the command replicate(), which will repeat these commands a specified number of times.

Results and Discussion
Present and provide some discussion of the following:
Submit your script file for this lab. Do not submit raw data.
1. (a) Consider the summary statistics and graphics for the underlying populations. Do the underlying distributions appear to be normally distributed? Comment on the apparent distributions of each of the variables (symmetric, skewed, number of modes, difference between mean and median, etc.).
   (b) Discuss how changing the size of the sample alters the distribution of the sample mean for each of the different variables. Do the results conform with the prediction of the Central Limit Theorem?
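The repeated sampling described under R commands (sample_n() and a mean embedded in replicate()) could be sketched as follows. This is one possible arrangement under stated assumptions, not the expected answer: pop is a simulated stand-in for the lab data, and the helper name sample_means is hypothetical.

```r
library(dplyr)
library(ggplot2)

set.seed(325)  # fixed seed so the results are reproducible

# Hypothetical stand-in for one population from lab2-variables.csv
pop <- data.frame(x1 = rexp(700, rate = 1))

# Hypothetical helper: draw `reps` samples of size n (without replacement)
# and return the mean of column `var` for each sample.
sample_means <- function(data, var, n, reps = 1000) {
  replicate(reps, mean(sample_n(data, n)[[var]]))
}

# Sample sizes named in the Methods section
sizes <- c(4, 25, 144)
means_by_n <- lapply(sizes, function(n) sample_means(pop, "x1", n))
names(means_by_n) <- paste0("n", sizes)

# Spread of the sample mean for each sample size
sapply(means_by_n, sd)

# Histogram of the 1000 sample means for n = 25
p <- ggplot(data.frame(xbar = means_by_n$n25), aes(x = xbar)) +
  geom_histogram(bins = 30) +
  labs(x = "Sample mean of x1 (n = 25)", y = "Count")
print(p)
```

Wrapping the draw in a function keeps the replicate() call short and lets the same code be reused for each variable and each sample size.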