Assignment 4 – Module 3

1. Instructions

This assignment is worth a total of 9 points toward your final grade. It consists of two sections. In Section 1, you will work with trade input and output data and learn how to manipulate them. In Section 2, you will learn cluster analysis and work with some health data.

1) Course materials – Jim has made an R textbook available on Canvas. Go to “Library Online Course Reserves” and you will find the e-book “R: predictive analysis: master the art of predictive modeling”, available from 2019-10-01 to 2019-12-23. Before beginning this assignment, please spend some time reading the relevant sections of the textbook, especially Chapters 1.3 (visualization methods) and 2.3 (cluster analysis). Students taking the data analytics module next semester may want to study the book further during their holiday break.

2) Submission of assignment – there are two ways to submit your assignment:
   a. RMarkdown format: you can submit your assignment as an RMarkdown file (.RMD). Describe clearly in the file the steps of your code, including an explanation of why you used a particular piece of code or function.
      i. The advantage that RMarkdown has over Word is that you do not need to worry about formatting. Just type your comments and code as you would in a script file and the package will knit everything into an HTML document. Outputs and graphs are also generated automatically under your code boxes when you run them. However, you still need to know the syntax for creating code boxes. All code must be bounded by an opening ```{r} and a closing ```.
      ii. The following YouTube video teaches some basics of using RMarkdown. Take time to watch the video and then decide whether you want to use RMarkdown. https://www.youtube.com/watch?v=DNS7i2m4sB0
   b. PDF format: you can also submit your assignment in PDF format. Copy and paste snippets of your code to go with your explanations.
Your answers should follow this format:
   Text explanations should be in black against a white background.
   Code / scripts should be shown in black letters in a grey box.
This will enable us to differentiate more easily between code and explanations. Always provide explanations for your code.
For visual outputs (graphs, screenshots, etc.), you can try using UBC’s free Snagit screen capture program. In the leftmost column of your Canvas account, click on “Help” >>> “Software Distribution”. Choose the “Snagit” application, add it to your cart and follow the download and installation instructions.

3) Assignment due date – this assignment is due at 11.59am on December 2nd 2019.

4) If you have any questions about the assignment, you can contact either Hamzeh or Wei Siang. Their emails and office hours are as follows:
   a. Hamzeh – seh793@mail.usask.ca, Mondays & Wednesdays 10.30am-12 Noon at MCML154.
   b. Wei Siang – weisiang.chan@gmail.com, Tuesdays 10.30am-12 Noon at MCML154.

5) If you face problems with your code, send an email to Hamzeh. Include in your email:
   a. Your full code;
   b. The error message shown in your console; and
   c. The line at which the problem appeared (if possible).

6) The data for this assignment can be downloaded from Canvas:
   a. On the FRE 501 home page, click on “Canvas module” under “Module 3 (Hamzeh)”.
   b. Scroll down to “Data” and you’ll see the files you’ll need to download for this assignment.
   c. Find the file titled “wiot_stats_sep12.zip”.
   d. Download the file onto your computer. As the file is very big (260.2 MB), the download may take some time.
   e. Unzip the file. Double-click on the zip file and click “extract all”. A new folder will be created with the unzipped files.
   f. Open the file in R (DO NOT use Excel, as this will hang the program). Open RStudio and select “file”, “import dataset”, and “from Stata…”.
   g. A new window will pop up. Browse the unzipped folder and select the Stata file titled “woit_full”.
   h. Cancel the data preview (or your computer will take a very long time to load the data). Click “import”. The dataset should load into RStudio.

2. Section 1 – Working with Trade Input / Output Data

With the United States–China Relations Act of 2000, China was allowed to join the WTO in 2001. Bill Clinton, the U.S. president in 2000, put considerable effort into convincing the U.S. Congress to approve the trade agreement between the U.S. and China. Clinton believed that higher levels of trade with China were in the interest of the U.S. economy. However, American authorities generally argue that China hinders open trade and does not open its market to the U.S. as the U.S. does to China.

Your task is to provide some preliminary evidence about the claims made by the U.S. authorities, focusing on the food industry in both countries. Please use the package “tidyverse” to conduct your analyses.

POINTS:
◦ Question 1-1: 3/100
◦ Question 1-2: 2/100
◦ Question 1-3: 5/100
◦ Question 1-4: 10/100
◦ Question 1-5: 30/100

1-1. Use the WIOT dataset to make two subsamples. In the first subsample we are looking for the contribution of the U.S. agricultural sector (row_item=1 and 64) to the value added of China’s food industry (col_item=3). The second subsample covers the contribution of China’s agricultural sector (row_item=1 and 64) to the value added of the U.S. food industry (col_item=3). (Consult slides 18 to 23 of the GVC_RCA lecture notes.)

1-2. Calculate the share of the agricultural industry in the value added of the food industry for each subsample you made (consult slide 24 of the GVC_RCA lecture notes).

1-3. Make two graphs showing the changes in the share of the agricultural industry in the value added of the food industry from 1996 to 2010, one for each subsample (consult slides 26 to 33 of the GVC_RCA lecture notes). Use the package gridExtra to combine the graphs.

1-4. In a short paragraph, explain whether the U.S. authorities’ claims seem to be true, so that the WTO needs to conduct an investigation, or whether the statement is wrong.
In particular, focus on the trends in both graphs before and after 2001, when China joined the WTO.

1-5. Find the share of China’s agricultural industry in the total output value of the agricultural and food industries of all countries from 1995 to 2011. (HINT 1: use the group_by and summarise functions. HINT 2: group by several variables.) Plot your findings, where the Y axis is the % share of China’s agricultural industry in the total output value of the agricultural and food industries of all countries and the X axis is the year.

3. Section 2 – Cluster Analysis of Health Data

There is a variable in the cluster_data dataset called inc_hh. This is a categorical variable ranging from 1 to 8 that shows the household income level of each individual. If inc_hh=1, the annual household income of the individual is between $0 and $19,999; likewise, inc_hh=7 means the annual household income is between $120,000 and $139,999. The final income level (inc_hh=8) covers those Canadians whose annual household income is equal to or greater than $140,000.

In class, we found the dietary patterns of all Canadian adults in the dataset. The questions below can be answered using your lecture notes.

• Points
◦ Question 2-1: 15/100
◦ Question 2-2: 10/100
◦ Question 2-3: 10/100
◦ Question 2-4: 15/100

1- Please use k-means cluster analysis to identify the dietary patterns of individuals with the lowest income level (i.e. inc_hh=1) and of those with an income level between $120,000 and $139,999 (i.e. inc_hh=7). Report the average intakes of the 9 food groups (using the food dataset we used in class) across these two income groups (1 and 7). (Use the dplyr package for data management, fviz_nbclust and NbClust to find the optimal number of clusters, and the kmeans function to conduct the k-means cluster analysis. Please consult slides 58 to 66 of the cluster_analysis lecture notes.)

2- In the main dataset we have two variables called bmi_total and nrf.
The first variable indicates the body mass index (BMI) of each individual and the second indicates the diet quality score of each individual based on the Nutrient Rich Food index. Please find the average BMI and NRF across the clusters identified for each income group separately, using the group_by and summarise functions. (Please consult slide 77 of the cluster_analysis lecture notes.)

3- Compare the frequencies of those Canadians who have a High Quality diet across the two income groups using the freq function. Also use the “descr” package to report the prevalence of males with a high quality diet in each income group (please consult slides 67 to 76 of the cluster_analysis lecture notes).

4- People in the lowest income groups tend to be more obese than those in the highest income groups. Adam believes that because healthier food options are more expensive, poor people tend to eat more unhealthy foods and are therefore more likely to be obese. However, Bill argues that because of technological advancements in the agricultural sector, foods are available to most people in developed countries at relatively low prices, so we cannot blame the lower prices of unhealthy foods for the higher prevalence of obesity among poor people. Using your answers to questions 1 and 3, discuss in a short paragraph whether you support Adam or Bill.
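
The filter-then-cluster workflow that Section 2 asks for (subset one income group, run kmeans, then average by cluster with group_by and summarise) can be sketched roughly as follows. This is only a minimal illustration on made-up data, not the code from the lecture notes: the data frame `toy`, the three food-group columns (`fruit`, `veg`, `meat`), the helper `cluster_income_group` and the choice of k = 2 are all assumptions made for the example. In the assignment, the number of clusters should come from fviz_nbclust / NbClust, and whether to standardise the intakes should follow the lecture slides.

```r
library(dplyr)

set.seed(42)  # k-means starts from random centres, so fix the seed

# Toy stand-in for cluster_data. Only inc_hh is a name from the assignment;
# fruit, veg and meat are hypothetical food-group columns.
n_obs <- 200
toy <- tibble(
  inc_hh = sample(c(1, 7), n_obs, replace = TRUE),
  fruit  = runif(n_obs, 0, 5),
  veg    = runif(n_obs, 0, 5),
  meat   = runif(n_obs, 0, 5)
)

# Hypothetical helper: cluster one income group and attach the cluster label.
cluster_income_group <- function(data, level, k) {
  grp   <- filter(data, inc_hh == level)
  foods <- scale(select(grp, fruit, veg, meat))  # standardise (an assumption)
  fit   <- kmeans(foods, centers = k, nstart = 25)
  mutate(grp, cluster = fit$cluster)
}

# k = 2 is a placeholder, not a recommended number of clusters.
low  <- cluster_income_group(toy, level = 1, k = 2)
high <- cluster_income_group(toy, level = 7, k = 2)

# Average intake of each food group within each income group and cluster.
profile <- bind_rows(low, high) %>%
  group_by(inc_hh, cluster) %>%
  summarise(across(c(fruit, veg, meat), mean), size = n(), .groups = "drop")

print(profile)
```

Clustering each income group separately, rather than clustering everyone and splitting afterwards, keeps the dietary patterns specific to that group. The same group_by / summarise step can be reused for variables such as bmi_total and nrf once the cluster labels are attached.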