Week 1: R
Reference book: Data Analysis for the Life Sciences
Reference videos:
Started a new series on statistics : )
Getting Started with R
- If you are not familiar with basic R syntax, you can install swirl, an interactive R tutorial
install.packages("swirl")
library(swirl)
swirl()
- Important packages
library(rafalib)
library(downloader) # downloads files from within R
install.packages("devtools") # for installing packages from GitHub
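The library() calls above fail if a package has not been installed yet. A minimal sketch of a check to run first follows. The helper name missing_packages is hypothetical and not part of the course material, and the install line is left commented out so the block has no side effects.

```r
# Hypothetical helper: return the entries of `pkgs` that are not installed yet
missing_packages <- function(pkgs) {
  setdiff(pkgs, rownames(installed.packages()))
}

needed <- c("rafalib", "downloader", "devtools")
to_install <- missing_packages(needed)
print(to_install)

# Uncomment to install whatever is missing:
# if (length(to_install) > 0) install.packages(to_install)
```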
GitHub
https://github.com/genomicsclass
labs: source code for the course
dagdata: raw data needed for the course
Download from within R
- downloader
Downloads a file into the current working directory (the Rproj directory, or wherever setwd() points)
library(downloader)
url <- "https://raw.githubusercontent.com/genomicsclass/dagdata/master/inst/extdata/femaleMiceWeights.csv"
filename <- "femaleMiceWeights.csv"
download(url, destfile=filename)
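# Alternative: build the URL from its parts and download only if the file is missing
dir <- "https://raw.githubusercontent.com/genomicsclass/dagdata/master/inst/extdata/"
filename <- "femaleMiceWeights.csv"
url <- paste0(dir, filename)
if (!file.exists(filename)) download(url, destfile=filename)
dat <- read.csv(filename) # read the local copy instead of fetching the URL again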
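The download-only-if-missing pattern can be wrapped in a small function. This is a sketch using base R's download.file rather than the downloader package. The helper name fetch_if_missing, the placeholder URL and the single data row are all made up for illustration.

```r
# Hypothetical helper: download `url` to `destfile` only when no local copy exists
fetch_if_missing <- function(url, destfile) {
  if (!file.exists(destfile)) {
    download.file(url, destfile = destfile, mode = "wb")
  }
  invisible(destfile)
}

# Demonstration without network access: the file already exists,
# so the (placeholder) URL is never contacted
tmp <- tempfile(fileext = ".csv")
writeLines(c("Diet,Bodyweight", "chow,21.5"), tmp)  # made-up row, not course data
path <- fetch_if_missing("https://example.invalid/unused.csv", tmp)
dat_demo <- read.csv(path)
```

Returning the path makes it easy to pass the result straight to read.csv().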
- devtools
library(devtools)
install_github("genomicsclass/dagdata")
# Extract the install location of the package
dir <- system.file(package="dagdata")
list.files(dir)
list.files(file.path(dir,"extdata"))
# [1] "admissions.csv" "astronomicalunit.csv" "babies.txt"
# [4] "femaleControlsPopulation.csv" "femaleMiceWeights.csv" "mice_pheno.csv"
# [7] "msleep_ggplot2.csv" "README" "spider_wolff_gorb_2013.csv"
# The file is not in the current directory, so the full path must be given
filename <- file.path(dir,"extdata/femaleMiceWeights.csv")
dat <- read.csv(filename)
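system.file() returns an empty string when the package or path cannot be found, so read.csv() would then fail with a confusing message. A small sketch of a guard follows. The package name notARealPackage12345 is a made-up placeholder, and the block does not assume dagdata is installed.

```r
# An installed package gives a non-empty path
stats_dir <- system.file(package = "stats")
print(nzchar(stats_dir))

# A package that is not installed gives "" (made-up placeholder name)
missing_dir <- system.file(package = "notARealPackage12345")
print(nzchar(missing_dir))

# Guard before listing or reading files from the package directory
dag_dir <- system.file("extdata", package = "dagdata")
if (nzchar(dag_dir)) {
  print(list.files(dag_dir))
} else {
  message("dagdata not found; install it with install_github first")
}
```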
- Exercises
> 1
# The exercises below can all be simplified with dplyr
> 2
dat[12,2] # honestly did not understand how the "and" was meant to be used
# [1] 26.25
> 3
dat$Bodyweight[11]
# [1] 26.91
> 4
length(dat$Bodyweight)
# [1] 24
> 5
mean(dat[seq(13,24),2])
# [1] 26.83417
> 6
set.seed(1)
sample(dat[seq(13,24),2],1)
# [1] 34.02
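As noted in exercise 1, these answers can be written with dplyr by selecting rows by label instead of by index. The sketch below uses a made-up toy data frame with the same column names. It assumes that rows 13 to 24 of the real file are the high-fat group and that this group is labelled "hf", which the notes above do not confirm.

```r
library(dplyr)

# Hypothetical toy data with the same column names as femaleMiceWeights.csv
toy <- data.frame(
  Diet = rep(c("chow", "hf"), each = 3),
  Bodyweight = c(20, 22, 24, 25, 27, 29)
)

# Mean body weight of one diet group, selected by label rather than row number
hf_summary <- toy %>%
  filter(Diet == "hf") %>%
  summarise(mean_weight = mean(Bodyweight))

hf_mean <- hf_summary$mean_weight
print(hf_mean)
```

Filtering on the Diet column keeps working if the rows are reordered, which index-based code such as dat[seq(13,24),2] does not.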
Brief Introduction to dplyr
- dplyr + unlist
unlist strips the data.frame structure and returns a plain vector.
If dplyr receives a data.frame, it will return a data.frame. To obtain a numeric vector with dplyr, we can apply the unlist function, which turns lists, such as data.frames, into numeric vectors.
library(dplyr)
chowVals <- filter(dat, Diet=="chow") %>%
  select(Bodyweight)
class(chowVals)
# [1] "data.frame"
chowVals <- filter(dat, Diet=="chow") %>%
  select(Bodyweight) %>%
  unlist()
class(chowVals)
# [1] "numeric"
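Two details of unlist are easy to miss: the result carries names derived from the column name, and the result is numeric only when every column is numeric. A base R sketch with made-up values follows. In dplyr pipelines, pull() is another way to extract a single column as a vector.

```r
# Toy one-column data frame (made-up values)
df <- data.frame(Bodyweight = c(21.5, 28.1))

v1 <- unlist(df)          # numeric vector with names built from the column name
v2 <- df[["Bodyweight"]]  # plain numeric vector without names

print(class(v1))
print(names(v1))
print(names(v2))

# Mixed column types are coerced to a common type, here character
mixed <- data.frame(
  Diet = c("chow", "hf"),
  Bodyweight = c(21.5, 28.1),
  stringsAsFactors = FALSE
)
print(class(unlist(mixed)))
```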
- Exercises
> 1
dir <- system.file(package="dagdata")
filename <- file.path(dir,"extdata/msleep_ggplot2.csv")
dat <- read.csv(filename)
class(dat)
# [1] "data.frame"
> 2
primates <- dat %>%
  filter(order == 'Primates')
nrow(primates)
# [1] 12
> 3
class(primates)
# [1] "data.frame"
> 4
primates_st <- primates %>%
  select(sleep_total)
class(primates_st)
# [1] "data.frame"
> 5
primates %>%
  select(sleep_total) %>%
  unlist() %>%
  mean()
# [1] 10.5
> 6
primates %>%
  select(sleep_total) %>%
  summarise(mean(sleep_total))
# mean(sleep_total)
# 1 10.5
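Exercises 5 and 6 compute the same number but return different types: unlist() followed by mean() gives a bare numeric value, while summarise() gives a one-row data.frame. A sketch on made-up toy data (not the msleep values) makes the difference visible.

```r
library(dplyr)

# Hypothetical toy data with the same column names as msleep_ggplot2.csv
toy <- data.frame(
  order = c("Primates", "Primates", "Rodentia"),
  sleep_total = c(8, 12, 14)
)

# Route 1: drop to a vector, then call mean()
as_number <- toy %>%
  filter(order == "Primates") %>%
  select(sleep_total) %>%
  unlist() %>%
  mean()

# Route 2: stay inside dplyr and name the summary column
as_table <- toy %>%
  filter(order == "Primates") %>%
  summarise(mean_sleep = mean(sleep_total))

print(class(as_number))
print(class(as_table))
```

Naming the column inside summarise() avoids the awkward default header mean(sleep_total) seen in the output above.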