Subsetting and Sorting
Subsetting
Logical ANDs and ORs
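A minimal sketch of subsetting with logical conditions. The data frame `X` and its values are made up for illustration and are not from the notes.

```r
# Illustrative data frame; names and values are assumptions for this sketch
X <- data.frame(var1 = c(2, 1, 3, 5, 4),
                var2 = c(NA, 10, NA, 6, 9),
                var3 = c(15, 11, 12, 14, 13))

# Subset by column position, or by row range and column name
first_col <- X[, 1]
rows_1_2  <- X[1:2, "var2"]

# Logical AND: rows where both conditions hold
both <- X[X$var1 <= 3 & X$var3 > 11, ]

# Logical OR: rows where at least one condition holds
either <- X[X$var1 <= 3 | X$var3 > 15, ]

# which() skips NA comparisons, so rows with a missing var2 are not returned
no_na <- X[which(X$var2 > 8), ]
```

Indexing with the bare condition `X$var2 > 8` instead of `which()` would return extra all-NA rows for the missing values.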
Sorting
Ordering
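Note that order() returns positions, not values: each element of the result is the current position of the value that belongs in that slot. For example, when sorting in descending order the first slot should hold the maximum; if the maximum currently sits at position 23, the first value returned by order() is 23.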
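A small sketch contrasting sort() and order(). The vector and data frame are illustrative assumptions.

```r
x <- c(30, 10, 50, 20)  # illustrative values

sorted_up   <- sort(x)                     # 10 20 30 50
sorted_down <- sort(x, decreasing = TRUE)  # 50 30 20 10

# order() returns indices into x, not the sorted values
idx <- order(x, decreasing = TRUE)  # 3 1 4 2: the maximum sits at position 3
via_order <- x[idx]                 # same result as sort(x, decreasing = TRUE)

# Order a data frame by column a, breaking ties with column b
df <- data.frame(a = c(2, 1, 2, 1), b = c(4, 3, 1, 2))
df_sorted <- df[order(df$a, df$b), ]
```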
Ordering with plyr
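A sketch using arrange() from the plyr package, assuming plyr is installed. Functions are called with the `plyr::` prefix to avoid name clashes with dplyr; the data frame is made up.

```r
# Illustrative data frame (an assumption for this sketch)
df <- data.frame(var1 = c(3, 1, 2), var2 = c("c", "a", "b"))

asc <- plyr::arrange(df, var1)              # increasing var1
dsc <- plyr::arrange(df, plyr::desc(var1))  # decreasing var1
```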
Adding rows and columns
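One possible way to add columns and rows in base R, on a made-up data frame.

```r
X <- data.frame(var1 = 1:3, var2 = c(10, 20, 30))  # illustrative data

# New column by assignment
X$var3 <- X$var1 * 2

# cbind() binds a column on the right
Y <- cbind(X, var4 = c("a", "b", "c"))

# rbind() binds a row at the bottom; column names must match
Z <- rbind(X, data.frame(var1 = 4L, var2 = 40, var3 = 8))
```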
Summarizing data
Look at a bit of the data
head()
tail()
Make summary
summary()
More in-depth information
str()
Quantiles of quantitative variables
quantile()
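A quick sketch of these inspection functions on the built-in `mtcars` data set, chosen here only for illustration.

```r
first3 <- head(mtcars, n = 3)  # first three rows
last3  <- tail(mtcars, n = 3)  # last three rows

summary(mtcars$mpg)  # min, quartiles, mean, max
str(mtcars)          # column types and a preview of the values

# Default quantiles are 0%, 25%, 50%, 75%, 100%
q_default <- quantile(mtcars$mpg, na.rm = TRUE)

# Custom probabilities
q_custom <- quantile(mtcars$mpg, probs = c(0.5, 0.75, 0.9))
```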
Make table
table()
table() can also produce two-dimensional tables
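A sketch of one-way and two-way tables on `mtcars` (an illustrative choice of data set).

```r
# One-way table; useNA = "ifany" adds an NA column when missing values exist
tab1 <- table(mtcars$cyl, useNA = "ifany")

# Two-way table: cylinders in rows, gears in columns
tab2 <- table(mtcars$cyl, mtcars$gear)
```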
Check for missing values
sum()
any()
all()
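A sketch of the three checks on the built-in `airquality` data set, which contains missing values in `Ozone`.

```r
# Number of missing values in one column
n_missing <- sum(is.na(airquality$Ozone))

# Is at least one value missing?
any_missing <- any(is.na(airquality$Ozone))

# Does every value satisfy a condition?
all_positive <- all(airquality$Wind > 0)
```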
Row and column sums
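Row and column sums combine naturally with is.na() to count missing values per column or per row. A sketch on `airquality`:

```r
na_per_col <- colSums(is.na(airquality))  # missing values in each column
na_per_row <- rowSums(is.na(airquality))  # missing values in each row

# TRUE only if no column has any missing value
complete <- all(na_per_col == 0)
```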
Values with specific characteristics
%in%
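A sketch of %in% for matching against a set of values, again on `mtcars` for illustration.

```r
# Logical vector: TRUE where cyl is 4 or 6
is_small <- mtcars$cyl %in% c(4, 6)
table(is_small)

# Use it to subset rows
small_engines <- mtcars[is_small, ]
```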
Cross tabs
Flat tables
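A sketch of xtabs() and ftable() on the built-in `UCBAdmissions` and `warpbreaks` data sets. The `replicate` column is added here only to obtain a third dimension.

```r
# Cross tab: sum Freq broken down by Gender and Admit
DF <- as.data.frame(UCBAdmissions)
xt <- xtabs(Freq ~ Gender + Admit, data = DF)

# Three-way cross tab on a copy of warpbreaks
wb <- warpbreaks
wb$replicate <- rep(1:9, length.out = nrow(wb))  # illustrative extra variable
xt3 <- xtabs(breaks ~ ., data = wb)

# ftable() flattens the multi-way table into a compact two-dimensional layout
ft <- ftable(xt3)
```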
Size of a data set
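A sketch of checking dimensions and memory footprint; `fake` is a throwaway vector created for the example.

```r
dim(mtcars)  # rows and columns

fake <- rnorm(1e5)               # illustrative data
sz <- object.size(fake)          # size in bytes
sz_mb <- format(sz, units = "Mb")  # human-readable string
```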
Creating New Variables
Creating sequences
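A sketch of common ways to build an index sequence with seq().

```r
s1 <- seq(1, 10, by = 2)          # 1 3 5 7 9
s2 <- seq(1, 10, length.out = 3)  # 1.0 5.5 10.0

x <- c(1, 3, 8, 25, 100)
s3 <- seq_along(x)                # 1 2 3 4 5, one index per element of x
```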
Subsetting variables
Creating binary variables
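A sketch of a binary variable built with ifelse(). The threshold of 25 mpg and the column name `highMpg` are arbitrary choices for illustration.

```r
cars <- mtcars  # work on a copy

# TRUE when mpg exceeds the (arbitrary) threshold, FALSE otherwise
cars$highMpg <- ifelse(cars$mpg > 25, TRUE, FALSE)

# Cross-check the new variable against the condition
table(cars$highMpg, cars$mpg > 25)
```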
Creating categorical variables
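A sketch of turning a quantitative variable into categories with cut(), using quantiles as break points. By default the intervals are open on the left, so the minimum value becomes NA unless `include.lowest = TRUE` is set.

```r
brks <- quantile(mtcars$mpg)

# Default: the minimum falls outside the first interval and becomes NA
groups_default <- cut(mtcars$mpg, breaks = brks)

# include.lowest = TRUE closes the first interval on the left
groups <- cut(mtcars$mpg, breaks = brks, include.lowest = TRUE)
table(groups)
```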
Easier cutting
Hmisc::cut2
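A sketch assuming the Hmisc package is installed: cut2() with `g = 4` splits the variable into four quantile groups without computing the breaks by hand.

```r
# g = 4 requests four quantile groups
groups <- Hmisc::cut2(mtcars$mpg, g = 4)
table(groups)
```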
Creating factor variables
Levels of factor variables
Cutting produces factor variables
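A sketch of creating factors, controlling level order, and confirming that cut() returns a factor. The `yesno` vector is randomly generated for illustration.

```r
set.seed(13435)
yesno <- sample(c("yes", "no"), size = 10, replace = TRUE)

# Without levels =, the levels would be ordered alphabetically ("no", "yes")
yesnofac <- factor(yesno, levels = c("yes", "no"))

# relevel() moves the reference level to the front
releveled <- relevel(yesnofac, ref = "no")

# Integer codes follow level order: 1 for "yes", 2 for "no"
codes <- as.numeric(yesnofac)

# cut() returns a factor
cutfac <- cut(mtcars$mpg, breaks = 3)
class(cutfac)
```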
Using the mutate function
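A sketch of mutate() from the plyr package (assumed installed), which returns a copy of the data frame with the new variable added. The derived column `hp_per_wt` is an illustrative choice.

```r
# Original data frame is left unchanged; the result has one extra column
cars2 <- plyr::mutate(mtcars, hp_per_wt = hp / wt)
```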
Common transforms
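A sketch of frequently used base R transforms, applied to an arbitrary value.

```r
x <- 3.475  # illustrative value

a <- abs(-x)               # absolute value: 3.475
s <- sqrt(16)              # square root: 4
up <- ceiling(x)           # 4
down <- floor(x)           # 3
r <- round(x, digits = 1)  # 3.5
g <- signif(x, digits = 2) # 3.5

# Also common: cos(x), sin(x), log(x) (natural log), log2(x), log10(x), exp(x)
```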
Reshaping data
Start with reshaping
Melting data frames
Casting data frames
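A sketch of melting and casting with the reshape2 package (assumed installed), using `mtcars`. The choice of id and measure variables is illustrative.

```r
cars <- mtcars
cars$carname <- rownames(cars)

# Melt: one row per (car, measured variable) pair
carMelt <- reshape2::melt(cars,
                          id.vars = c("carname", "gear", "cyl"),
                          measure.vars = c("mpg", "hp"))

# Cast: one row per cyl, one column per measured variable, aggregated by mean
cylMean <- reshape2::dcast(carMelt, cyl ~ variable, mean)
```

Without an aggregation function, dcast() falls back to counting values (length) when several rows share a cell.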
Averaging values
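A sketch using tapply() on the built-in `InsectSprays` data set: apply a function to `count` within each level of `spray`.

```r
# Sum of counts per spray
sprSums <- tapply(InsectSprays$count, InsectSprays$spray, sum)

# Mean of counts per spray
sprMeans <- tapply(InsectSprays$count, InsectSprays$spray, mean)
```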
Another way
spIns = split(InsectSprays$count,InsectSprays$spray)
sprCount = lapply(spIns,sum)
unlist(sprCount)
sapply(spIns,sum)
Another way 2
ddply(InsectSprays,.(spray),summarize,sum=sum(count))
dplyr
Verbs
- select: return a subset of the columns of a data frame
- filter: extract a subset of rows from a data frame based on logical conditions
- arrange: reorder rows of a data frame
- rename: rename variables in a data frame
- mutate: add new variables/columns or transform existing variables
- summarise / summarize: generate summary statistics of different variables in the data frame, possibly within strata
Functions
select()
filter()
arrange()
rename()
mutate()
group_by()
%>%
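A sketch chaining the verbs above with the pipe, assuming the dplyr package is installed. The data set (`mtcars`), the threshold, and the new column names are illustrative choices.

```r
library(dplyr)

result <- mtcars %>%
  select(mpg, cyl, hp, wt) %>%         # keep four columns
  filter(hp > 100) %>%                 # keep rows matching a condition
  rename(weight = wt) %>%              # new_name = old_name
  mutate(hp_per_wt = hp / weight) %>%  # add a derived column
  arrange(desc(mpg)) %>%               # sort by mpg, descending
  group_by(cyl) %>%                    # define strata
  summarise(mean_mpg = mean(mpg), n = n())  # one summary row per group
```

Each verb takes a data frame as its first argument and returns a data frame, which is what lets %>% pass the result of one step into the next.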
Merging data
Merging data - merge()
- Merges data frames
- Important parameters: x, y, by, by.x, by.y, all
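A sketch of merge() on two made-up data frames whose key columns have different names, which is where `by.x` and `by.y` are needed.

```r
# Illustrative data; names and values are assumptions for this sketch
reviews <- data.frame(id = 1:4,
                      solution_id = c(3L, 1L, 2L, 5L),
                      score = c(8, 6, 9, 7))
solutions <- data.frame(id = 1:4,
                        answer = c("a", "b", "c", "d"))

# Inner join (default): only keys present in both data frames
inner <- merge(reviews, solutions, by.x = "solution_id", by.y = "id")

# all = TRUE keeps unmatched rows from both sides and fills with NA
outer <- merge(reviews, solutions, by.x = "solution_id", by.y = "id", all = TRUE)
```

If `by` is omitted, merge() joins on every column name the two data frames share, which here would be `id` and would match the wrong rows.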
Using join in the plyr package
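A sketch of join() and join_all() from plyr (assumed installed). Unlike merge(), join() can only match on columns with the same name in both data frames. The data frames are made up.

```r
# Illustrative data frames sharing the key column "id"
df1 <- data.frame(id = 1:5, x = c(10, 20, 30, 40, 50))
df2 <- data.frame(id = c(2L, 4L, 6L), y = c("b", "d", "f"))
df3 <- data.frame(id = 1:5, z = letters[1:5])

# Left join (default): all rows of df1, NA where df2 has no match
left <- plyr::join(df1, df2, by = "id")

# Inner join: only ids present in both
inner <- plyr::join(df1, df2, by = "id", type = "inner")

# join_all() joins a whole list of data frames in one call
all3 <- plyr::join_all(list(df1, df2, df3), by = "id")
```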