1. How can R reproduce the effect of Python's itertools.combinations?
The R function combn gives a similar result, with one combination per column:
> s <- LETTERS[1:3]
> combn(s,2)
     [,1] [,2] [,3]
[1,] "A"  "A"  "B" 
[2,] "B"  "C"  "C" 
# Transposing the result gives one combination per row
> t(combn(s,2))
     [,1] [,2]
[1,] "A"  "B" 
[2,] "A"  "C" 
[3,] "B"  "C" 
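For comparison, here is a minimal sketch of the Python side, using the same three letters. Each tuple corresponds to one row of t(combn(s,2)) above; the variable names are only illustrative.

```python
from itertools import combinations

# Same input as the R example: the first three uppercase letters
s = ["A", "B", "C"]

# combinations() yields tuples in lexicographic order of the input positions
pairs = list(combinations(s, 2))
print(pairs)  # [('A', 'B'), ('A', 'C'), ('B', 'C')]
```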
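2. How can R turn a string into a variable name, similar to Python's eval() function?
In R, get() looks up an object by a name given as a string, which covers this use case: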
# In this example we iterate over a data frame that holds column names
> library(dplyr)
> s <- LETTERS[1:3]
> fac <- as.data.frame(t(combn(s,2)), stringsAsFactors=FALSE)
> str(fac)
'data.frame': 3 obs. of 2 variables:
$ V1: chr "A" "A" "B"
$ V2: chr "B" "C" "C"
> df <- as.data.frame(matrix(c(c(2,4,6),c(2,6,8),c(3,4,6),c(3,3,3)),nrow=4))
> colnames(df) <- c('A','B','C')
> df
  A B C
1 2 6 6
2 4 8 3
3 6 3 3
4 2 4 3
# Finally, filter step by step so that only rows in which every column is even remain
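> # Loop over the rows of fac (nrow, not ncol): each row holds one pair of column names
> for (n in seq_len(nrow(fac))){
+ for(arg in as.vector(fac[n,])){
+ print(arg)
+ # get(arg) resolves the string to the column of that name
+ df <- dplyr::filter(df, get(arg)%%2==0)
+ print(df)
+ }
+ }
[1] "A"
  A B C
1 2 6 6
2 4 8 3
3 6 3 3
4 2 4 3
[1] "B"
  A B C
1 2 6 6
2 4 8 3
3 2 4 3
[1] "A"
  A B C
1 2 6 6
2 4 8 3
3 2 4 3
[1] "C"
  A B C
1 2 6 6
[1] "B"
  A B C
1 2 6 6
[1] "C"
  A B C
1 2 6 6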
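Without get(), the code above fails with the error Evaluation error: non-numeric argument to binary operator., because arg is then just a character string and cannot be used in arithmetic with 2. In current versions of dplyr, .data[[arg]] is the documented way to refer to a column by a string and can be used in place of get(arg).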
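The same idea can be sketched in plain Python: a column name held in a string has to be resolved to the data before any arithmetic is done on it. The sketch below is an illustration only, not the author's code. It stores the example data frame as a dict of lists (an assumption made to avoid third-party packages) and keeps the rows in which every column is even.

```python
# The example data frame, stored column-wise (illustrative stand-in for df)
columns = {
    "A": [2, 4, 6, 2],
    "B": [6, 8, 3, 4],
    "C": [6, 3, 3, 3],
}

names = ["A", "B", "C"]      # column names as plain strings
keep = list(range(4))        # row indices still under consideration

for name in names:
    # columns[name] plays the role of get(arg): string -> actual data
    values = columns[name]
    keep = [i for i in keep if values[i] % 2 == 0]

# Rebuild the surviving rows as tuples (A, B, C)
rows = [tuple(columns[n][i] for n in names) for i in keep]
print(rows)  # [(2, 6, 6)]
```

Only the first row survives, which matches the final R output above.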
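3. How do I convert a list to a vector?
Use as.vector(unlist(my_list)). unlist() flattens the list into an atomic vector, and as.vector() then drops the element names.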
> df<- data.frame(v1=c(1,2,3),v2=c(2,8,9),v3=c(1,2,5))
> a <- as.vector(unlist(df[1,]))
> a
[1] 1 2 1
> df[a]
  v1 v2 v1.1
1  1  2    1
2  2  8    2
3  3  9    3
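A rough Python counterpart, as a sketch: itertools.chain.from_iterable flattens one level of nesting, whereas R's unlist() flattens recursively, so the two are only equivalent for a list of flat pieces. The dict first_row is a hypothetical stand-in for df[1,], and the column selection at the end mimics df[a] with R's 1-based positions.

```python
from itertools import chain

# Hypothetical stand-in for df[1,]: one single-element list per column
first_row = {"v1": [1], "v2": [2], "v3": [1]}

# Flatten the values into one plain list, discarding the names
a = list(chain.from_iterable(first_row.values()))
print(a)  # [1, 2, 1]

# Use the values as 1-based column positions, as df[a] does in R
col_names = list(first_row)
selected = [col_names[i - 1] for i in a]
print(selected)  # ['v1', 'v2', 'v1']
```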
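4. How do I merge multiple data frames?
In R, use the reduce function from the purrr package, which applies a two-table join repeatedly across the list: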
library(dplyr)
library(purrr)  # provides reduce()
# tibble() replaces the deprecated data_frame()
x <- tibble(i = c("a","b","c"), j = 1:3)
y <- tibble(i = c("b","c","d"), k = 4:6)
z <- tibble(i = c("c","d","a"), l = 7:9)
list(x, y, z) %>% reduce(full_join, by = "i")
# A tibble: 4 x 4
# i j k l
# <chr> <int> <int> <int>
# 1 a 1 NA 9
# 2 b 2 4 NA
# 3 c 3 5 7
# 4 d NA 6 8
list(x, y, z) %>% reduce(inner_join, by = "i")
# A tibble: 1 x 4
# i j k l
# <chr> <int> <int> <int>
# 1 c 3 5 7
In Python, functools.reduce combined with pd.merge achieves the same effect:
import pandas as pd
from functools import reduce
# read_csv uses a comma separator by default
df1 = pd.read_csv('file1.csv')
df2 = pd.read_csv('file2.csv')
df3 = pd.read_csv('file3.csv')
# compile the list of dataframes you want to merge
data_frames = [df1, df2, df3]
df_merged = reduce(lambda left,right: pd.merge(left,right,on=['DATE'], how='outer'), data_frames)
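To make the folding step visible without pandas or input files, here is a minimal standard-library sketch that reuses the x, y, z data from the R example. The helpers full_join and inner_join are hypothetical, written only for this illustration; each table is a dict keyed by the join column i, and a missing value is simply absent rather than NA.

```python
from functools import reduce

# Tables keyed by the join column "i"; inner dicts hold the other columns
x = {"a": {"j": 1}, "b": {"j": 2}, "c": {"j": 3}}
y = {"b": {"k": 4}, "c": {"k": 5}, "d": {"k": 6}}
z = {"c": {"l": 7}, "d": {"l": 8}, "a": {"l": 9}}

def full_join(left, right):
    """Keep every key from either table (hypothetical helper)."""
    keys = list(left) + [k for k in right if k not in left]
    return {k: {**left.get(k, {}), **right.get(k, {})} for k in keys}

def inner_join(left, right):
    """Keep only keys present in both tables (hypothetical helper)."""
    return {k: {**left[k], **right[k]} for k in left if k in right}

# reduce computes join(join(x, y), z), folding left to right
full = reduce(full_join, [x, y, z])
inner = reduce(inner_join, [x, y, z])

print(full)
print(inner)  # {'c': {'j': 3, 'k': 5, 'l': 7}}
```

The full join keeps the four keys a, b, c, d and the inner join keeps only c, in line with the two tibbles printed above.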