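In C++, operations on vectors and matrices usually have to be carried out element by element or by calling dedicated functions. Using C++ expression templates, Rcpp makes it possible to write expressions in C++ that look like R's vector and matrix operations. This feature is called Rcpp sugar.
Many R functions, such as sin, are vectorized, and Rcpp sugar provides the same capability. It also provides vectorized functions such as ifelse and sapply.
For example, adding two vectors can be written directly as x + y instead of looping or iterating over the elements; if x is a NumericVector, sin(x) returns a NumericVector holding the sine of each element of x.
Rcpp sugar simplifies programs and can also improve run-time efficiency.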
Source: http://www.math.pku.edu.cn/teachers/lidf/docs/Rbook/html/_Rbook/rcpp-sugar.html
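To make the expression-template idea concrete, here is a minimal sketch in plain C++ (standard library only). It is not Rcpp's implementation: the names Vec and Sum are hypothetical, and only addition is supported. It is meant to show how x + y + z can be written as a single expression while the element-wise loop runs once, when the result is assigned.

```cpp
#include <cassert>
#include <cstddef>
#include <initializer_list>
#include <vector>

// Lazy node for "lhs + rhs". Nothing is computed until an element
// is requested through operator[].
template <typename L, typename R>
struct Sum {
    const L& lhs;
    const R& rhs;
    std::size_t size() const { return lhs.size(); }
    double operator[](std::size_t i) const { return lhs[i] + rhs[i]; }
};

// Hypothetical stand-in for a numeric vector (NOT Rcpp::NumericVector).
struct Vec {
    std::vector<double> data;

    Vec(std::initializer_list<double> init) : data(init) {}

    // Building a Vec from an expression runs the one and only loop.
    template <typename L, typename R>
    Vec(const Sum<L, R>& expr) : data(expr.size()) {
        for (std::size_t i = 0; i < data.size(); ++i) data[i] = expr[i];
    }

    std::size_t size() const { return data.size(); }
    double operator[](std::size_t i) const { return data[i]; }
};

// Vec + Vec builds a lazy node instead of a new vector.
inline Sum<Vec, Vec> operator+(const Vec& a, const Vec& b) {
    assert(a.size() == b.size());
    return Sum<Vec, Vec>{a, b};
}

// (expression) + Vec nests the nodes, so x + y + z stays lazy.
template <typename L, typename R>
Sum<Sum<L, R>, Vec> operator+(const Sum<L, R>& a, const Vec& b) {
    assert(a.size() == b.size());
    return Sum<Sum<L, R>, Vec>{a, b};
}
```

With these definitions, Vec r = x + y + z; allocates only r and fills it in one pass. The expression nodes hold references, so in this sketch they must be consumed within the same statement that creates them.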
===========================================================
Rcpp provides a lot of syntactic "sugar" to ensure that C++ functions work very similarly to their R equivalents. In fact, Rcpp sugar makes it possible to write efficient C++ code that looks almost identical to its R equivalent.
- arithmetic and logical operators
- logical summary functions
- vector views
- other useful functions
arithmetic and logical operators
+, *, -, /, pow, <, <=, >, >=, ==, !=, !
We can use sugar to considerably simplify a C++ implementation of pdistC(). The R version, pdistR(), and the sugar-based C++ version, pdistC2(), are:
pdistR <- function(x, ys) {
  sqrt((x - ys) ^ 2)
}

#include <Rcpp.h>
using namespace Rcpp;

// [[Rcpp::export]]
NumericVector pdistC2(double x, NumericVector ys) {
  // Sugar applies -, pow() and sqrt() to every element of ys
  return sqrt(pow((x - ys), 2));
}
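For comparison, the one-line body of pdistC2() replaces an explicit loop. The sketch below shows what that loop could look like in plain C++, with std::vector standing in for NumericVector so that it compiles without Rcpp. The name pdist_loop is hypothetical and is not taken from the notes above.

```cpp
#include <cassert>   // only needed for the check in the usage note
#include <cmath>
#include <cstddef>
#include <vector>

// Distance between a scalar x and every element of ys, written with
// an explicit loop instead of vectorized operators.
std::vector<double> pdist_loop(double x, const std::vector<double>& ys) {
    std::vector<double> out(ys.size());
    for (std::size_t i = 0; i < ys.size(); ++i) {
        out[i] = std::sqrt(std::pow(ys[i] - x, 2.0));
    }
    return out;
}
```

A call such as pdist_loop(1.0, {4.0}) gives a one-element vector whose value can be checked with assert(std::fabs(result[0] - 3.0) < 1e-12). The sugar version expresses the same computation without the index bookkeeping.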
logical summary functions
In R, any() and all() test whether a logical vector contains at least one true value and whether all of its elements are true, respectively. Rcpp sugar provides the same any() and all() functions in C++.
The sugar functions any() and all() are fully lazy, so that any(x == 0), for example, might only need to evaluate one element of a vector. They return a special type that can be converted into a bool using .is_true(), .is_false(), or .is_na().
We could also use this sugar to write an efficient function to determine whether or not a numeric vector contains any missing values. To do this in R, we could use any(is.na(x)):
any_naR <- function(x) any(is.na(x))
However, this will do the same amount of work regardless of the location of the missing value. Here's the C++ implementation:
#include <Rcpp.h>
using namespace Rcpp;
// [[Rcpp::export]]
bool any_naC(NumericVector x) {
  // any() stops at the first missing value; is_true() converts the result to bool
  return is_true(any(is_na(x)));
}
> any_naR <- function(x) any(is.na(x))
> library(microbenchmark)
> microbenchmark(
+ any_naR(x0), any_naC(x0),
+ any_naR(x1), any_naC(x1),
+ any_naR(x2), any_naC(x2)
+ )
Unit: microseconds
        expr     min      lq      mean   median       uq       max neval
 any_naR(x0) 382.863 467.627 678.43969 485.2620 559.2180 11793.051   100
 any_naC(x0) 319.716 325.974 416.80244 334.7915 372.6225  3362.698   100
 any_naR(x1) 384.570 467.911 725.74845 486.1165 542.7200 11709.424   100
 any_naC(x1) 320.854 325.974 370.31853 329.9560 352.9965  1200.355   100
 any_naR(x2) 249.743 329.956 387.52749 351.5735 389.1205  1426.772   100
 any_naC(x2)   1.707   2.846  28.16085   4.5520   7.3965  1530.879   100
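The C++ version is faster in every case. The difference is largest for x2 (a median of about 4.6 microseconds for any_naC(x2) versus about 350 for any_naR(x2)), which is consistent with a missing value near the start of x2 letting the lazy any() stop early. The definitions of x0, x1, and x2 are not shown in these notes.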
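The very small timing for any_naC(x2) in the benchmark above is what short-circuit evaluation would predict. The following plain C++ sketch illustrates the idea without Rcpp. The function any_nan_lazy and its optional counter are hypothetical, and std::isnan stands in for is_na(), which is an assumption made only for this illustration.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Returns true as soon as a NaN is found. If `inspected` is not null,
// it receives the number of elements that were actually examined.
bool any_nan_lazy(const std::vector<double>& x,
                  std::size_t* inspected = nullptr) {
    std::size_t n = 0;
    bool found = false;
    for (double v : x) {
        ++n;
        if (std::isnan(v)) {
            found = true;
            break;  // stop early: the remaining elements are never touched
        }
    }
    if (inspected != nullptr) *inspected = n;
    return found;
}
```

With a NaN in the first position the loop examines a single element, whereas a vector with no NaN, or with a NaN only at the end, is scanned in full. An eager version that first builds the whole logical vector always pays for every element.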
vector views
A number of sugar functions provide a "view" of a vector:
head(), tail(), rep_each(), rep_len(), rev(), seq_along(), and seq_len()
In R these would all produce copies of the vector, but in Rcpp they simply point to the existing vector and override the subsetting operator ([) to implement special behaviour. This makes them very efficient: for instance, rep_len(x, 1e6) does not have to make a million copies of x.
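As a rough illustration of how a view can avoid copying, the sketch below wraps an existing std::vector and overrides operator[] so that indexing behaves like a recycled vector of the requested length. The class RepLenView is hypothetical and written in plain C++. It is not Rcpp's actual rep_len() implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// A read-only view that behaves like `source` recycled to `length`
// elements, without allocating storage for them.
class RepLenView {
public:
    RepLenView(const std::vector<double>& source, std::size_t length)
        : source_(source), length_(length) {
        // An empty source can only back an empty view.
        assert(!source.empty() || length == 0);
    }

    std::size_t size() const { return length_; }

    // Element i of the view is element (i mod n) of the source.
    double operator[](std::size_t i) const {
        return source_[i % source_.size()];
    }

private:
    const std::vector<double>& source_;  // not owned: the source must outlive the view
    std::size_t length_;
};
```

A view created as RepLenView v(x, 1000000) stores only a reference and a length, so its memory cost does not depend on the requested length. The trade-off in this sketch is a modulo operation on every access and the requirement that x outlives the view.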
other useful functions
- Math functions
- Scalar summaries
- Vector summaries
- Finding values
- Dealing with duplicates
- d/q/p/r for all standard distributions