Rcpp function to be SLOWER than same R function

I have been writing an R function to compute an integral with respect to certain distributions; see the code below.
EVofPsi = function(psi, probabilityMeasure, eps=0.01, ...){

distFun = function(u){
 probabilityMeasure(u, ...)
}
xx = yy = seq(0,1,length=1/eps+1)
summand=0

for(i in 1:(length(xx)-1)){
  for(j in 1:(length(yy)-1)){
    signPlus = distFun(c(xx[i+1],yy[j+1]))+distFun(c(xx[i],yy[j]))
    signMinus = distFun(c(xx[i+1],yy[j]))+distFun(c(xx[i],yy[j+1]))
    summand = c(summand, psi(c(xx[i],yy[j]))*(signPlus-signMinus))
  }
}
sum(summand)
}
 
It works fine, but it is pretty slow. One often hears that rewriting a function in a compiled language such as C++ speeds it up, especially when the R code involves a double loop, as it does above. So that is what I did, using Rcpp:
#include <Rcpp.h>
using namespace Rcpp;

// [[Rcpp::export]]
double EVofPsiCPP(Function distFun, Function psi, int n, double eps) {

  NumericVector xx(n+1);
  NumericVector yy(n+1);
  xx[0] = 0;
  yy[0] = 0;

  // discretize [0,1]^2
  for(int i = 1; i < n+1; i++) {
      xx[i] = xx[i-1] + eps;
      yy[i] = yy[i-1] + eps;
  }

  Function psiCPP(psi);
  Function distFunCPP(distFun);
  double signPlus;
  double signMinus;
  double summand = 0;

  NumericVector topRight(2); 
  NumericVector bottomLeft(2);
  NumericVector bottomRight(2);
  NumericVector topLeft(2);

  // compute the integral
  for(int i=0; i<n; i++){
    //printf("i:%d \n",i);
    for(int j=0; j<n; j++){
      //printf("j:%d \n",j);
      topRight[0] = xx[i+1];
      topRight[1] = yy[j+1];
      bottomLeft[0] = xx[i];
      bottomLeft[1] = yy[j];
      bottomRight[0] = xx[i+1];
      bottomRight[1] = yy[j];
      topLeft[0] = xx[i];
      topLeft[1] = yy[j+1];
      signPlus = NumericVector(distFunCPP(topRight))[0] +  NumericVector(distFunCPP(bottomLeft))[0];
      signMinus = NumericVector(distFunCPP(bottomRight))[0] + NumericVector(distFunCPP(topLeft))[0];
      summand = summand + NumericVector(psiCPP(bottomLeft))[0]*(signPlus-signMinus);
      //printf("summand:%f \n",summand);
    }
  }
  return summand;
}
 
I'm pretty happy since this C++ function works fine. However, when I tested both functions, the C++ one ran SLOWER: 
library(Rcpp)  # provides sourceCpp()
sourceCpp("EVofPsiCPP.cpp")
pFGM = function(u,theta){
  u[1]*u[2] + theta*u[1]*u[2]*(1-u[1])*(1-u[2])
}
psi = function(u){
  u[1]*u[2]
}
print(system.time(
for(i in 1:10){
  test = EVofPsi(psi, pFGM, 1/100, 0.2)  
}
))
test

print(system.time(
  for(i in 1:10){
    test = EVofPsiCPP(function(u){pFGM(u,0.2)}, psi, 100, 1/100)  # distFun first, then psi, matching the C++ signature
  }
))
 
So, is there a kind expert around willing to explain this to me? Did I code like a monkey, and is there a way to speed up that function? I also have a second question: I could have replaced the output type double with SEXP, and the argument types Function with SEXP as well, and it does not seem to change anything. So what is the difference?
Thank you very much in advance, Gildas 

Source: https://stackoverflow.com/questions/17958168

Updated: 2024-01-31 15:01

Best answer

Respectfully - both Adrian K's and Dima's answers are incorrect. The right answer is to use Event Tracing for Windows (ETW). This is what we use for all logging in Windows. It's extremely robust and performs very well. For example, Windows 7 logs an ETW event on many OS events - all the time - including processor context switches. Ever used the performance monitor in Windows 7? It is consuming ETW events from the kernel.
I recommend you do all your logging with ETW. Why? Several reasons:

It's ubiquitous.
You can enable or disable logging in your running process. No process restarts are required. (Yes, other loggers do this, but some do not.)
It's designed to be included in shipping code.
Logging an event is guaranteed to be non-blocking: it will not cause a 'wait'.
We provide lots of tools for ETW trace processing, most notably the XPERF tools.

A big benefit of instrumenting your performance paths with ETW events is that your events can be viewed alongside the kernel events using the XPERF tools.
It's also pretty easy to write a 'watch' application that watches ETW events from your components. I have one for one of our components that simply displays the events on the console.
I highly recommend not trying to write your own high-performance logging system. This is challenging to do well, both in terms of performance and reliability. The Windows ETW system is super-robust and performs very well.
