Integrating Fortran, C++, and R

2022-01-14 00:00:00 fortran r c++ rcpp armadillo

My task is to rewrite an R function in C++ to speed up its while loops. All of the R code has been rewritten with the help of Rcpp and Armadillo, except for the .Fortran() call. I first tried RInside, and it works, but very slowly, as Dirk indicated. (It is expensive for data to go through R -> C++ -> R -> Fortran.)

Since I don't want to rewrite the Fortran code in C++ (or vice versa), it seems natural to speed up the program by linking C++ directly to Fortran: R -> C++ -> Fortran.

// [[Rcpp::depends(RcppArmadillo)]]

#include <RcppArmadillo.h>
using namespace Rcpp;

extern "C"{
   List f_(int *n,NumericMatrix a, NumericVector c, double* eps);
}

The problem is that I can integrate C++ with Fortran, and I can integrate R with C++, but I can't make all three work together!

I tried to compile the C++ file on Linux, but the compiler can't find RcppArmadillo.h or the Rcpp namespace:

 error: RcppArmadillo.h: No such file or directory
 error: 'Rcpp' is not a namespace-name

When I call sourceCpp("test.cpp") directly in R, the console displays:

test.o:test.cpp:(.text+0x20b2): undefined reference to `f_'
collect2: ld returned 1 exit status
Error in sourceCpp("test.cpp") : Error occurred building shared library.

I also tried to combine all of this in a package, created with

RcppArmadillo::RcppArmadillo.package.skeleton("TTTest")

But I don't know how to proceed with the TTTest package (I believe it cannot be installed) after I add the .cpp and .f files to /src and run compileAttributes.

So, is it possible to do what I have in mind with Rcpp? Or is it necessary to convert the Fortran code to C/C++?

Thanks for your help.

Recommended answer

For projects like this, I would suggest rolling your code into a package. I created a simple example of such a package, called mixedlang, which is available at this GitHub repo. I will describe the process of creating the package here.

The steps I took were as follows:

  1. Set up the package structure from R with RcppArmadillo::RcppArmadillo.package.skeleton("mixedlang") (I used RcppArmadillo rather than plain Rcpp only because the OP did; nothing in this example is specific to Armadillo)
  2. Added the C++ and Fortran code files described below to the src/ folder
  3. In R, ran Rcpp::compileAttributes("mixedlang/") and then devtools::install("mixedlang/")

The Code

I created a simple C++ function whose only purpose (essentially) is to call a Fortran function. The example function takes in a numeric vector, multiplies each element by its zero-based index, and returns the result. First, let's look at the Fortran code:

This function just takes in two doubles, multiplies them, and returns the result:

      REAL*8 FUNCTION MULTIPLY (X, Y)
      REAL*8 X, Y
      MULTIPLY = X * Y
      RETURN
      END

test_function.cpp

Now we need to call this Fortran code from our C++ code. When doing this, we need to take a few things into account:

  1. Fortran arguments are passed by reference, not by value.
  2. Since MULTIPLY is defined in another file, we need to declare it in our C++ file so the compiler knows the argument and return types.

     a. When declaring the Fortran function in our C++ file, we write the function name in lower case and append an underscore, since that is what the Fortran compiler should do to the symbol name by default.

     b. We have to declare the function within an extern "C" linkage specification. C++ compilers normally cannot use a function's name as its unique identifier, because the language allows overloading, so they mangle the name. To call a Fortran function we need the plain name to be used as the symbol, which is exactly what the extern "C" linkage specification accomplishes (see, for example, this SO answer).

#include "RcppArmadillo.h"

// [[Rcpp::depends(RcppArmadillo)]]

// First we'll declare the MULTIPLY Fortran function
// as multiply_ in an extern "C" linkage specification
// making sure to have the arguments passed as pointers.
extern "C" {
    double multiply_(double *x, double *y);
}

// Now our C++ function
// [[Rcpp::export]]
Rcpp::NumericVector test_function(Rcpp::NumericVector x) {
    // Get the size of the vector
    int n = x.size();
    // Create a new vector for our result
    Rcpp::NumericVector result(n);
    for ( int i = 0; i < n; ++i ) {
        // And for each element of the vector,
        // store as doubles the element and the index
        double starting_value = x[i], multiplier = (double)i;
        // Now we can call the Fortran function,
        // being sure to pass the address of the variables
        result[i] = multiply_(&starting_value, &multiplier);
    }
    return result;
}
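
The example above passes only scalars, whereas the routine in the question takes a matrix and a vector. Below is a minimal sketch of how array arguments might be handled, written without Rcpp so that it compiles on its own. COLSUMS, colsums_, and column_sums are hypothetical names that do not appear in the original answer, and the C++ body of colsums_ is only a stand-in for what would be Fortran code in a real package. The sketch assumes the same lower-case-plus-underscore naming as multiply_ and relies on Fortran storing matrices column by column.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical Fortran routine, assumed to live in its own .f file in src/:
//       SUBROUTINE COLSUMS (NROW, NCOL, A, TOTALS)
//       INTEGER NROW, NCOL
//       REAL*8 A(NROW, NCOL), TOTALS(NCOL)
// It is declared like multiply_: lower case, trailing underscore,
// and every argument (scalars included) passed as a pointer.
extern "C" {
    void colsums_(int *nrow, int *ncol, double *a, double *totals);
}

// Stand-in definition so this sketch links without a Fortran compiler.
// In a real package this body would be the Fortran code, not C++.
extern "C" void colsums_(int *nrow, int *ncol, double *a, double *totals) {
    for (int j = 0; j < *ncol; ++j) {
        totals[j] = 0.0;
        for (int i = 0; i < *nrow; ++i) {
            // Column-major layout: element (i, j) sits at i + j * nrow
            totals[j] += a[i + j * (*nrow)];
        }
    }
}

// C++ wrapper. Fortran arrays carry no size information, so the
// dimensions travel as separate arguments next to the data pointers.
// The matrix is taken by value because the callee gets a non-const pointer.
std::vector<double> column_sums(std::vector<double> a, int nrow, int ncol) {
    assert(nrow >= 0 && ncol >= 0);
    assert(a.size() == static_cast<std::size_t>(nrow) * static_cast<std::size_t>(ncol));
    std::vector<double> totals(ncol);
    colsums_(&nrow, &ncol, a.data(), totals.data());
    return totals;
}
```

Calling column_sums with the six values 1 to 6 laid out as a 2 x 3 column-major matrix sums the columns (1, 2), (3, 4), and (5, 6). R matrices are also stored column-major, so the data behind an Rcpp::NumericMatrix could presumably be handed to Fortran the same way (for example through its begin() pointer) without transposing.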

Example output

After installing the package, I ran the following as an example:

mixedlang::test_function(0:9)
# [1]  0  1  4  9 16 25 36 49 64 81

Possible sources of the original poster's problems

  1. When they first attempted to compile, they did not tell the compiler where to find RcppArmadillo.h.
  2. Trying to do this with sourceCpp is just asking for trouble; it wasn't really made to handle multiple files (see, for example, this answer by Dirk Eddelbuettel), and multiple files are unavoidable when dealing with multiple languages.

I'm not sure what went wrong when they tried to roll it into a package, which is why I drew up this example.
