
Calling C code from R

an introduction

Sigal Blay
Dept. of Statistics and Actuarial Science
Simon Fraser University
October 2004

Motivation:
Speed
Efficient memory management
Using existing C libraries

The following functions provide a standard interface to compiled code
that has been linked into R:

.C
.Call
.External

We will explore using .C and .Call with 7 code examples:

Using .C
I. Calling C with an integer vector
II. Calling C with different vector types

Using .Call
III. Sending R integer vectors to C
IV. Sending R character vectors to C
V. Getting an integer vector from C
VI. Getting a character vector from C
VII. Getting a list from C

And lastly, tips on creating an R package with compiled code

I. Calling C with an integer vector using .C

/* useC1.c */

void useC(int *i) {
    i[0] = 11;
}

The C function should be of type void.
The compiled code should not return anything except through its arguments.
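
As a minimal sketch of this convention (the function name vecSum and its arguments are hypothetical, not taken from the slides), a routine meant for .C takes only pointers and hands its result back by writing into one of them:

```c
/* Hypothetical example: sum the n[0] elements of x into result[0].
   Every argument is a pointer because .C passes R vectors by address. */
void vecSum(double *x, int *n, double *result) {
    double total = 0.0;
    for (int k = 0; k < n[0]; k++) {
        total += x[k];
    }
    result[0] = total;   /* the "return value" travels back through an argument */
}
```

After R CMD SHLIB and dyn.load(), such a routine could presumably be called from R as .C("vecSum", as.double(x), as.integer(length(x)), result = double(1))$result.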

To compile the C code, type at the command prompt:
R CMD SHLIB useC1.c
The compiled code file name is useC1.so

In R:
> dyn.load("useC1.so")
> a <- 1:10    # integer vector
> a
[1] 1 2 3 4 5 6 7 8 9 10
> out <- .C("useC", b = as.integer(a))
> a
[1] 1 2 3 4 5 6 7 8 9 10
> out$b
[1] 11 2 3 4 5 6 7 8 9 10

You have to allocate memory for the vectors passed to .C in R, by creating vectors of the right length.
The first argument to .C is a character string giving the C function name.
The rest of the arguments are R objects to be passed to the C function.
All arguments should be coerced to the correct R storage mode to prevent type mismatches that can lead to errors.
.C returns a list object.
The second .C argument is given the name b. This name is used for the respective component in the returned list object (but is not passed to the compiled code).
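
To illustrate why the output vector must already have the right length on the R side, here is a small sketch (runningSum is a hypothetical name, not from the slides). The C code only fills the buffer it is given and never allocates it:

```c
/* Hypothetical example: write the running total of x into out.
   out must already hold n[0] elements; the C code never allocates it. */
void runningSum(double *x, int *n, double *out) {
    double total = 0.0;
    for (int k = 0; k < n[0]; k++) {
        total += x[k];
        out[k] = total;
    }
}
```

From R, the matching call would look something like .C("runningSum", as.double(x), as.integer(length(x)), out = double(length(x)))$out, where double(length(x)) creates the output vector and the as.double/as.integer coercions fix the storage modes.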

II. Calling C with different vector types using .C

/* useC2.c */

void useC(int *i, double *d, char **c, int *l) {
    i[0] = 11;
    d[0] = 2.333;
    c[1] = "g";
    l[0] = 0;
}

To compile the C code, type at the command prompt:
R CMD SHLIB useC2.c
to get useC2.so
To compile more than one C file:
R CMD SHLIB file1.c file2.c file3.c
to get file1.so

In R:
> dyn.load("useC2.so")
> i <- 1:10                         # integer vector
> d <- seq(length=3,from=1,to=2)    # real number vector
> c <- c("a", "b", "c")             # string vector
> l <- c("TRUE", "FALSE")           # strings, coerced to logical below
> i
[1] 1 2 3 4 5 6 7 8 9 10
> d
[1] 1.0 1.5 2.0
> c
[1] "a" "b" "c"
> l
[1] "TRUE" "FALSE"
> out <- .C("useC",
     i1 = as.integer(i),
     d1 = as.numeric(d),
     c1 = as.character(c),
     l1 = as.logical(l))
> out
$i1
[1] 11 2 3 4 5 6 7 8 9 10
$d1
[1] 2.333 1.500 2.000
$c1
[1] "a" "g" "c"
$l1
[1] FALSE FALSE

Other R objects can be passed to .C, but it is better to use one of the other interfaces.
With .C, the R objects are copied before being passed to the C code, and copied again to an R list object when the compiled code returns.
Neither .Call nor .External copies its arguments.
You should treat arguments you receive through these interfaces as read-only.
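
One way to honor the read-only rule is to write results into a separate buffer instead of into the input. The plain C sketch below shows only that pattern. The name scaleInto is hypothetical, and in real .Call code the output would be an R vector allocated in C, not a caller-supplied array.

```c
#include <stddef.h>

/* Hypothetical helper: out[k] = factor * in[k].
   The const qualifier lets the compiler reject accidental writes to in. */
void scaleInto(const double *in, size_t n, double factor, double *out) {
    for (size_t k = 0; k < n; k++) {
        out[k] = factor * in[k];
    }
}
```

Because the input is never written to, the caller's data is unchanged after the call, which is the behaviour an R user expects of a function argument.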

Advantages to using .Call() instead of .C()
(Posted by Prof Brian Ripley on R-help, Jun 2004)
1) A lot less copying.
2) The ability to dimension the answer in the C code.
3) Access to other types, e.g. expressions, raw type and
the ability to easily execute R code (call_R is a pain).
4) Access to the attributes of the vectors, for example the names.
5) The ability to handle missing values easily.

III. Sending R integer vectors to C using .Call

/* useCall1.c */

#include <R.h>
#include <Rdefines.h>

SEXP getInt(SEXP myint, SEXP myintVar) {
    int Imyint, n;   // declare integer variables
    int *Pmyint;     // pointer to an integer vector

    PROTECT(myint = AS_INTEGER(myint));

    Imyint = INTEGER_POINTER(myint)[0];
    Pmyint = INTEGER_POINTER(myint);
    n = INTEGER_VALUE(myintVar);

    printf("Printed from C: \n");
    printf("Imyint: %d \n", Imyint);
    printf("n: %d \n", n);
    printf("Pmyint[0], Pmyint[1]: %d %d \n",
           Pmyint[0], Pmyint[1]);

    UNPROTECT(1);
    return(R_NilValue);
}

Rdefines.h is somewhat higher level than Rinternals.h, and is preferred if the code might be shared with S at any stage.
SEXP stands for Simple EXPression.
myint is of type SEXP, which is a general type, hence coercion to the right type is needed.
R objects created in the C code have to be reported using the PROTECT macro on a pointer to the object. This tells R that the object is in use so it is not destroyed.

The protection mechanism is stack-based, so UNPROTECT(n) unprotects the last n objects which were protected.
The calls to PROTECT and UNPROTECT must balance when the user's code returns.
To work with real numbers, replace int with double and INTEGER with NUMERIC.
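
The stack discipline behind PROTECT and UNPROTECT can be pictured with a toy model. The sketch below is not R's implementation and uses no R headers. All toy_* names are hypothetical, and it only shows what "the last n objects" and "must balance" mean for a stack.

```c
#define TOY_STACK_SIZE 16

static const void *toy_stack[TOY_STACK_SIZE];
static int toy_top = 0;   /* number of pointers currently protected */

/* Push a pointer onto the toy stack. A push onto a full stack is
   silently dropped here; that is a simplification of the toy model. */
void toy_protect(const void *obj) {
    if (toy_top < TOY_STACK_SIZE) {
        toy_stack[toy_top++] = obj;
    }
}

/* Pop the n most recently pushed pointers, mirroring UNPROTECT(n). */
void toy_unprotect(int n) {
    toy_top = (n > toy_top) ? 0 : toy_top - n;
}

/* Current stack depth. */
int toy_depth(void) {
    return toy_top;
}

/* A balanced routine: every toy_protect is matched before returning,
   so the depth on exit equals the depth on entry. */
int toy_is_balanced(void) {
    int a = 1, b = 2;
    int depth_on_entry = toy_depth();
    toy_protect(&a);
    toy_protect(&b);
    toy_unprotect(2);
    return toy_depth() == depth_on_entry;
}
```

A routine that returned after the two pushes without the matching toy_unprotect(2) would leave the depth two higher than on entry, which is the imbalance the rule above forbids.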

In R:
> dyn.load("useCall1.so")
> myint<- c(1,2,3)
> out<- .Call("getInt", myint, 5)
Printed from C:
Imyint: 1
n: 5
Pmyint[0], Pmyint[1]: 1 2
> out
NULL

IV. Reading an R character vector from C using .Call

/* useCall2.c */

#include <string.h>   // strlen, strcpy
#include <R.h>
#include <Rdefines.h>

SEXP getChar(SEXP mychar) {
    char *Pmychar[5];   // array of 5 pointers to character strings

    PROTECT(mychar = AS_CHARACTER(mychar));

    // allocate memory (+1 for the terminating '\0'):
    Pmychar[0] = R_alloc(strlen(CHAR(STRING_ELT(mychar, 0))) + 1,
                         sizeof(char));
    Pmychar[1] = R_alloc(strlen(CHAR(STRING_ELT(mychar, 1))) + 1,
                         sizeof(char));

    // ... and copy mychar to Pmychar:
    strcpy(Pmychar[0], CHAR(STRING_ELT(mychar, 0)));
    strcpy(Pmychar[1], CHAR(STRING_ELT(mychar, 1)));

    printf("Printed from C: ");
    printf("%s %s \n", Pmychar[0], Pmychar[1]);

    UNPROTECT(1);
    return(R_NilValue);
}

In R:
> dyn.load("useCall2.so")
> mychar <- c("do","re","mi", "fa", "so")
> out <- .Call("getChar", mychar)
Printed from C: do re
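
When copying a string into freshly allocated memory, the buffer needs one byte more than strlen() reports, because strlen() does not count the terminating '\0'. The sketch below shows the sizing in plain C. The helper copyString is hypothetical, and it uses malloc instead of R_alloc only so that it compiles without the R headers.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return a freshly allocated copy of src,
   or NULL if allocation fails. The caller must free() the result. */
char *copyString(const char *src) {
    size_t len = strlen(src);        /* excludes the terminating '\0' */
    char *dst = malloc(len + 1);     /* +1 leaves room for it */
    if (dst != NULL) {
        memcpy(dst, src, len + 1);   /* copies the characters and the '\0' */
    }
    return dst;
}
```

With R_alloc the same sizing applies, but no free() is needed, since R reclaims that memory when the call to .C or .Call returns.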

V. Getting an integer vector from C using .Call

/* useCall3.c */

#include <R.h>
#include <Rdefines.h>

SEXP setInt() {
    SEXP myint;
    int *p_myint;
    int len = 5;
    int i;

    // Allocating storage space:
    PROTECT(myint = NEW_INTEGER(len));
    p_myint = INTEGER_POINTER(myint);

    // NEW_INTEGER does not initialize its elements, so zero them first:
    for (i = 0; i < len; i++)
        p_myint[i] = 0;
    p_myint[0] = 7;

    UNPROTECT(1);
    return myint;
}
// to work with real numbers, replace
// int with double and INTEGER with NUMERIC

In R:
> dyn.load("useCall3.so")
> out<- .Call("setInt")
> out
[1] 7 0 0 0 0

VI. Getting a character vector from C using .Call

/* useCall4.c */

#include <R.h>
#include <Rdefines.h>

SEXP setChar() {
    SEXP mychar;
    PROTECT(mychar = allocVector(STRSXP, 5));
    SET_STRING_ELT(mychar, 0, mkChar("A"));
    UNPROTECT(1);
    return mychar;
}

In R:
> dyn.load("useCall4.so")
> out <- .Call("setChar")
> out
[1] "A" "" "" "" ""

VII. Getting a list from C using .Call

/* useCall5.c */

#include <R.h>
#include <Rdefines.h>

SEXP setList() {
    int *p_myint, i;
    double *p_double;
    SEXP mydouble, myint, list, list_names;
    const char *names[2] = {"integer", "numeric"};

    // creating an integer vector:
    PROTECT(myint = NEW_INTEGER(5));
    p_myint = INTEGER_POINTER(myint);
    // ... and a vector of real numbers:
    PROTECT(mydouble = NEW_NUMERIC(5));
    p_double = NUMERIC_POINTER(mydouble);
    for(i = 0; i < 5; i++) {
        p_double[i] = 1/(double)(i + 1);
        p_myint[i] = i + 1;
    }

    // Creating a character string vector
    // of the "names" attribute of the
    // objects in our list:
    PROTECT(list_names = allocVector(STRSXP, 2));
    for(i = 0; i < 2; i++)
        SET_STRING_ELT(list_names, i, mkChar(names[i]));

    // Creating a list with 2 vector elements:
    PROTECT(list = allocVector(VECSXP, 2));
    // attaching myint vector to list:
    SET_VECTOR_ELT(list, 0, myint);
    // attaching mydouble vector to list:
    SET_VECTOR_ELT(list, 1, mydouble);
    // and attaching the vector names:
    setAttrib(list, R_NamesSymbol, list_names);
    UNPROTECT(4);
    return list;
}

SET_VECTOR_ELT stands for Set Vector Element

In R:
> dyn.load("useCall5.so")
> out <- .Call("setList")
> out
$integer
[1] 1 2 3 4 5
$numeric
[1] 1.00000 0.50000 0.33333 0.25000 0.20000

If you are developing an R package:
Copy useC.c to myPackage/src/
The user of the package should not have to manually load the compiled C code with dyn.load(), so:
add a zzz.R file to myPackage/R
zzz.R should contain the following code:
.First.lib <- function(lib, pkg)
{
    library.dynam("myPackage", pkg, lib)
}
(In R 3.0.0 and later, .First.lib is no longer run; declare useDynLib(myPackage) in the package NAMESPACE file instead.)

If you are developing an R package (cont.):
Modify the .C call: after the argument list to the C function, add PACKAGE="compiled_file".
For example, if your compiled C code file name is useC1.so, type:
.C("useC", b = as.integer(a), PACKAGE="useC1")
If you are using a Makefile, look at the output from R CMD SHLIB myfile.c for flags that you may need to incorporate in the Makefile.

Even if your R package perfectly passes 'R CMD check':
Try to compile your C code with 'gcc -pedantic -Wall' (the only warnings left should be ones you have a reason not to eliminate).
Check the R code with 'R CMD check --use-gct' (it uses 'gctorture(TRUE)' when running examples/tests, and it's slow).
If you don't, CRAN will do that for you and will send you back to the drawing board.

This work has been made possible by the Statistical Genetics Working Group at the Department of Statistics and Actuarial Science, SFU.
