The RCurl Package

RCurl_1.95-4.tar.gz (05 March 2013)

Manual

The RCurl package is an R interface to the libcurl library, which provides HTTP facilities. It allows us to download files from Web servers, post forms, use HTTPS (secure HTTP), use persistent connections, upload files, handle binary content, follow redirects, authenticate with passwords, etc.

The primary top-level entry points are

However, the C-level routines are also accessible from R code, and one can specify options for all of the libcurl operations to control how they are performed. Documentation about the options and commands can be found at the libcurl web site.
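As a rough sketch of how the top-level functions and libcurl options fit together, the snippet below assumes the entry points in question are RCurl's getURL(), getForm() and postForm(), and passes libcurl options by name through the .opts argument. The helper names (fetch_page, search_form, submit_form), the form field names, and the URL in the usage comment are hypothetical placeholders, not part of the package.

```r
# Sketch only: helper names, form fields, and URLs are illustrative assumptions.
# The helpers need the RCurl package when called; defining them does not.

# libcurl options are given by name (see the libcurl documentation for details).
common_opts <- list(
  followlocation = TRUE,           # follow HTTP redirects
  timeout        = 30,             # give up after 30 seconds
  useragent      = "RCurl sketch"  # value sent in the User-Agent header
)

# Download a page and return it as a character string.
fetch_page <- function(url, ...) {
  RCurl::getURL(url, ..., .opts = common_opts)
}

# Submit a form via GET: the fields are appended to the URL as a query string.
search_form <- function(url, search) {
  RCurl::getForm(url, search = search, .opts = common_opts)
}

# Submit a form via POST: the fields travel in the request body.
submit_form <- function(url, name, comment) {
  RCurl::postForm(url, name = name, comment = comment, .opts = common_opts)
}

# Hypothetical usage (requires network access):
# page <- fetch_page("https://www.r-project.org/")
```

Keeping the options in one list makes it easy to reuse the same settings across requests; individual calls can still add or override options as named arguments.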

R functions can be specified to collect text from both the response and its headers. This can be used to customize the processing of the requests and feed the results to higher-level processing (e.g. HTML parsing via the htmlTreeParse function in the XML package).
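The idea of collecting the response and its headers with R functions can be sketched as follows. This is a minimal illustration, assuming the callbacks are supplied through the writefunction and headerfunction options of curlPerform(); make_gatherer and fetch_parts are hypothetical helper names, and the URL in the usage comment is a placeholder. RCurl also provides ready-made collectors such as basicTextGatherer() and basicHeaderGatherer().

```r
# Sketch only: make_gatherer, fetch_parts, and the URL are illustrative assumptions.

# A closure that accumulates the chunks of text handed to it.
make_gatherer <- function() {
  chunks <- character()
  list(
    update = function(txt) {
      chunks <<- c(chunks, txt)
      nchar(txt, type = "bytes")  # report the number of bytes consumed
    },
    value = function() paste(chunks, collapse = "")
  )
}

# Perform one request, collecting header and body text separately.
# Needs the RCurl package and network access when called.
fetch_parts <- function(url) {
  body   <- make_gatherer()
  header <- make_gatherer()
  RCurl::curlPerform(url = url,
                     writefunction  = body$update,
                     headerfunction = header$update)
  list(header = header$value(), body = body$value())
}

# Hypothetical usage, feeding the body to the XML package's HTML parser:
# parts <- fetch_parts("https://www.r-project.org/")
# doc   <- XML::htmlTreeParse(parts$body, asText = TRUE)
```

Because the collectors are ordinary closures, they can be swapped for functions that filter, parse, or stream the text as it arrives rather than merely storing it.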

This package will be used to implement the low-level communication in the SSOAP package and other high-level packages that utilize HTTP to exchange requests and data.

Documentation

  • Paper outlining the package with some advanced examples.
  • Guide
  • Changes across releases
  • Examples of using asynchronous, multiple concurrent requests.
  • FAQ

Other Approaches

    httpRequest
    httpRequest is a package on CRAN that implements a small part of HTTP directly in R using sockets.
    httpClient
    I have developed the httpClient package using R code and connections. It supports additional aspects of R and HTTP, such as cookies, character escaping, and SSL for HTTPS. I haven't released the code (favoring the approach of building on existing C code) but can make it available if anyone is interested.
    While having code in R makes it easier to understand, explore and modify, it is probably better to use existing specialized libraries like libcurl than to do this ourselves. We gain speed and a large development community that cares about getting things right and testing them. We will explore the use of libwww.

Issues

Because libcurl uses opaque data structures, we cannot easily access the file descriptors used in the communication. This makes it somewhat more difficult to integrate these streams into an R event loop (e.g. REventLoop). We can potentially turn them into regular connections (if the internal API is made "public").

License

This is distributed under the BSD license in the same spirit as libcurl itself.

Duncan Temple Lang <duncan@wald.ucdavis.edu>
Last modified: Mon May 25 11:35:38 PDT 2009