An XML package for the S language
Last Release:
2.6-0 (Mon Aug 3 13:50:47 PDT 2009)
Note: Version 2.4-0 introduced a new approach to garbage collecting
internal (C-level) nodes and documents returned from, e.g.,
xmlParse(), getNodeSet(), xpathApply() and newXMLNode().
It endeavors to avoid freeing a document while an R variable
still refers to one of its nodes, and to garbage collect
a document once all of its nodes are unreferenced.
This has been tested and appears to work, but there may be
cases that we have not encountered. If you run into problems,
please send me email.
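The behaviour described in this note can be sketched as follows. This is an illustration only, not a test from the package: the tiny document and the variable names are made up, and it assumes a version of the package (2.4-0 or later) in which the node keeps its document alive.

```r
library(XML)

# Hypothetical two-node document, invented for this sketch.
doc <- xmlParse("<a><b>kept</b></a>", asText = TRUE)

# Hold on to a single internal node from the document.
node <- getNodeSet(doc, "//b")[[1]]

# Drop the only R variable that refers to the document itself,
# then ask R to run the garbage collector.
rm(doc)
invisible(gc())

# The node should still be usable, because an R variable refers to it.
val <- xmlValue(node)
```

If accessing `node` after `rm(doc)` fails or crashes in your setup, that is the kind of case the note asks to have reported.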
This package provides facilities for the S language to
- parse XML files, URLs and strings,
using either the DOM (Document Object Model)/tree-based
approach, or the event-driven SAX (Simple API for XML)
mechanism;
- parse HTML documents;
- perform XPath queries on a document;
- generate XML content to buffers, files, URLs,
and internal XML trees;
- read DTDs as S objects.
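Several of the facilities listed above can be sketched in a few lines of R. The sample document and all variable names below are made up for illustration; the functions used (xmlParse(), xpathSApply(), xmlEventParse(), htmlParse(), newXMLNode(), saveXML()) are from the package, but this is only a minimal sketch and not a substitute for the documentation.

```r
library(XML)

# Hypothetical sample document, invented for this sketch.
txt <- paste0("<books>",
              "<book id='1'><title>R</title></book>",
              "<book id='2'><title>S</title></book>",
              "</books>")

# DOM/tree-based parsing of a string into an internal tree.
doc <- xmlParse(txt, asText = TRUE)

# XPath queries: the text of each <title>, and the id attribute of each <book>.
titles <- xpathSApply(doc, "//book/title", xmlValue)
ids <- xpathSApply(doc, "//book", xmlGetAttr, "id")

# Event-driven (SAX) parsing: count <book> start tags with a handler.
n_books <- 0
xmlEventParse(txt, asText = TRUE,
              handlers = list(startElement = function(name, attrs, ...) {
                if (name == "book") n_books <<- n_books + 1
              }))

# Parse an HTML fragment and pull out the paragraph text.
html <- htmlParse("<html><body><p>hello</p></body></html>", asText = TRUE)
para <- xpathSApply(html, "//p", xmlValue)

# Generate XML as an internal tree, then serialise it to a string.
root <- newXMLNode("books")
newXMLNode("book", attrs = c(id = "3"), parent = root)
out <- saveXML(root)
```

The same xmlParse() call accepts a file name or URL in place of a string when asText is left at its default.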
It is an interface to the libxml2 library.
It can be combined with the RCurl package
to parse documents that can only be fetched with
more involved HTTP requests.
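One way to combine the two packages is to let RCurl fetch the text and hand the result to the parser. The helper name fetch_and_parse and its url argument are hypothetical, and followlocation is just one example of a curl option you might need; this is a sketch under those assumptions rather than a recommended recipe.

```r
library(XML)
library(RCurl)

# Hypothetical helper: fetch a document with RCurl, then parse the text.
# 'url' is whatever address you need; extra curl options go through '...'.
fetch_and_parse <- function(url, ...) {
  # getURL() performs the HTTP request; followlocation = TRUE follows redirects.
  txt <- getURL(url, followlocation = TRUE, ...)
  # asText = TRUE tells the parser that 'txt' is content, not a file name.
  xmlParse(txt, asText = TRUE)
}
```

For HTML pages, htmlParse() could be used in place of xmlParse() in the same helper.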
Download
The source for the S package can
be downloaded as XML_2.6-0.tar.gz.
There is also a Windows version available
from the Omegahat repository.
To install from that repository, use
install.packages("XML", repos = "http://www.omegahat.org/R")
Documentation
- A short overview: HTML, PDF
- A brief introduction to parsing XML in R: HTML, PDF
- A reasonably detailed overview
of the package and what we might use XML for.
- A manual in
and a quick guide to the package (PDF).
- A short overview
of the package.
- Brief and incomplete Notes on generating XML
within S
- FAQ for the package.
- Changes to the packages (by release).
Duncan Temple Lang
<duncan@wald.ucdavis.edu>
Last modified: Wed Apr 1 10:28:51 PDT 2009