SYNOPSIS

High-level interface:

 my $grddl = XML::GRDDL->new;
 my $model = $grddl->data($xmldoc, $baseuri);
 # $model is an RDF::Trine::Model

Low-level interface:

 my $grddl = XML::GRDDL->new;
 my @transformations = $grddl->discover($xmldoc, $baseuri);
 foreach my $t (@transformations)
 {
   # $t is an XML::GRDDL::Transformation
   my ($output, $mediatype) = $t->transform($xmldoc);
   # $output is a string of type $mediatype.
 }

DESCRIPTION

GRDDL is a W3C Recommendation for extracting RDF data from arbitrary XML and XHTML via a transformation, typically written in XSLT. See <http://www.w3.org/TR/grddl/> for more details.

This module implements GRDDL in Perl. It offers two interfaces. The low-level interface returns the list of transformations associated with the document being processed, so that you can run them selectively. The high-level interface returns a single RDF model representing the union of the RDF graphs generated by applying all available transformations.
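As a rough illustration of what the low-level interface allows, the sketch below runs a list of transformations and keeps only the outputs of one media type. The helper name outputs_of_type is hypothetical and not part of XML::GRDDL; it relies only on the transform method shown in the synopsis.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of XML::GRDDL): run every transformation
# and keep only the outputs whose media type matches the regex $wanted.
sub outputs_of_type {
    my ($xmldoc, $wanted, @transformations) = @_;
    my @kept;
    foreach my $t (@transformations) {
        # transform() returns the output string and its media type.
        my ($output, $mediatype) = $t->transform($xmldoc);
        next unless defined $mediatype && $mediatype =~ $wanted;
        push @kept, $output;
    }
    return @kept;
}

# Usage, assuming $grddl, $xmldoc and $baseuri as in the synopsis:
#   my @rdfxml = outputs_of_type(
#       $xmldoc, qr{^application/rdf\+xml},
#       $grddl->discover($xmldoc, $baseuri),
#   );
```

Note that this filters after each transformation has run; it does not avoid fetching or executing the ones whose output is discarded.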

Constructor

The constructor accepts no parameters and returns an XML::GRDDL object.

Methods

 $grddl->discover($xml, $base, %options)

Processes the document to discover the transformations associated with it. $xml is the raw XML source of the document, or an XML::LibXML::Document object. ($xml cannot be "tag soup" HTML, though you should be able to use HTML::HTML5::Parser to parse tag soup into an XML::LibXML::Document.) $base is the base URI for resolving relative references. Returns a list of XML::GRDDL::Transformation objects. Options include:

  • force_rel - boolean; interpret XHTML rel="transformation" even in the absence of the GRDDL profile.

  • strings - boolean; return a list of plain strings instead of blessed objects.
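The following is a minimal sketch of discovery with the strings option, assuming options are passed as key/value pairs after the base URI. The document and all example.org URIs are placeholders invented for illustration. Discovery may still issue HTTP requests for profile or namespace documents, so the result depends on network access.

```perl
use strict;
use warnings;
use XML::GRDDL;

# A small XHTML document that links to a transformation. The URIs are
# placeholders (assumptions), not real resources.
my $xhtml = <<'XHTML';
<html xmlns="http://www.w3.org/1999/xhtml">
  <head profile="http://www.w3.org/2003/g/data-view">
    <title>Example</title>
    <link rel="transformation" href="http://example.org/to-rdf.xsl" />
  </head>
  <body><p>Hello</p></body>
</html>
XHTML

my $grddl = XML::GRDDL->new;

# strings => 1 asks for plain strings rather than
# XML::GRDDL::Transformation objects.
my @found = $grddl->discover(
    $xhtml, 'http://example.org/doc.html',
    strings => 1,
);
print "$_\n" for @found;
```

Because the head element declares the GRDDL profile, force_rel is not needed here; it would matter for a document that uses rel="transformation" without that profile.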

 $grddl->data($xml, $base, %options)

Processes the document, discovers the transformations associated with it, applies the transformations and merges the results into a single RDF model. $xml and $base are as per discover. Returns an RDF::Trine::Model containing the data. Statement contexts (a.k.a. named graphs / quads) are used to distinguish between data from the result of each transformation. Options include:

  • force_rel - boolean; interpret XHTML rel="transformation" even in the absence of the GRDDL profile.

  • metadata - boolean; include provenance information in the default graph (a.k.a. nil context).
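Since the returned model keeps each transformation's data in its own statement context, it can be useful to see how much each context contributed. The sketch below uses the standard RDF::Trine::Model methods get_contexts and count_statements; the helper name statements_per_graph is hypothetical.

```perl
use strict;
use warnings;
use RDF::Trine;

# Hypothetical helper (not part of XML::GRDDL): count the statements
# held in each named graph of an RDF::Trine::Model, keyed by graph URI.
sub statements_per_graph {
    my ($model) = @_;
    my %count;
    my $contexts = $model->get_contexts;    # iterator over context nodes
    while (my $ctx = $contexts->next) {
        next unless $ctx->is_resource;      # skip any non-URI context
        $count{ $ctx->uri_value } =
            $model->count_statements(undef, undef, undef, $ctx);
    }
    return %count;
}

# Usage, assuming $grddl, $xmldoc and $baseuri as in the synopsis:
#   my $model = $grddl->data($xmldoc, $baseuri, metadata => 1);
#   my %count = statements_per_graph($model);
#   printf "%s: %d\n", $_, $count{$_} for sort keys %count;
```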

 $grddl->ua
 $grddl->ua($ua)

Get/set the user agent used for HTTP requests. $ua, if supplied, must be an LWP::UserAgent.
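For example, a preconfigured LWP::UserAgent can be supplied before any documents are processed. This is a sketch only; the timeout value and the agent string are arbitrary choices for illustration.

```perl
use strict;
use warnings;
use LWP::UserAgent;
use XML::GRDDL;

# Build a user agent with a request timeout and an application name.
# Both values are illustrative assumptions.
my $ua = LWP::UserAgent->new(
    timeout => 10,
    agent   => 'MyApp/1.0',
);

my $grddl = XML::GRDDL->new;
$grddl->ua($ua);    # later HTTP requests by $grddl use this agent
```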

Constants

These constants may be exported upon request.

FEATURES

XML::GRDDL supports transformations written in XSLT 1.0, and in RDF-EASE.

XML::GRDDL is a good HTTP citizen: Referer headers are included in requests, and appropriate Accept headers supplied. To be an even better citizen, I recommend changing the User-Agent header to advertise the name of the application:

 $grddl->ua->default_header(user_agent => 'MyApp/1.0 ');

Provenance information for GRDDL transformations is returned using the GRDDL vocabulary at <http://www.w3.org/2003/g/data-view#>.

Certain XHTML profiles and XML namespaces known not to contain any transformations, or to contain useless transformations, are skipped. See XML::GRDDL::Namespace and XML::GRDDL::Profile for details. In particular, profiles for RDFa and many Microformats are skipped, as RDF::RDFa::Parser and HTML::Microformats will typically yield far superior results.

BUGS

Please report any bugs to <http://rt.cpan.org/>.

Known limitations:

  • Recursive GRDDL doesn't work yet. That is, the profile documents and namespace documents linked to from your primary document cannot themselves rely on GRDDL.

RELATED TO XML::GRDDL…

XML::GRDDL::Transformation, XML::GRDDL::Namespace, XML::GRDDL::Profile, XML::GRDDL::Transformation::RDF_EASE::Functional, XML::Saxon::XSLT2.

HTML::HTML5::Parser, RDF::RDFa::Parser, HTML::Microformats.

JSON::GRDDL.

<http://www.w3.org/TR/grddl/>.

<http://www.perlrdf.org/>.

This module is derived from Swignition <http://buzzword.org.uk/swignition/>.

AUTHOR

Toby Inkster.

COPYRIGHT AND LICENCE

Copyright 2008-2012 Toby Inkster

This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

DISCLAIMER OF WARRANTIES

THIS PACKAGE IS PROVIDED "AS IS" AND WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.