SYNOPSIS

 use HTML::Embedded::Turtle;

 my $het = HTML::Embedded::Turtle->new($html, $base_uri);
 foreach my $graph ($het->endorsements)
 {
   my $model = $het->graph($graph);

   # $model is an RDF::Trine::Model. Do something with it.
 }

DESCRIPTION

RDF can be embedded in (X)HTML using simple <script> tags. This is described at <http://esw.w3.org/N3inHTML>. This gives you a file format that can contain multiple (optionally named) graphs. The document as a whole can "endorse" a graph by including:

<link rel="meta" href="#foo" />

Here "#foo" is a fragment identifier pointing to a graph, such as:

<script type="text/turtle" id="foo"> ... </script>

The rel="meta" stuff is parsed using an RDFa parser, so equivalent RDFa works too.

This module parses HTML files containing graphs like these, and lets you access them individually, as a union of all graphs on the page, or as a union of just the endorsed graphs.

Despite the module name, this module supports a variety of <script type>s: text/turtle, application/turtle, application/x-turtle (all Turtle), text/plain (N-Triples), text/n3 (Notation 3), application/x-rdf+json (RDF/JSON), application/json (RDF/JSON), and application/rdf+xml (RDF/XML).
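The list of types above can be pictured as a simple lookup table. The following is a minimal sketch in core Perl, not the module's actual internals: the %FORMAT_FOR_TYPE hash and the format_for_type function are hypothetical names, and the lower-casing and stripping of parameters such as "charset" are assumptions made for illustration.

```perl
use strict;
use warnings;

# Hypothetical lookup table built from the list of supported types;
# the module's own internal representation may differ.
my %FORMAT_FOR_TYPE = (
    'text/turtle'            => 'Turtle',
    'application/turtle'     => 'Turtle',
    'application/x-turtle'   => 'Turtle',
    'text/plain'             => 'N-Triples',
    'text/n3'                => 'Notation 3',
    'application/x-rdf+json' => 'RDF/JSON',
    'application/json'       => 'RDF/JSON',
    'application/rdf+xml'    => 'RDF/XML',
);

# Returns the format name for a <script type> value, or undef if the
# type is not in the table.
sub format_for_type {
    my ($type) = @_;
    return undef unless defined $type;
    my $bare = lc $type;
    $bare =~ s/;.*\z//s;          # drop parameters such as charset (an assumption)
    $bare =~ s/\A\s+|\s+\z//g;    # trim surrounding whitespace
    return $FORMAT_FOR_TYPE{$bare};
}

for my $type ('text/turtle', 'Text/N3; charset=utf-8', 'text/javascript') {
    my $format = format_for_type($type);
    printf "%-24s => %s\n", $type, defined $format ? $format : 'unsupported';
}
```

A table like this makes it easy to see that several media types collapse onto the same serialization, which is why the module name mentions only Turtle.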

The deprecated attribute "language" is also supported:

<script language="Turtle" id="foo"> ... </script>

Languages supported are (case insensitive): "Turtle", "NTriples", "RDFJSON", "RDFXML" and "Notation3".

Constructor

Create a new object. $markup is the HTML or XHTML markup to parse; $base_uri is the base URI to use for relative references. Options include:

  • markup Choose which parser to use: 'html' or 'xml'. The former chooses HTML::HTML5::Parser, which can handle tag soup; the latter chooses XML::LibXML, which cannot. Defaults to 'html'.

  • rdfa_options A set of options to be passed to RDF::RDFa::Parser when looking for endorsements. See RDF::RDFa::Parser::Config. The default is probably sensible.
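As an illustration of the constructor, here is a hedged sketch that parses a small page with one endorsed graph. It assumes the options are passed as a hashref in a third argument, which this text does not state explicitly, and the example.com URIs and page content are made up. The parse is only attempted if the module is installed.

```perl
use strict;
use warnings;

# A made-up page with one named graph ("foo") that the page endorses.
my $html = <<'HTML';
<html>
  <head>
    <title>Example</title>
    <link rel="meta" href="#foo" />
    <script type="text/turtle" id="foo">
      <http://example.com/#me> <http://xmlns.com/foaf/0.1/name> "Example" .
    </script>
  </head>
  <body></body>
</html>
HTML

# 'html' is the documented default; it is spelled out here for clarity.
my %options = (markup => 'html');

# Only attempt the parse when the module is actually available.
if (eval { require HTML::Embedded::Turtle; 1 }) {
    # Assumption: options go in a hashref as the third argument.
    my $het = HTML::Embedded::Turtle->new($html, 'http://example.com/', \%options);
    print "$_\n" for $het->endorsements;
}
else {
    print "HTML::Embedded::Turtle is not installed; skipping.\n";
}
```

The 'html' parser is used here because the Turtle inside the <script> element contains angle brackets, which the 'xml' parser would not accept without escaping.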

Public Methods

  • A union graph of all graphs found in the document, as an RDF::Trine::Model. Note that the returned model contains quads.

  • A union graph of only the endorsed graphs, as an RDF::Trine::Model. Note that the returned model contains quads.

  • A single graph from the page.

  • A hashref where the keys are graph names and the values are RDF::Trine::Models. Some graph names will be URIs, and others may be blank nodes (e.g. "_:foobar"). graphs and all_graphs are aliases for each other.

  • Like all_graphs, but only returns endorsed graphs. Note that all endorsed graphs will have graph names that are URIs.

  • Returns a list of URIs which are the names of endorsed graphs. Note that the presence of a URI $x in this list does not imply that $het->graph($x) will be defined.

  • Returns the page DOM.

  • Returns the page URI.
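Because an endorsement can point at a graph that is not actually present on the page, callers may want to filter the list before using it. The sketch below uses only the endorsements and graph methods shown in the SYNOPSIS. The resolvable_endorsements helper and the Local::StubHET stand-in class are hypothetical, included only so the example runs without the module installed.

```perl
use strict;
use warnings;

# Returns the endorsed graph names for which graph() is defined.
# $het is expected to be an HTML::Embedded::Turtle object, or anything
# offering the same endorsements and graph methods.
sub resolvable_endorsements {
    my ($het) = @_;
    return grep { defined $het->graph($_) } $het->endorsements;
}

{
    # Stand-in for demonstration only; a real caller would pass an
    # HTML::Embedded::Turtle object instead.
    package Local::StubHET;

    sub new {
        my ($class, %args) = @_;
        my $self = { %args };
        return bless $self, $class;
    }
    sub endorsements { return @{ $_[0]{endorsements} } }
    sub graph        { return $_[0]{graphs}{ $_[1] } }
}

my $stub = Local::StubHET->new(
    endorsements => ['http://example.com/#foo', 'http://example.com/#missing'],
    graphs       => { 'http://example.com/#foo' => 'model placeholder' },
);

# Prints only http://example.com/#foo, since #missing has no graph.
print "$_\n" for resolvable_endorsements($stub);
```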

BUGS

Please report any bugs to <http://rt.cpan.org/>.

Please forgive me in advance for inflicting this module upon you.

SEE ALSO

RDF::RDFa::Parser, RDF::Trine, RDF::TriN3.

<http://www.perlrdf.org/>.

AUTHOR

Toby Inkster <[email protected]>.

COPYRIGHT AND LICENSE

Copyright (C) 2010-2011, 2013 by Toby Inkster.

This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.

DISCLAIMER OF WARRANTIES

THIS PACKAGE IS PROVIDED "AS IS" AND WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.