SYNOPSIS

    package Grab_XML_rur;
    use base 'XMLTV::Grab_XML';
    sub urls_by_date( $ ) { my $pkg = shift; ... }
    sub country( $ ) { my $pkg = shift; return 'Ruritania' }
    # Maybe override a couple of other methods as described below...
    Grab_XML_rur->go();

DESCRIPTION

This module helps to write grabbers which fetch pages in XMLTV format from some website and output the data. It is not used for grabbers which scrape human-readable sites.

It consists of several class methods (package methods). The way to use it is to subclass it and override some of these.

METHODS

XMLTV::Grab_XML->date_init()

Called at the start of the program to set up Date::Manip. You might want to override this with a method that sets the timezone.

XMLTV::Grab_XML->urls_by_date()

Returns a hash mapping YYYYMMDD dates to a URL where listings for that date can be downloaded. This method is abstract; you must override it. Arguments: the command line options for --config-file and --quiet.
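
As an illustration, an override might look like the following minimal sketch. The base URL and the one-file-per-day layout are assumptions made up for the example, and the package is left standalone (it does not inherit from XMLTV::Grab_XML) so that it runs without XMLTV installed.

```perl
use strict;
use warnings;
use POSIX ();

package Grab_XML_rur;
# A real grabber would also say: use base 'XMLTV::Grab_XML';

# Hypothetical site layout: one XMLTV file per day.
my $BASE = 'http://www.example.com/listings';

sub urls_by_date {
    my ($pkg, $config_file, $quiet) = @_;   # options unused in this sketch
    my %urls;
    my $now = time();
    for my $offset (0 .. 6) {
        # gmtime keeps every step exactly one day long (no DST jumps).
        my $day = POSIX::strftime('%Y%m%d', gmtime($now + $offset * 86400));
        $urls{$day} = "$BASE/$day.xml";
    }
    return %urls;
}

package main;

my %urls = Grab_XML_rur->urls_by_date(undef, undef);
print "$_ => $urls{$_}\n" for sort keys %urls;
```

The sketch returns the hash as a list, matching the wording above, and covers today plus the following six days in UTC.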

XMLTV::Grab_XML->xml_from_data(data)

Given page data for a particular day, turn it into XML. The default implementation just returns the data unchanged, but you might override it if you need to decompress the data or patch it up.
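
For the decompression case, an override could be sketched as below. It assumes the site serves gzip-compressed files and uses the core IO::Uncompress::Gunzip module; the package is standalone here so the example runs without XMLTV installed.

```perl
use strict;
use warnings;

package Grab_XML_rur;
use IO::Uncompress::Gunzip qw(gunzip $GunzipError);

sub xml_from_data {
    my ($pkg, $data) = @_;
    # Pass through anything that lacks the two-byte gzip magic number.
    return $data unless substr($data, 0, 2) eq "\x1f\x8b";
    my $xml;
    gunzip(\$data => \$xml) or die "gunzip failed: $GunzipError\n";
    return $xml;
}

package main;
use IO::Compress::Gzip qw(gzip $GzipError);

# Build a compressed sample in memory to exercise the method.
my $plain = qq{<?xml version="1.0"?>\n<tv></tv>\n};
my $packed;
gzip(\$plain => \$packed) or die "gzip failed: $GzipError\n";

print Grab_XML_rur->xml_from_data($packed);
```

Checking the magic number first means data that is already plain XML is returned unchanged, as the default implementation would do.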

XMLTV::Grab_XML->configure()

Configure the grabber if needed. Arguments are the --config-file option (or undef) and the --quiet flag (or undef). This method is not provided in the base class; if you don't provide it, then attempts to run with --configure will give a message that configuration is not necessary.

XMLTV::Grab_XML->nextday(day)

Bump a YYYYMMDD date by one. You probably shouldn't override this.
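
To show the behaviour described, here is a standalone sketch of such a date bump using only core modules. It is not the module's own implementation (which may well rely on Date::Manip); it only illustrates the expected input and output.

```perl
use strict;
use warnings;
use POSIX ();
use Time::Local qw(timegm);

sub nextday {
    my ($pkg, $day) = @_;
    my ($y, $m, $d) = $day =~ /^(\d{4})(\d{2})(\d{2})$/
        or die "expected YYYYMMDD, got '$day'\n";
    # Work from noon UTC so that adding one day cannot slip across a boundary.
    my $noon = timegm(0, 0, 12, $d, $m - 1, $y);
    return POSIX::strftime('%Y%m%d', gmtime($noon + 86400));
}

print main->nextday('20240228'), "\n";   # leap year: 20240229
print main->nextday('20231231'), "\n";   # year rollover: 20240101
```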

XMLTV::Grab_XML->country()

Return the name of the country you're grabbing for, used in usage messages. Abstract.

XMLTV::Grab_XML->usage_msg()

Return a command-line usage message. This calls country(), so you probably need to override only that method.

XMLTV::Grab_XML->get()

Given a URL, fetch the content at that URL. The default implementation calls XMLTV::Get_nice::get_nice() but you might want to override it if you need to do wacky things with HTTP requests, like cookies. Note that while this method fetches a page, xml_from_data() does any further processing of the result to turn it into XML.
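
A cookie-sending override might be sketched as follows. The cookie value is hypothetical, and HTTP::Tiny (a core module) is swapped in for XMLTV::Get_nice purely to keep the example standalone; note that a replacement like this bypasses whatever get_nice() does on the grabber's behalf.

```perl
use strict;
use warnings;

package Grab_XML_rur;
use HTTP::Tiny;

sub get {
    my ($pkg, $url) = @_;
    my $http = HTTP::Tiny->new(
        # Hypothetical session cookie the site is assumed to require.
        default_headers => { Cookie => 'session=example' },
    );
    my $res = $http->get($url);
    die "GET $url failed: $res->{status} $res->{reason}\n"
        unless $res->{success};
    return $res->{content};
}

package main;

# Usage (needs network access, so left commented out):
# my $page = Grab_XML_rur->get('http://www.example.com/listings/20240101.xml');
```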

XMLTV::Grab_XML->go()

The main program. Parse command line options, fetch and write data. Most of the options are fairly self-explanatory but this routine also calls the XMLTV::Memoize module to look for a --cache argument. The functions memoized are those given by the cachables() method.

XMLTV::Grab_XML->cachables()

Returns a list of names of functions which could reasonably be memoized between runs. This will normally be whatever function fetches the web pages - you memoize that to save on repeated downloads. A subclass might want to add things to this list if it has its own way of fetching web pages.
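
A subclass that adds its own fetching function to the list could be sketched like this. My_Base is a stand-in for XMLTV::Grab_XML so the example runs on its own, and both function names are placeholders; the names the real base class returns are not shown here.

```perl
use strict;
use warnings;

package My_Base;   # stand-in for XMLTV::Grab_XML
sub cachables { return ('My_Base::fetch') }   # placeholder name

package Grab_XML_rur;
our @ISA = ('My_Base');

sub cachables {
    my $pkg = shift;
    # Keep whatever the parent already lists, then add our own fetcher.
    return ($pkg->SUPER::cachables(), 'Grab_XML_rur::get_with_cookies');
}

package main;

my @names = Grab_XML_rur->cachables();
print "$_\n" for @names;
```

Calling SUPER::cachables() first keeps the parent's entries in the list, so the default fetcher stays memoized alongside the new one.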

XMLTV::Grab_XML->remove_early_stop_times()

Checks each stop time and removes it if it is before the start time. Argument: the XML to correct. Returns: the corrected XML.
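
The effect can be illustrated with the following standalone sketch. It is not the module's own code: it assumes timestamps of the exact form "YYYYMMDDhhmmss +HHMM" and leaves any programme whose times it cannot parse untouched. The helper xmltv_epoch() is a name invented for the example.

```perl
use strict;
use warnings;

package Grab_XML_rur;
use Time::Local qw(timegm);

# Convert "YYYYMMDDhhmmss +HHMM" to epoch seconds; return nothing if the
# string is not in exactly that form.
sub xmltv_epoch {
    my ($t) = @_;
    my ($y, $mo, $d, $h, $mi, $s, $sign, $oh, $om) =
        $t =~ /^(\d{4})(\d\d)(\d\d)(\d\d)(\d\d)(\d\d) ([+-])(\d\d)(\d\d)$/
        or return;
    my $offset = ($oh * 3600 + $om * 60) * ($sign eq '-' ? -1 : 1);
    return timegm($s, $mi, $h, $d, $mo - 1, $y) - $offset;
}

sub remove_early_stop_times {
    my ($pkg, $xml) = @_;
    $xml =~ s{(<programme\s[^>]*>)}{
        my $tag = $1;
        my ($start) = $tag =~ /\sstart="([^"]*)"/;
        my ($stop)  = $tag =~ /\sstop="([^"]*)"/;
        if (defined $start and defined $stop) {
            my $from = xmltv_epoch($start);
            my $to   = xmltv_epoch($stop);
            $tag =~ s/\sstop="[^"]*"//
                if defined $from and defined $to and $to < $from;
        }
        $tag;
    }ge;
    return $xml;
}

package main;

my $xml = <<'XML';
<programme start="20240101200000 +0000" stop="20240101190000 +0000" channel="c1"><title>A</title></programme>
<programme start="20240101200000 +0100" stop="20240101193000 +0000" channel="c1"><title>B</title></programme>
XML
my $out = Grab_XML_rur->remove_early_stop_times($xml);
print $out;
```

Programme A loses its stop attribute because it ends an hour before it starts. Programme B keeps it: once the offsets are applied, the stop time falls thirty minutes after the start, even though the raw digit strings compare the other way round.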

AUTHOR

SEE ALSO

perl(1), XMLTV(3).