SQL::Statement::Syntax: documentation of SQL::Statement's SQL syntax

SYNOPSIS

See SQL::Statement for usage.

DESCRIPTION

The SQL::Statement module can be used either from a DBI driver like DBD::CSV or directly. The syntax below applies to both situations. In the case of DBDs, each DBD can implement its own sub-dialect, so be sure to check the DBD documentation as well.

SQL::Statement is meant primarily as a base class for DBD drivers and as such concentrates on a small but useful subset of SQL. It does *not* in any way pretend to be a complete SQL parser for all dialects of SQL. The module will continue to add new supported syntax, and users may also extend the syntax (see "Extending SQL syntax using SQL").

USAGE

Default Supported SQL syntax - Summary

SQL Statements

   CALL <function>
   CREATE [TEMP] TABLE <table> <column_def_clause>
   CREATE [TEMP] TABLE <table> AS <select statement>
   CREATE [TEMP] TABLE <table> AS IMPORT()
   CREATE FUNCTION <user_defined_function> [ NAME <perl_subroutine> ]
   CREATE KEYWORD  <user_defined_keyword>  [ NAME <perl_subroutine> ]
   CREATE OPERATOR <user_defined_operator> [ NAME <perl_subroutine> ]
   CREATE TYPE     <user_defined_type>     [ NAME <perl_subroutine> ]
   DELETE FROM <table> [<where_clause>]
   DROP TABLE [IF EXISTS] <table>
   DROP FUNCTION <function>
   DROP KEYWORD  <keyword>
   DROP OPERATOR <operator>
   DROP TYPE     <type>
   INSERT [INTO] <table> [<column_list>] VALUES <value_list>
   LOAD <user_defined_functions_module>
   SELECT <function>
   SELECT <select_clause>
          <from_clause>
          [<where_clause>]
          [ ORDER BY ocol1 [ASC|DESC], ... ocolN [ASC|DESC] ]
          [ GROUP BY gcol1 [, ... gcolN] ]
          [ LIMIT [start,] length ]
   UPDATE <table> SET <set_clause> [<where_clause>]

Explicit Join Qualifiers

NATURAL, INNER, OUTER, LEFT, RIGHT, FULL

Built-in Functions

* Aggregate : MIN, MAX, AVG, SUM, COUNT
* Date/Time : CURRENT_DATE, CURDATE, CURRENT_TIME, CURTIME, CURRENT_TIMESTAMP, NOW, UNIX_TIMESTAMP
* String : ASCII, CHAR, BIT_LENGTH, CHARACTER_LENGTH, CHAR_LENGTH, COALESCE, NVL, IFNULL, CONV, CONCAT, DECODE, HEX, OCT, BIN, INSERT, LEFT, RIGHT, LOCATE, POSITION, LOWER, UPPER, LCASE, UCASE, LTRIM, RTRIM, OCTET_LENGTH, REGEX, REPEAT, REPLACE, SOUNDEX, SPACE, SUBSTITUTE, SUBSTRING, SUBSTR, TRANSLATE, TRIM, UNHEX
* Numeric : ABS, CEILING, CEIL, FLOOR, ROUND, EXP, LOG, LN, LOG10, MOD, POWER, RAND, SIGN, SQRT, TRUNCATE, TRUNC
* Trig : ACOS, ACOSEC, ACOSECH, ACOSH, ACOT, ACOTAN, ACOTANH, ACOTH, ACSC, ACSCH, ASEC, ASECH, ASIN, ASINH, ATAN, ATAN2, ATANH, COS, COSEC, COSECH, COSH, COT, COTAN, COTANH, COTH, CSC, CSCH, DEG2DEG, DEG2GRAD, DEG2RAD, DEGREES, GRAD2DEG, GRAD2GRAD, GRAD2RAD, PI, RAD2DEG, RAD2GRAD, RAD2RAD, RADIANS, SEC, SECH, SIN, SINH, TAN, TANH
* System : DBNAME, USERNAME, USER

Special Utility Functions

* IMPORT - imports a table from an external RDBMS or perl structure
* RUN - prepares and executes statements in a file of SQL statements

Operators and Predicates

= , <> , < , > , <= , >= , IS [NOT] (NULL|TRUE|FALSE) , LIKE , CLIKE , IN , BETWEEN

Identifiers and Aliases

* regular identifiers are case insensitive (though see note on table names)
* delimited identifiers (inside double quotes) are case sensitive
* column and table aliases are supported

Concatenation

* use either ANSI SQL || or the CONCAT() function
* e.g. these are the same: {foo || bar} {CONCAT(foo,bar)}

Comments

* comments must occur before or after statements, cannot be embedded
* SQL-style single line -- and C-style multi-line /* */ comments are supported

NULLs

* currently NULLs and empty strings are identical in non-ANSI dialect.
* use {col IS NULL} to find NULLs, not {col=''} (though both may work depending on dialect)

See below for further details.

Syntax - Details

CREATE TABLE

Creates permanent and in-memory tables.

   CREATE [TEMP] TABLE <table_name> ( <column_definitions> )
   CREATE [TEMP] TABLE <table_name> AS <select statement>
   CREATE [TEMP] TABLE <table_name> AS IMPORT()

Column definitions are standard SQL column names, types, and constraints; see "Column Definitions".

   # create a permanent table
   #
   $dbh->do("CREATE TABLE qux (id INT PRIMARY KEY,word VARCHAR(30))");

The "AS SELECT" clause creates and populates the new table using the data and column structure specified in the select statement.

   # create and populate a table from a query to two other tables
   #
   $dbh->do("CREATE TABLE qux AS SELECT id,word FROM foo NATURAL JOIN bar");

If the optional keyword TEMP (or its synonym TEMPORARY) is used, the table will be an in-memory table, available for the life of the current database handle or until a DROP TABLE command is issued.

   # create a temporary table
   #
   $dbh->do("CREATE TEMP TABLE qux (id INT PRIMARY KEY,word VARCHAR(30))");

TEMP tables can be modified with SQL commands, but the updates are not automatically reflected back to any permanent tables they may be associated with. To save a TEMP table, use an AS SELECT clause:

   $dbh = DBI->connect( 'dbi:CSV:' );
   $dbh->do("CREATE TEMP TABLE qux_temp (id INT, word VARCHAR(30))");
   #
   # ... modify qux_temp with INSERT, UPDATE, DELETE statements, then save it
   #
   $dbh->do("CREATE TABLE qux_permanent AS SELECT * FROM qux_temp");

Tables, both temporary and permanent, may also be created directly from perl arrayrefs and from heterogeneous queries to any DBI accessible data source; see the IMPORT() function.

   CREATE [ {LOCAL|GLOBAL} TEMPORARY ] TABLE $table
          (
             $col_1 $col_type1 $col_constraints1,
             ...,
             $col_N $col_typeN $col_constraintsN,
          )
          [ ON COMMIT {DELETE|PRESERVE} ROWS ]

* col_type must be a valid data type as defined in the "valid_data_types" section of the dialect file for the current dialect

* col_constraints may be "PRIMARY KEY" or one or both of "UNIQUE" and/or "NOT NULL"

* IMPORTANT NOTE: temporary tables, data types and column constraints are checked for syntax violations but are currently otherwise *IGNORED* -- they are recognized by the parser, but not by the execution engine

* The following valid ANSI SQL92 options are not currently supported: table constraints, named constraints, check constraints, reference constraints, constraint attributes, collations, default clauses, domain names as data types

DROP TABLE

DROP TABLE $table [ RESTRICT | CASCADE ]

* IMPORTANT NOTE: drop behavior (cascade or restrict) is checked for valid syntax but is otherwise *IGNORED* -- it is recognized by the parser, but not by the execution engine

INSERT INTO

INSERT INTO $table [ ( $col1, ..., $colN ) ] VALUES ( $val1, ... $valN )

* default values are not currently supported
* inserting from a subquery is not currently supported

DELETE FROM

DELETE FROM $table [ WHERE search_condition ]

* see "search_condition" below

UPDATE

UPDATE $table SET $col1 = $val1, ... $colN = $valN [ WHERE search_condition ]

* default values are not currently supported
* see "search_condition" below

SELECT

   SELECT select_clause
     FROM from_clause
        [ WHERE search_condition ]
        [ ORDER BY $ocol1 [ASC|DESC], ... $ocolN [ASC|DESC] ]
        [ LIMIT [start,] length ]

* select clause ::=
        [DISTINCT|ALL] *
      | [DISTINCT|ALL] col1 [,col2, ... colN]
      | set_function1 [,set_function2, ... set_functionN]

* set function ::=
        COUNT ( [ALL] * )
      | COUNT | MIN | MAX | AVG | SUM ( [DISTINCT|ALL] col_name )

* from clause ::=
        table1 [, table2, ... tableN]
      | table1 NATURAL [join_type] JOIN table2
      | table1 [join_type] JOIN table2 USING (col1,col2, ... colN)
      | table1 [join_type] JOIN table2 ON table1.colA = table2.colB

* join type ::= INNER | [OUTER] LEFT | RIGHT | FULL

* if join_type is not specified, INNER is the default
* if DISTINCT or ALL is not specified, ALL is the default
* if start position is omitted from LIMIT clause, position 0 is the default
* ON clauses may only contain equal comparisons and AND combiners
* self-joins are not currently supported
* if implicit joins are used, the WHERE clause must contain an equijoin condition for each table
* multiple ANSI joins are not supported; use implicit joins for these
* this also means that combinations of INNER and non-INNER joins are not supported
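
The LIMIT defaults in the notes above can be pictured with a small Perl sketch. It is only an illustration of the stated semantics (start position counted from 0, defaulting to 0 when omitted), not the module's own code; the helper name apply_limit is made up for this example.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of SQL::Statement: apply "LIMIT [start,] length"
# to an array of rows, with start defaulting to position 0.
sub apply_limit {
    my ($rows, $length, $start) = @_;
    $start = 0 unless defined $start;
    return [] if $length <= 0 || $start > $#$rows;
    my $last = $start + $length - 1;
    $last = $#$rows if $last > $#$rows;    # clip to the available rows
    return [ @{$rows}[ $start .. $last ] ];
}

my @rows = qw(a b c d e);
print join(',', @{ apply_limit(\@rows, 2) }),    "\n";   # like LIMIT 2
print join(',', @{ apply_limit(\@rows, 2, 3) }), "\n";   # like LIMIT 3,2
```

The first call keeps the first two rows; the second skips three rows and then keeps two.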

SEARCH CONDITION

[NOT] $val1 $op1 $val1 [ ... AND|OR $valN $opN $valN ]

OPERATORS

   $op ::= = | <> | < | > | <= | >=
         | IS [NOT] NULL | IS [NOT] TRUE | IS [NOT] FALSE
         | LIKE | CLIKE | BETWEEN | IN

The "CLIKE" operator works exactly the same as the "LIKE" operator, but is case insensitive. For example:

   WHERE foo LIKE 'bar%'   # succeeds if foo is "barbaz"
                           # fails if foo is "BARBAZ" or "Barbaz"

   WHERE foo CLIKE 'bar%'  # succeeds for "barbaz", "Barbaz", and "BARBAZ"
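
The difference between the two operators can be mimicked in plain Perl. The following is a rough sketch of the matching semantics only, assuming the usual SQL wildcards (% for any run of characters, _ for a single character); the helper names like_to_regex, sql_like and sql_clike are invented here and are not SQL::Statement functions.

```perl
use strict;
use warnings;

# Hypothetical helper: translate a LIKE pattern into an anchored Perl regex.
# "%" matches any run of characters, "_" matches exactly one character.
sub like_to_regex {
    my ($pattern, $case_insensitive) = @_;
    my $re = join '', map {
        $_ eq '%' ? '.*' : $_ eq '_' ? '.' : quotemeta($_)
    } split //, $pattern;
    return $case_insensitive ? qr/\A$re\z/is : qr/\A$re\z/s;
}

# LIKE is case sensitive, CLIKE is not.
sub sql_like  { my ($v, $p) = @_; return $v =~ like_to_regex($p, 0) ? 1 : 0 }
sub sql_clike { my ($v, $p) = @_; return $v =~ like_to_regex($p, 1) ? 1 : 0 }

for my $value (qw(barbaz Barbaz BARBAZ)) {
    printf "%-7s LIKE=%d CLIKE=%d\n",
        $value, sql_like($value, 'bar%'), sql_clike($value, 'bar%');
}
```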

BUILT-IN AND USER-DEFINED FUNCTIONS

There are many built-in functions and you can also create your own new functions from perl subroutines. See SQL::Statement::Functions for documentation of functions.

Identifiers (table & column names)

Regular identifiers (table and column names *without* quotes around them) are case INSENSITIVE, so column foo, fOo, FOO all refer to the same column. Internally they are used in their lower case representation, so do not rely on SQL::Statement retaining your case.

Delimited identifiers (table and column names *with* quotes around them) are case SENSITIVE, so column "foo", "fOo", "FOO" each refer to different columns.

A delimited identifier is *never* equal to a regular identifier (so "foo" and foo are two different columns). But don't do that :-).

Remember, though, that in DBD::CSV, if table names are used directly as file names, the case sensitivity depends on the OS; e.g. on Windows, files named foo, FOO, and fOo are the same as each other, while on Unix they are different.
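
The identifier rules above can be summarized in a few lines of Perl. This is a sketch of the comparison behaviour as described, not the parser's implementation; normalize_identifier is a hypothetical name, and keeping the double quotes on delimited identifiers is simply one way to make them compare unequal to regular ones.

```perl
use strict;
use warnings;

# Hypothetical helper: regular identifiers are folded to lower case,
# delimited identifiers (in double quotes) are kept exactly as written.
sub normalize_identifier {
    my ($ident) = @_;
    return $ident if $ident =~ /\A".*"\z/s;   # delimited: case sensitive
    return lc $ident;                          # regular: case insensitive
}

for my $name ('foo', 'fOo', 'FOO', '"foo"', '"FOO"') {
    printf "%-6s => %s\n", $name, normalize_identifier($name);
}
```

With this normalization foo, fOo and FOO all map to the same name, while "foo" and "FOO" stay distinct from each other and from the regular identifier foo.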

Special Utility SQL Functions

IMPORT()

Imports the data and structure of a table from an external data source into a permanent or temporary table.

$dbh->do("CREATE TABLE qux AS IMPORT(?)",{},$oracle_sth);

$dbh->do("CREATE TABLE qux AS IMPORT(?)",{},$AoA);

$dbh->do("CREATE TABLE qux AS IMPORT(?)",{},$AoH);

IMPORT() can also be used anywhere that table_names can:

   $sth=$dbh->prepare("
       SELECT * FROM IMPORT(?) AS T1 NATURAL JOIN IMPORT(?) AS T2 WHERE T1.id ...
   ");
   $sth->execute( $pg_sth, $mysql_sth );

The IMPORT() function imports the data and structure of a table from an external data source. It is always used with a placeholder parameter, which may be 1) a prepared and executed statement handle for any DBI accessible data source; 2) an AoA (array of arrays) whose first row is column names and whose succeeding rows are data; or 3) an AoH (array of hashes).
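
For readers unfamiliar with the two Perl structures, the sketch below shows the same two-row table as an AoA and as an AoH, plus a conversion between them. The helper aoh_to_aoa is hypothetical and only illustrates the shapes; it assumes every hash has the same keys as the first one and orders the columns alphabetically.

```perl
use strict;
use warnings;

# AoA: first row holds the column names, the remaining rows hold data.
my $aoa = [ [ 'id', 'word' ], [ 1, 'foo' ], [ 2, 'bar' ] ];

# AoH: one hash per row, keyed by column name.
my $aoh = [ { id => 1, word => 'foo' }, { id => 2, word => 'bar' } ];

# Hypothetical helper: turn an AoH into the equivalent AoA.
sub aoh_to_aoa {
    my ($rows) = @_;
    return [] unless @$rows;
    my @cols = sort keys %{ $rows->[0] };
    return [ [@cols], map { my $row = $_; [ @{$row}{@cols} ] } @$rows ];
}

print join(',', @$_), "\n" for @{ aoh_to_aoa($aoh) };
```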

The IMPORT() function may be used in the AS clause of a CREATE statement, and in the FROM clause of any statement. When used in a FROM clause, it should be used with a table alias, e.g. SELECT * FROM IMPORT(?) AS TableA WHERE ...

You can also write your own IMPORT() functions to treat anything as a data source. See User-Defined Function in SQL::Statement::Functions.

Examples:

   # create a CSV file from an Oracle query
   #
   $dbh = DBI->connect('dbi:CSV:');
   $oracle_sth = $oracle_dbh->prepare($any_oracle_query);
   $oracle_sth->execute(@params);
   $dbh->do("CREATE TABLE qux AS IMPORT(?)",{},$oracle_sth);

   # create an in-memory table from an AoA
   #
   $dbh      = DBI->connect( 'dbi:File:' );
   $arrayref = [['id','word'],[1,'foo'],[2,'bar'],];
   $dbh->do("CREATE TEMP TABLE qux AS IMPORT(?)",{},$arrayref);

   # query a join of a PostgreSQL table and a MySQL table
   #
   $dbh       = DBI->connect( 'dbi:File:' );
   $pg_dbh    = DBI->connect( ... DBD::Pg connect params );
   $mysql_dbh = DBI->connect( ... DBD::mysql connect params );
   $pg_sth    = $pg_dbh->prepare( ... any pg query );
   $mysql_sth = $mysql_dbh->prepare( ... any mysql query );
   $pg_sth->execute;      # IMPORT() expects prepared and executed handles
   $mysql_sth->execute;
   #
   $sth=$dbh->prepare("
       SELECT * FROM IMPORT(?) AS T1 NATURAL JOIN IMPORT(?) AS T2
   ");
   $sth->execute( $pg_sth, $mysql_sth );

RUN()

Run SQL statements from a user supplied file. Please note: this function is experimental; please let me know if you have problems.

RUN( sql_file )

If the file contains non-SELECT statements such as CREATE and INSERT, use the RUN() function with $dbh->do(). For example, this prepares and executes all of the SQL statements in a file called "populate.sql":

$dbh->do(" CALL RUN( 'populate.sql') ");

If the file contains SELECT statements, the RUN() function may be used anywhere a table name may be used. For example, if you have a file called "query.sql" containing "SELECT * FROM Employee", then these two lines are exactly the same:

my $sth = $dbh->prepare(" SELECT * FROM Employee ");

my $sth = $dbh->prepare(" SELECT * FROM RUN( 'query.sql' ) ");

If the file contains a statement with placeholders, the values for the placeholders can be passed in the call to $sth->execute() as normal. If the query.sql file contains "SELECT id,name FROM x WHERE id=?", then these two are the same:

   my $sth = $dbh->prepare(" SELECT id,name FROM x WHERE id=?");
   $sth->execute(64);

   my $sth = $dbh->prepare(" SELECT * FROM RUN( 'query.sql' ) ");
   $sth->execute(64);

Note: This function assumes that the SQL statements in the file are separated by a semi-colon+newline combination (/;\n/). If you wish to use different separators or import SQL from a different source, just override the RUN() function with your own user-defined function.
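
To make the separator rule concrete, here is a small Perl sketch that splits a string of SQL on the semi-colon+newline combination. It only illustrates the /;\n/ convention described above and is not the RUN() implementation; split_statements is a hypothetical name. Note that a semi-colon inside a statement is left alone as long as it is not followed directly by a newline.

```perl
use strict;
use warnings;

# Hypothetical helper: split SQL text into statements on ";\n",
# dropping pieces that contain only whitespace.
sub split_statements {
    my ($text) = @_;
    return grep { /\S/ } split /;\n/, $text;
}

my $sql_text = "CREATE TABLE x (id INT, name CHAR(10));\n"
             . "INSERT INTO x VALUES (1, 'a;b');\n";

my @statements = split_statements($sql_text);
print scalar(@statements), " statements\n";
print "  $_\n" for @statements;
```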

Further details

Integers, Reals: Syntax obvious
Strings: Surrounded by single quotes; some characters need to be escaped with a backslash, in particular the backslash itself (\\), the NUL byte (\0), line feeds (\n), carriage returns (\r), and the quotes (\'). Note: "Strings" surrounded by double quotes are recognized as quoted identifiers (column or table names).
Parameters: Parameters represent scalar values, like Integers, Reals and Strings do. However, their values are read inside Execute() and not inside Prepare(). Parameters are represented by question marks (?).
Identifiers: Identifiers are table or column names. Syntactically they consist of alphabetic characters, followed by an arbitrary number of alphanumeric characters. Identifiers like SELECT, INSERT, INTO, ORDER, BY, WHERE, ... are forbidden and reserved for other tokens. Identifiers are always compared case-insensitively, i.e. "select foo from bar" will be evaluated the same as "SELECT FOO FROM BAR" ("FOO" will be evaluated as "foo", similar for "BAR"). Since SQL::Statement internally uses lower cased identifiers (unquoted), every time a wildcard is used, the delivered names of the identifiers are lower cased.
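
The string escaping rules listed above can be sketched as a small Perl routine. This is an illustration of those rules only; quote_literal is a made-up helper, and in real code passing values as parameters (?) avoids the need to escape by hand.

```perl
use strict;
use warnings;

# Hypothetical helper: wrap a value in single quotes, escaping the backslash,
# NUL byte, line feed, carriage return and single quote with a backslash.
sub quote_literal {
    my ($s) = @_;
    my %esc = (
        "\\" => '\\\\',
        "\0" => '\\0',
        "\n" => '\\n',
        "\r" => '\\r',
        "'"  => "\\'",
    );
    $s =~ s/([\\\0\n\r'])/$esc{$1}/g;
    return "'$s'";
}

print quote_literal("it's"), "\n";
print quote_literal("two\nlines"), "\n";
```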

Extending SQL syntax using SQL

The Supported SQL syntax shown above is the default for SQL::Statement, but it can be extended (or contracted) either on-the-fly or on a permanent basis. In other words, you can modify the SQL syntax accepted as valid by the parser and accepted as executable by the executer. There are two methods for extending the syntax: 1) with SQL commands that can be issued directly in SQL::Statement or from a DBD, or 2) by subclassing SQL::Parser.

The following SQL commands modify the default SQL syntax:

   CREATE/DROP FUNCTION
   CREATE/DROP KEYWORD
   CREATE/DROP TYPE
   CREATE/DROP OPERATOR

A simple example would be a situation in which you have a table named 'TABLE'. Since table is an ANSI reserved key word, by default SQL::Statement will produce an error when you attempt to create or access it. You could put the table name inside double quotes, since quoted identifiers can validly be reserved words, or you could rename the table. If neither of those is an option, you would do this:

DROP KEYWORD table

Once that statement is issued, the parser will no longer object to 'table' as a table name. Careful though, if you drop too many keywords you may confuse the parser, especially keywords like FROM and WHERE that are central to parsing the statement.

In the reverse situation, suppose you want to parse some SQL that defines a column as type BIG_BLOB. Since 'BIG_BLOB' isn't a recognized ANSI data type, an error will be produced by default. To make the parser treat it as a valid data type, you do this:

CREATE TYPE big_blob

Keywords and types are case-insensitive.

Suppose you are working with some SQL that contains the cosh() function (an Oracle function for hyperbolic cosine, whatever that is :-). The cosh() function is not currently implemented in SQL::Statement so the parser would die with an error. But you can easily trick the parser into accepting the function:

CREATE FUNCTION cosh

Once the parser has read that CREATE FUNCTION statement, it will no longer object to the use of the cosh() function in SQL statements.

If your only interest is in parsing SQL statements, then "CREATE FUNCTION cosh" is sufficient. But if you actually want to be able to use the cosh() function in executable statements, you need to supply a perl subroutine that performs the cosh() function:

CREATE FUNCTION cosh NAME perl_subroutine_name

The subroutine name can refer to a subroutine in your current script, or to a subroutine in any available package. See SQL::Statement::Functions for details of how to create and load functions.
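
As a rough idea of what such a subroutine computes, here is a plain Perl version of the hyperbolic cosine. The name cosh_sub is invented for this example, and the sketch takes only the numeric argument; the arguments SQL::Statement actually passes to user-defined functions are described in SQL::Statement::Functions and may require adjusting the parameter list.

```perl
use strict;
use warnings;

# Hypothetical subroutine name; computes the hyperbolic cosine of $x
# from its definition, cosh(x) = (e**x + e**-x) / 2.
sub cosh_sub {
    my ($x) = @_;
    return ( exp($x) + exp(-$x) ) / 2;
}

printf "%.4f\n", cosh_sub(0);
```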

Functions can be used as predicates in search clauses, for example:

SELECT * FROM x WHERE c1=7 AND SOUNDEX(c3,'foo') AND c8='bar'

In the SQL above, the "SOUNDEX()" function is a full predicate; it plays the same role as "c1=7" or "c8='bar'".

Functions can also serve as predicate operators. An operator, unlike a full predicate, has something on the left and right sides. An equal sign is an operator, and so is LIKE. If you really want to, you can get the parser to not accept LIKE as an operator with

DROP OPERATOR like

Or, you can invent your own operator. Suppose you have an operator "REVERSE_OF" that is true if the string on its left side when reversed is equal to the string on the right side:

   CREATE OPERATOR reverse_of
   SELECT * FROM x WHERE c1=7 AND c3 REVERSE_OF 'foo'

The operator could just as well have been written as a function:

   CREATE FUNCTION reverse_of
   SELECT * FROM x WHERE c1=7 AND REVERSE_OF(c3,'foo')

Like functions, if you want to actually execute a user-defined operator as distinct from just parsing it, you need to assign the operator to a perl subroutine. This is done exactly like assigning functions:

CREATE OPERATOR reverse_of NAME perl_subroutine_name
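
The comparison behind REVERSE_OF could be written in Perl roughly as follows. This is a sketch of the logic only, taking just the left and right values; how SQL::Statement hands arguments to the subroutine is documented in SQL::Statement::Functions, so the parameter list here is an assumption made for illustration.

```perl
use strict;
use warnings;

# Hypothetical subroutine: true (1) if the left string, reversed,
# equals the right string; false (0) otherwise or if either is undefined.
sub reverse_of {
    my ($left, $right) = @_;
    return 0 unless defined $left && defined $right;
    return scalar( reverse $left ) eq $right ? 1 : 0;
}

print reverse_of('oof', 'foo'), "\n";
print reverse_of('bar', 'foo'), "\n";
```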

Extending SQL syntax using subclasses

In addition to using the SQL shown above to modify the parser's behavior, you can also extend the SQL syntax by subclassing SQL::Parser. See SQL::Parser for details.

AUTHOR & COPYRIGHT

This document may be freely modified and distributed under the same terms as Perl itself.
