SYNOPSIS

vera++ [options] [file ...]

Introduction

Vera++ is a programmable tool for verification, analysis and transformation of C++ source code.

The main usage scenarios that are foreseen for Vera++ are:

\[bu]

Ensure that the source code complies with the given coding standards and conventions.

\[bu]

Provide source code metrics and statistics.

\[bu]

Perform automated transformations of the source code, which can range from pretty-printing to diagnostics to fault injection and advanced testing.

The main design idea of Vera++ is to create a generic engine that parses the C++ code and presents it, in the form of collections of various objects, to user-provided scripts that define the concrete actions to be executed.

Currently the following object collections are provided:

\[bu]

Collection of source file names.

\[bu]

Collection of source lines for each file.

\[bu]

Collection of identified tokens in each file.

Note: It is foreseen that future versions of Vera++ will also provide a semantic view of the code.

The most important feature of Vera++ is that all activities other than code parsing are defined by scripts. This means that Vera++ is flexible and extensible.

For example, compliance with coding standards can be expressed in terms of rules, each being defined by a separate script. The scripts can access all collections listed above and perform actions related to the given rule. The user can ask to run any given script or some defined set of scripts in a single program execution.

As a simple example, a coding convention that limits the length of the source line can be implemented as a script that traverses the collection of files and the collection of source lines and checks whether each source line fits within the given limits. A report can be generated for each non-conforming line of code so that the user gets clear information about where the problem is located.

All existing rules present their reports in a format that is compatible with regular compiler output, so that it is easy to integrate Vera++ with an existing build framework.
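The line-length example above can be sketched outside Vera++ as a small C++ function. This is only an illustration of the idea and not how Vera++ implements it (real rules are Tcl scripts). The function name checkLineLength, the message text and the exact file:line: message layout are assumptions made for this sketch.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not part of Vera++): returns one report for each
// line longer than maxLength, in an assumed "file:line: message" layout
// that imitates common compiler output.
std::vector<std::string> checkLineLength(const std::string& fileName,
                                         const std::vector<std::string>& lines,
                                         std::size_t maxLength)
{
    std::vector<std::string> reports;
    for (std::size_t i = 0; i != lines.size(); ++i)
    {
        if (lines[i].size() > maxLength)
        {
            // Line numbers in the reports are 1-based.
            reports.push_back(fileName + ":" + std::to_string(i + 1) +
                              ": line is longer than " +
                              std::to_string(maxLength) + " characters");
        }
    }
    return reports;
}
```

Calling the function with the lines of one file and a limit returns an empty vector when every line fits within the limit.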

Similarly, automated transformation procedures are implemented as separate scripts that scan the above collections and produce new source files according to their algorithms. A simple example of such a transformation might be a script that removes empty lines from source code.

The Tcl programming language is currently supported for scripts that run within Vera++.

Running Vera++

Vera++ needs to know where the rules and transformation scripts are located. The following rules are applied:

\[bu]

If the --root option is used, its argument is used as the name of the directory where the scripts subdirectory with scripts should be located, otherwise

\[bu]

If the VERA_ROOT environment variable is defined, it is used as the name of the directory where the scripts subdirectory with scripts should be located, otherwise

\[bu]

If the HOME environment variable is defined, then the ~/.vera++ directory is used (and it should contain the scripts subdirectory with scripts), otherwise

\[bu]

The current directory should contain the scripts subdirectory.
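The lookup order above can be summarized with the following C++ sketch. The function name resolveVeraRoot is hypothetical and the code only mirrors the four steps as described. It is not taken from the Vera++ sources.

```cpp
#include <string>

// Hypothetical helper (not part of Vera++). Each argument is a null
// pointer when the corresponding option or environment variable is not
// set, which matches what std::getenv returns for a missing variable.
std::string resolveVeraRoot(const char* rootOption,
                            const char* veraRootEnv,
                            const char* homeEnv)
{
    if (rootOption != nullptr)
    {
        return rootOption;                        // 1. --root
    }
    if (veraRootEnv != nullptr)
    {
        return veraRootEnv;                       // 2. VERA_ROOT
    }
    if (homeEnv != nullptr)
    {
        return std::string(homeEnv) + "/.vera++"; // 3. ~/.vera++
    }
    return ".";                                   // 4. current directory
}
```

In each case the returned directory is expected to contain the scripts subdirectory.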

Options

Vera++ recognizes the following parameters:

-

(a single minus) indicates that the source code to check will be provided on the standard input.

-p --profile profilename

instructs the program to execute all rules defined in the given profile. The profile name is the name of a file found under the profiles directory; the content of this file is a Tcl script that must set a rules variable to the list of all rules that are part of the profile. An example profile definition that groups three rules (L001, L002 and L003) might look like:

  • set rules {
        L001
        L002
        L003
    }
    
    

There is always a default profile that lists all existing rules - it is used when no profile is named explicitly.

-R --rule rulename

instructs the program to execute the given rule; note that the name of the rule should not contain the file extension of the script implementing the rule - this is added automatically, so that for example --rule my_rule means that Vera++ will find the my_rule.tcl script and will run it.

--transform transformationname

instructs the program to execute a single named transformation; the naming scheme is the same as for the --rule option.

-o --std-report filename

writes the standard (gcc-like) report to this file. A single dash - means that the standard output or the error output will be used, depending on the usage of the --warning or --error option. This option may be used several times in order to produce the reports in several locations - for example on the standard output and in a file. Default value is -.

-v --vc-report filename

writes the Visual C report to this file. A single dash - means that the standard output or the error output will be used, depending on the usage of the --warning or --error option. This option may be used several times in order to produce the reports in several locations - for example on the standard output and in a file. This report is not produced by default.

-x --xml-report filename

writes the XML report to this file. A single dash - means that the standard output or the error output will be used, depending on the usage of the --warning or --error option. This option may be used several times in order to produce the reports in several locations - for example on the standard output and in a file. This report is not produced by default.

-c --checkstyle-report filename

writes the checkstyle report to this file. A single dash - means that the standard output or the error output will be used, depending on the usage of the --warning or --error option. This option may be used several times in order to produce the reports in several locations - for example on the standard output and in a file. This report is not produced by default.

-s --show-rule

includes the name of the rule in each report line.

-d --no-duplicate

instructs the program to omit duplicated messages in the final report (the duplicates can be a result of violating the same rule many times in the same line of source code).

-w --warning

reports are marked as warning and generated on the error output.

-e --error

reports are marked as error and generated on the error output. A non-zero exit code is used when one or more reports are generated.

-q --quiet

does not display the reports. This option is best used with --summary and/or with --error.

-S --summary

displays the number of reports and the number of processed files.

--parameters filename

instructs the program to read parameter values from the given file; each parameter association should be placed in a separate line of this file. This option may be used several times.

-P --parameter parameterassociation

provides the value of the named parameter to the scripts (see the documentation of each script to see whether it recognizes any parameters); the parameter association has the form name=value.
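As an illustration of the name=value form, the following C++ sketch splits a parameter association into its two parts. The function name parseParameter is hypothetical, and splitting at the first equals sign and rejecting an empty name are assumptions. The source does not specify how Vera++ treats malformed associations.

```cpp
#include <string>

// Hypothetical helper (not part of Vera++): splits "name=value" at the
// first '='. Returns false when there is no '=' or the name is empty;
// both choices are assumptions made for this sketch.
bool parseParameter(const std::string& association,
                    std::string& name, std::string& value)
{
    const std::string::size_type pos = association.find('=');
    if (pos == std::string::npos || pos == 0)
    {
        return false;
    }
    name = association.substr(0, pos);
    value = association.substr(pos + 1);
    return true;
}
```

For the association max-line-length=78 this yields the name max-line-length and the value 78.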

--exclusions exclusionsfilename

instructs the program to exclude some source files from rule checks, as described in the given file; the content of this file is a Tcl script that must set a ruleExclusions array, where keys are rule names and values are lists of files to omit for the given rule. For example:

  • set ruleExclusions(L002) {
        some_file.cpp
    }
    
    set ruleExclusions(T005) {
        some_file.cpp
        some_other_file.cpp
    }
    
    

Note that the given file names are compared for exact match with the source file names that are provided as parameters to Vera++. This means that links in paths are not resolved for comparison purposes. This option may be used several times.
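The consequence of the exact-match comparison can be sketched in C++ as follows. The type RuleExclusions and the function isExcluded are hypothetical names that model the ruleExclusions array. They are not part of Vera++.

```cpp
#include <algorithm>
#include <map>
#include <string>
#include <vector>

// Hypothetical model of the ruleExclusions array: rule name -> file list.
typedef std::map<std::string, std::vector<std::string> > RuleExclusions;

// A file is excluded for a rule only when its name, exactly as given on
// the command line, appears in the list of that rule. No path
// normalization is attempted.
bool isExcluded(const RuleExclusions& exclusions,
                const std::string& rule, const std::string& fileName)
{
    const RuleExclusions::const_iterator it = exclusions.find(rule);
    if (it == exclusions.end())
    {
        return false;
    }
    const std::vector<std::string>& files = it->second;
    return std::find(files.begin(), files.end(), fileName) != files.end();
}
```

With this model, some_file.cpp listed for L002 does not exclude ./some_file.cpp, because the two strings differ even though they name the same file.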

-i --inputs filename

the inputs are read from that file. A single dash - means that the files to check will be read from the standard input. This option may be used several times.

-r --root path

uses the given path as the vera++ root directory.

--version

prints the program version information and exits.

-h --help

prints the list of recognized options and exits.

--

(a double dash) indicates that no further arguments are interpreted as options.

Arguments that do not start with a dash - are treated as source files to check. Files whose names start with a dash can be checked by prefixing them with the current directory shortcut ./.

When no input file is provided either as an argument or with the --inputs option, the list of source file names is read from the standard input.

Examples of executing Vera++ with rules

To execute all default verification rules against the file file.cpp, run:

  • vera++ file.cpp
    
    

To execute only rule L001 (this rule ensures that there is no trailing whitespace in each source line) against the same file, run:

  • vera++ -R L001 file.cpp
    
    

To execute rule L004 (this rule checks for too long source lines) with the parameter value providing 78 as the maximum line length, run:

  • vera++ -R L004 -P max-line-length=78 file.cpp
    
    

To execute all rules from your favorite profile (assuming that the my_favorite profile definition is stored in the profiles directory) against all header files in the current filesystem subtree, run:

  • find . -name \[aq]*.h\[aq] | vera++ --profile my_favorite
    
    

Note: Vera++ collects the reports generated by each rule and prints them, sorted, after all rules have been executed. If there were no problem reports, the output of the program is empty.

Note: Vera++ reports are generated on the standard output by default, making them easy to use with a pipe. The --warning and --error options change the output to the standard error. The options --std-report, --vc-report, --xml-report and --quiet may be used to disable the output to the standard or error output.

Examples of executing Vera++ with transformations

To execute the trim_right source code transformation (it removes the trailing whitespace that the rule L001 above complained about) on all .cpp files in the current directory run:

  • vera++ --transform trim_right *.cpp
    
    

As a result, each .cpp file will be backed up with the additional extension .bak and the files will be trimmed by removing trailing whitespace. The exact behavior is defined by the script named trim_right.tcl in the scripts/transformations directory.

Running Vera++ as a test with CMake

CMake offers the possibility to run tests that are considered to pass when they return a 0 value and to fail otherwise. Fortunately, vera++, when used with the --error option, has exactly this behavior. Creating the test is just a matter of listing the sources to check:

  • file(GLOB_RECURSE srcs
      ${CMAKE_SOURCE_DIR}/src/*.cpp
      ${CMAKE_SOURCE_DIR}/src/*.h)
    add_test(NAME VeraStyle
      COMMAND vera++
      --error
      ${srcs})
    
    

Running Vera++ during the build with CMake

Running vera++ in a test integrates quite badly with the IDEs or with CDash (http://cdash.org): the reports are hidden in the test log, and it is not easy to look at the problematic code. Moreover, a failure in the coding style is not the same as a failure in a unit or functional test, and shouldn\[aq]t appear in the same way. Another option is to run vera++ during the build and make it generate warnings that are well interpreted by the IDEs and CDash. In QtCreator for instance, it is then possible to click on the warning to go to the problematic code.

Running vera++ during the build can be done in a similar way to the previous section, by replacing the add_test() call with an add_custom_target() call that will run the style check every time the custom target is built.

  • file(GLOB_RECURSE srcs
      ${CMAKE_SOURCE_DIR}/src/*.cpp
      ${CMAKE_SOURCE_DIR}/src/*.h)
    add_custom_target(VeraStyle ALL
      vera++
      --warning
      ${srcs})
    
    

For large projects, running the style check every time can be quite time consuming and uncomfortable for the developer. It is then more convenient to split the style check into several parts that can be run in parallel, and to avoid rerunning the check if the files to check have not been modified. A vera++ macro is available to do that very easily:

  • find_package(vera++)
    include(${VERA++_USE_FILE})
    add_vera_targets(*.h *.cpp
      RECURSE
      ROOT "${CMAKE_SOURCE_DIR}")
    
    

This macro adds a new style_reports target that is run every time a source file is modified. A style target is still available to force the style check. The target names can be configured with the parameters NAME and NAME_ALL. This macro is the recommended way to use vera++ with CMake.

Backward compatibility with vera++ 1.1

Vera++ is still mostly compatible with the vera++ 1.1 command line interface, but this feature is planned for removal and its usage is not recommended.

Vera++ tries to detect if the old command line style is used by searching for the old options in the arguments. If no old style option is found, vera++ uses the new command line parser.

The command line style can be forced to the old style by setting the environment variable VERA_LEGACY to on, true or 1. Any other value will force vera++ to use the new command line style.
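The effect of VERA_LEGACY described above can be sketched in C++ as follows. The enumeration and function names are hypothetical, and the case-sensitive comparison is an assumption, since the source does not say whether values such as ON or True are accepted.

```cpp
#include <string>

// Hypothetical names (not part of Vera++).
enum CommandLineStyle { AutoDetect, OldStyle, NewStyle };

// value is the content of VERA_LEGACY, or a null pointer when the
// variable is not set (as returned by std::getenv).
CommandLineStyle styleFromEnvironment(const char* value)
{
    if (value == nullptr)
    {
        return AutoDetect; // fall back to searching for old-style options
    }
    const std::string v(value);
    if (v == "on" || v == "true" || v == "1")
    {
        return OldStyle;   // comparison assumed to be case-sensitive
    }
    return NewStyle;       // any other value forces the new parser
}
```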

Note: the behavior of vera++ is not backward compatible with vera++ 1.1 when no option is passed to vera++ and VERA_LEGACY is not set:

\[bu]

the reports are generated on the standard output instead of the error output;

\[bu]

a single dash - means that the source code to check is read from the standard input instead of reading the list of files to check;

\[bu]

the lack of input files makes vera++ read the standard input instead of generating an error.

Rules

F001 Source files should not use the \[aq]\r\[aq] (CR) character

As a commonly accepted practice, line breaks are denoted by a single \[aq]\n\[aq] (LF) character or by two characters "\r\n" (CRLF). A single appearance of \[aq]\r\[aq] (CR) is discouraged.

Compliance: Boost
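The distinction made by F001 between CRLF line breaks and a single CR can be sketched in C++ as follows. The function name is hypothetical and the code follows the description above rather than the actual rule script.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not part of Vera++): true when the text contains
// a '\r' that is not immediately followed by '\n'.
bool hasLoneCarriageReturn(const std::string& text)
{
    for (std::size_t i = 0; i != text.size(); ++i)
    {
        if (text[i] == '\r' &&
            (i + 1 == text.size() || text[i + 1] != '\n'))
        {
            return true;
        }
    }
    return false;
}
```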

F002 File names should be well-formed

The source file names should be well-formed in the sense of their allowed maximum length and directory depth. Directory and file names should start with an alphabetic character or underscore. In addition, directory names should not contain dots and file names can have only one dot.

Recognized parameters:

  • Name                    Default   Description
    ----------------------- --------- -------------------------------------------------
    max-directory-depth     8         Maximum depth of the directory structure.
    max-dirname-length      31        Maximum length of the directory path component.
    max-filename-length     31        Maximum length of the leaf file name.
    max-path-length         100       Maximum length of the full path.
    
    

Compliance: Boost

L001 No trailing whitespace

Trailing whitespace is any whitespace character (space or tab) that is placed at the end of the source line, after other characters or alone.

The presence of trailing whitespace artificially influences some source code metrics and is therefore discouraged.

As a special case, trailing whitespace in otherwise empty lines is allowed provided that the amount of whitespace is identical to the indent in the previous line - this exception is more friendly to less smart editors, but can be switched off by setting a non-zero value for the strict-trailing-space parameter.

Recognized parameters:

  • Name                      Default   Description
    ------------------------- --------- --------------------------------------
    strict-trailing-space     0         Strict mode for trailing whitespace.
    
    

Compliance: Inspirel
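The special case described for L001 can be sketched in C++ as follows. The function names are hypothetical, and the treatment of the first line of a file (an empty previous line) is an assumption. The actual rule script may differ in such details.

```cpp
#include <cstddef>
#include <string>

// Number of leading space or tab characters in a line.
std::size_t indentOf(const std::string& line)
{
    const std::string::size_type pos = line.find_first_not_of(" \t");
    return pos == std::string::npos ? line.size() : pos;
}

// Hypothetical helper (not part of Vera++): true when the line violates
// the rule. previous is the preceding source line, assumed to be empty
// for the first line of a file.
bool hasForbiddenTrailingSpace(const std::string& line,
                               const std::string& previous, bool strict)
{
    if (line.empty())
    {
        return false;
    }
    const char last = line[line.size() - 1];
    if (last != ' ' && last != '\t')
    {
        return false; // no trailing whitespace at all
    }
    const bool onlyWhitespace = indentOf(line) == line.size();
    if (!strict && onlyWhitespace && line.size() == indentOf(previous))
    {
        return false; // tolerated: same width as the previous indent
    }
    return true;
}
```

The strict argument corresponds to a non-zero value of the strict-trailing-space parameter.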

L002 Don\[aq]t use tab characters

Horizontal tabs are not consistently handled by editors and tools. Avoiding them ensures that the intended formatting of the code is preserved.

Compliance: HICPP, JSF

L003 No leading and no trailing empty lines

Leading and trailing empty lines confuse users of various tools (like head and tail) and artificially influence some source code metrics.

Compliance: Inspirel

L004 Line cannot be too long

The source code line should not exceed some reasonable length.

Recognized parameters:

  • Name                Default   Description
    ------------------- --------- -------------------------------------
    max-line-length     100       Maximum length of source code line.
    
    

Compliance: Inspirel

L005 There should not be too many consecutive empty lines

Empty lines (if any) help to introduce more "light" into the source code, but they should not be overused: too many consecutive empty lines make the code harder to follow.

Lines containing only whitespace are considered to be empty in this context.

Recognized parameters:

  • Name                            Default   Description
    ------------------------------- --------- --------------------------------------------
    max-consecutive-empty-lines     2         Maximum number of consecutive empty lines.
    
    

Compliance: Inspirel
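The counting performed by a rule like L005 can be sketched in C++ as follows. The function name is hypothetical, and reporting only the line at which a run first exceeds the limit is a choice made for this sketch. The source does not state where the actual rule places its report.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not part of Vera++): returns the 1-based numbers
// of the lines at which a run of empty lines first exceeds maxEmpty.
// Lines containing only spaces or tabs count as empty.
std::vector<std::size_t> tooManyEmptyLines(
    const std::vector<std::string>& lines, std::size_t maxEmpty)
{
    std::vector<std::size_t> result;
    std::size_t run = 0;
    for (std::size_t i = 0; i != lines.size(); ++i)
    {
        const bool empty =
            lines[i].find_first_not_of(" \t") == std::string::npos;
        run = empty ? run + 1 : 0;
        if (run == maxEmpty + 1)
        {
            result.push_back(i + 1);
        }
    }
    return result;
}
```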

L006 Source file should not be too long

The source file should not exceed a reasonable length.

Long source files can indicate an opportunity for refactoring.

Recognized parameters:

  • Name                Default   Description
    ------------------- --------- ------------------------------------
    max-file-length     2000      Maximum number of lines in a file.
    
    

Compliance: Inspirel

T001 One-line comments should not have forced continuation

The one-line comment is a comment that starts with //.

The usual intent is to let the comment continue till the end of the line, but the preprocessing rules of the language allow the comment to continue in the next line if line-splicing is forced with a backslash at the end of the line:

  • void foo()
    {
        // this comment is continued in the next line \
        exit(0);
    }
    
    

It is not immediately obvious what happens in this example. Moreover, the line-splicing works only if the backslash is really the last character in the line - which is error prone because any whitespace characters that might appear after the backslash will change the meaning of the program without being visible in the code.

Compliance: Inspirel
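A very rough line-based approximation of the T001 check can be written in C++ as follows. The function name is hypothetical and the sketch is deliberately simplistic: it does not recognize // inside string literals or block comments, which a token-based rule can do.

```cpp
#include <string>

// Hypothetical helper (not part of Vera++): true when a physical line
// contains "//" and its last character is a backslash.
bool hasContinuedLineComment(const std::string& line)
{
    return line.find("//") != std::string::npos &&
           !line.empty() && line[line.size() - 1] == '\\';
}
```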

T002 Reserved names should not be used for preprocessor macros

The C++ Standard reserves some forms of names for language implementations. One of the most frequent violations is the definition of a preprocessor macro whose name begins with an underscore followed by a capital letter or contains two consecutive underscores:

  • #define _MY_MACRO something
    #define MY__MACRO something
    
    

Even though the majority of known compilers use more obscure names for internal purposes and the above code is not likely to cause any significant problems, all such names are formally reserved and therefore should not be used.

Apart from the use of underscore in macro names, preprocessor macros should not be used to redefine language keywords:

  • #define private public
    #define const
    
    

Compliance: ISO
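The two reserved name forms mentioned for T002 can be tested with a C++ sketch like the following. The function name is hypothetical, and the sketch covers only the underscore patterns, not the redefinition of language keywords.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper (not part of Vera++): true when the name begins
// with an underscore followed by an uppercase letter, or contains two
// consecutive underscores.
bool isReservedMacroName(const std::string& name)
{
    if (name.size() >= 2 && name[0] == '_' &&
        std::isupper(static_cast<unsigned char>(name[1])))
    {
        return true;
    }
    return name.find("__") != std::string::npos;
}
```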

T003 Some keywords should be followed by a single space

Keywords from the following list:

\[bu]

case

\[bu]

class

\[bu]

delete

\[bu]

enum

\[bu]

explicit

\[bu]

extern

\[bu]

goto

\[bu]

new

\[bu]

struct

\[bu]

union

\[bu]

using

should be followed by a single space for better readability.

Compliance: Inspirel

T004 Some keywords should be immediately followed by a colon

Keywords from the following list:

\[bu]

default

\[bu]

private

\[bu]

protected

\[bu]

public

should be immediately followed by a colon, unless used in the list of base classes:

  • class A : public B, private C
    {
    public:
         A();
         ~A();
    protected:
         // ...
    private:
         // ...
    };
    
    void fun(int a)
    {
         switch (a)
         {
         // ...
         default:
              exit(0);
         }
    }
    
    

Compliance: Inspirel

T005 Keywords break and continue should be immediately followed by a semicolon

The break and continue keywords should be immediately followed by a semicolon, with no other tokens in between:

  • while (...)
    {
         if (...)
         {
              break;
         }
         if (...)
         {
              continue;
         }
         // ...
    }
    
    

Compliance: Inspirel

T006 Keywords return and throw should be immediately followed by a semicolon or a single space

The return and throw keywords should be immediately followed by a semicolon or a single space:

  • void fun()
    {
         if (...)
         {
              return;
         }
         // ...
    }
    
    int add(int a, int b)
    {
         return a + b;
    }
    
    

An exception to this rule is allowed for exception specifications:

  • void fun() throw();
    
    

Compliance: Inspirel

T007 Semicolons should not be isolated by spaces or comments from the rest of the code

The semicolon should not stand isolated by whitespace or comments from the rest of the code.

  • int a ;     // bad
    int b
    ;           // bad
    int c;      // OK
    
    

As an exception from this rule, semicolons surrounded by spaces are allowed in for loops:

  • for ( ; ; ) // OK as an exception
    {
        // ...
    }
    
    

Compliance: Inspirel

T008 Keywords catch, for, if, switch and while should be followed by a single space

Keywords catch, for, if, switch and while should be followed by a single space and then an opening parenthesis:

  • catch (...)
    {
         for (int i = 0; i != 10; ++i)
         {
              if (foo(i))
              {
                   while (getline(cin, line))
                   {
                        switch (i % 3)
                        {
                        case 0:
                             bar(line);
                             break;
                        // ...
                        }
                   }
              }
         }
    }
    
    

Compliance: Inspirel

T009 Comma should not be preceded by whitespace, but should be followed by one

A comma, whether used as operator or in various lists, should not be preceded by whitespace on its left side, but should be followed by whitespace on its right side:

  • void fun(int x, int y, int z);
    int a[] = {5, 6, 7};
    class A : public B,
              public C
    {
         // ...
    };
    
    

An exception to this rule is allowed for operator,:

  • struct A {};
    void operator,(const A &left, const A &right);
    
    

Compliance: Inspirel

T010 Identifiers should not be composed of \[aq]l\[aq] and \[aq]O\[aq] characters only

The characters \[aq]l\[aq] (which is lowercase \[aq]L\[aq]) and \[aq]O\[aq] (which is uppercase \[aq]o\[aq]) should not be the only characters used in the identifier, because this would make them visually similar to numeric literals.

Compliance: Inspirel
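The condition checked by T010 is simple enough to express in one C++ function. The function name is hypothetical and the sketch is not taken from the rule script.

```cpp
#include <string>

// Hypothetical helper (not part of Vera++): true when a non-empty
// identifier consists only of the characters 'l' and 'O'.
bool isOnlyLAndO(const std::string& identifier)
{
    return !identifier.empty() &&
           identifier.find_first_not_of("lO") == std::string::npos;
}
```

Identifiers such as lOl or OO are flagged because they are easily misread as 101 or 00, while lot is accepted.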

T011 Curly brackets from the same pair should be either in the same line or in the same column

Corresponding curly brackets should be either in the same line or in the same column. This promotes clarity by emphasising scopes, but allows concise style of one-line definitions and empty blocks:

  • class MyException {};
    
    struct MyPair
    {
        int a;
        int b;
    };
    
    enum state { close, open };
    
    enum colors
    {
        black,
        red,
        green,
        blue,
        white
    };
    
    

Compliance: Inspirel
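The pairing logic behind T011 can be sketched in C++ with a stack of opening-bracket positions. The function name is hypothetical, and the sketch works on raw characters, so brackets inside comments, string literals and character literals are not skipped as they would be by a token-based rule.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper (not part of Vera++): returns the 1-based line
// numbers of closing brackets that are neither in the same line nor in
// the same column as their matching opening bracket.
std::vector<std::size_t> misalignedBraces(
    const std::vector<std::string>& lines)
{
    typedef std::pair<std::size_t, std::size_t> Position; // (line, column)
    std::vector<std::size_t> result;
    std::vector<Position> open;
    for (std::size_t l = 0; l != lines.size(); ++l)
    {
        for (std::size_t c = 0; c != lines[l].size(); ++c)
        {
            if (lines[l][c] == '{')
            {
                open.push_back(Position(l, c));
            }
            else if (lines[l][c] == '}' && !open.empty())
            {
                const Position opening = open.back();
                open.pop_back();
                if (opening.first != l && opening.second != c)
                {
                    result.push_back(l + 1);
                }
            }
        }
    }
    return result;
}
```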

T012 Negation operator should not be used in its short form

The negation operator (exclamation mark) reduces readability of the code due to its terseness. Prefer explicit logical comparisons or alternative tokens for increased readability:

  • if (!cond)         // error-prone
    if (cond == false) // better
    if (not cond)      // better (alternative keyword)
    
    

Compliance: Inspirel

T013 Source files should contain the copyright notice

The copyright notice is required by many coding standards and guidelines. In some countries every written artwork has some copyright, even if implicit. Prefer explicit notice to avoid any later confusion.

This rule verifies that at least one comment in the source file contains the "copyright" word.

Compliance: Boost

T014 Source files should refer to the Boost Software License

The Boost Software License should be referenced in the source code.

This rule verifies that at least one comment in the source file contains the "Boost Software License" phrase.

Note that this rule is very specific to the Boost libraries and those projects that choose to use the Boost license. It is therefore not part of the default profile.

Compliance: Boost

T015 HTML links in comments and string literals should be correct

The links embedded in comments and string literals should have correct form and should reference existing files.

Compliance: Boost

T016 Calls to min/max should be protected against accidental macro substitution

The calls to min and max functions should be protected against accidental macro substitution.

  • x = max(y, z); // wrong, vulnerable to accidental macro substitution
    
    x = (max)(y, z); // OK
    
    x = max BOOST_PREVENT_MACRO_SUBSTITUTION (y, z); // OK
    
    

Compliance: Boost
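The following C++ sketch shows why the parenthesized form addressed by T016 is safe. The macro is defined locally only to imitate the function-like min and max macros that some platform headers are known to provide. The function name largest is hypothetical.

```cpp
#include <algorithm>

// Defined here only for demonstration; it imitates a function-like
// macro that a platform header might introduce.
#define max(a, b) (((a) > (b)) ? (a) : (b))

int largest(int y, int z)
{
    // Writing std::max(y, z) here would be rewritten by the macro and
    // fail to compile. A function-like macro is expanded only when its
    // name is directly followed by an opening parenthesis, so the
    // parentheses around the name suppress the substitution.
    return (std::max)(y, z);
}

// Remove the demonstration macro again.
#undef max
```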

T017 Unnamed namespaces are not allowed in header files

Unnamed namespaces are not allowed in header files.

The typical use of unnamed namespace is to hide module-internal names from the outside world. Header files are physically concatenated in a single translation unit, which logically merges all namespaces with the same name. Unnamed namespaces are also merged in this process, which effectively undermines their initial purpose.

Use named namespaces in header files. Unnamed namespaces are allowed in implementation files only.

Compliance: Boost

T018 Using namespace is not allowed in header files

Using namespace directives are not allowed in header files.

The using namespace directive imports names from the given namespace and when used in a header file influences the global namespace of all the files that directly or indirectly include this header file.

It is possible to use the using namespace directive in a limited scope in a header file (for example in a template or inline function definition), but for the sake of consistency this is also discouraged.

Compliance: C++ Coding Standards

T019 Control structures should have complete curly-braced block of code

Control structures managed by for, if and while constructs can be associated with a single instruction or with a complex block of code. Standardizing on the curly-braced blocks in all cases allows one to avoid common pitfalls and makes the code visually more uniform.

  • if (x) foo();     // bad style
    if (x) { foo(); } // OK
    
    if (x)
        foo();        // again bad style
    
    if (x)
    {                 // OK
        foo();
    }
    
    if (x)
        while (y)     // bad style
            foo();    // bad style
    
    if (x)
    {                 // OK
        while (y)
        {             // OK
            foo();
        }
    }
    
    for (int i = 0; i != 10; ++i);  // oops!
        cout << "Hello\n";
    
    for (int i = 0; i != 10; ++i)   // OK
    {
        cout << "Hello\n";
    }
    
    

Compliance: Inspirel

Transformations

move_includes Change prefix of #include paths

This transformation allows one to modify the prefix of file paths in #include directives.

The motivation for this transformation is to help move whole libraries from one file tree to another.

Please use this transformation as a boilerplate for your own customized version.

For example, the following file:

  • #include "boost/shared_ptr.hpp"
    #include "boost/bind.hpp"
    
    

will be transformed into:

  • #include "boom/shared_ptr.hpp"
    #include "boom/bind.hpp"
    
    

Note: The transformation is performed in place, which means that the source files are modified.
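The prefix replacement performed on a single line can be sketched in C++ as follows. The function name moveInclude is hypothetical, and the sketch only handles quoted include paths that start at the beginning of the line. The actual transformation script may be more general.

```cpp
#include <string>

// Hypothetical helper (not part of Vera++): if the line is an #include
// of a quoted path that begins with oldPrefix, the prefix is replaced
// with newPrefix; any other line is returned unchanged.
std::string moveInclude(const std::string& line,
                        const std::string& oldPrefix,
                        const std::string& newPrefix)
{
    const std::string start = "#include \"" + oldPrefix;
    if (line.compare(0, start.size(), start) != 0)
    {
        return line;
    }
    return "#include \"" + newPrefix + line.substr(start.size());
}
```

Called with the prefixes boost/ and boom/, the function reproduces the example shown above line by line.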

move_macros Change prefix in macros

This transformation allows one to modify the prefix of macros.

The motivation for this transformation is to help move whole libraries or source sets from one naming convention to another.

Please use this transformation as a boilerplate for your own customized version.

For example, the following file:

  • #define BOOST_SOME_MACRO 1
    // ...
    #ifdef BOOST_SOME_MACRO
    // ...
    #endif
    
    

will be transformed into:

  • #define BOOM_SOME_MACRO 1
    // ...
    #ifdef BOOM_SOME_MACRO
    // ...
    #endif
    
    

Note: This transformation actually does not check whether the given identifier is indeed a macro name and the prefix replacement is performed systematically on all identifiers that match.

Note: The transformation is performed in place, which means that the source files are modified.

move_namespace Change namespace name

This transformation allows one to consistently change the namespace name.

The motivation for this transformation is to help move whole libraries or source sets from one namespace to another, for example to allow the coexistence of two different versions of the same library.

Please use this transformation as a boilerplate for your own customized version.

For example, the following file:

  • namespace boost
    {
    void foo();
    }
    
    void boost::foo() {/* ... */}
    
    

will be transformed into:

  • namespace boom
    {
    void foo();
    }
    
    void boom::foo() {/* ... */}
    
    

Note: This transformation actually does not check whether the given identifier is indeed a namespace name and the replacement is performed systematically on all identifiers that match. Do not use it on code that overloads namespace names for other purposes.

Note: The transformation is performed in place, which means that the source files are modified.

to_lower Change identifier naming convention from CamelCase to standard_lowercase

This transformation allows one to modify the naming convention of all identifiers from CamelCase to standard_lowercase, as used by the standard library or Boost.

For example, the following code:

  • namespace MyTools
    {
    
    class MyClass
    {
    public:
        void myFunction();
    };
    
    }
    
    

will be transformed into this:

  • namespace my_tools
    {
    
    class my_class
    {
    public:
        void my_function();
    };
    
    }
    
    

Note: The transformation is performed in place, which means that the source files are modified.

Note: This transformation does not modify comments and string literals.
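The renaming of a single identifier can be sketched in C++ as follows. The function name is hypothetical and the sketch inserts an underscore before every uppercase letter except the first character. How the actual transformation treats runs of uppercase letters or identifiers that already contain underscores is not specified here.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper (not part of Vera++): converts one identifier
// from CamelCase to standard_lowercase.
std::string toStandardLowercase(const std::string& identifier)
{
    std::string result;
    for (std::size_t i = 0; i != identifier.size(); ++i)
    {
        const unsigned char c =
            static_cast<unsigned char>(identifier[i]);
        if (std::isupper(c))
        {
            if (i != 0)
            {
                result += '_'; // word boundary inside the identifier
            }
            result += static_cast<char>(std::tolower(c));
        }
        else
        {
            result += identifier[i];
        }
    }
    return result;
}
```

Applied to MyTools, MyClass and myFunction, the function returns the names used in the example above.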

to_xml Transform C++ code into XML

This transformation generates an XML tree where nodes relate to C++ source code tokens.

For example, the following file (file.cpp):

  • #include <iostream>
    
    int main()
    {
        std::cout << "Hello World\n";
    }
    
    

will be transformed into a new file named file.cpp.xml:

  • <?xml version="1.0" encoding="ISO-8859-1"?>
    <cpp-source file-name="test.cpp">
        <token name="pp_hheader" line="1" column="0">#include &lt;iostream&gt;</token>
        <token name="newline" line="1" column="19"><![CDATA[
    ]]></token>
        <token name="newline" line="2" column="0"><![CDATA[
    ]]></token>
        <token name="int" line="3" column="0">int</token>
        <token name="space" line="3" column="3"> </token>
        <token name="identifier" line="3" column="4">main</token>
        <token name="leftparen" line="3" column="8">(</token>
        <token name="rightparen" line="3" column="9">)</token>
        <token name="newline" line="3" column="10"><![CDATA[
    ]]></token>
        <token name="leftbrace" line="4" column="0">{</token>
        <token name="newline" line="4" column="1"><![CDATA[
    ]]></token>
        <token name="space" line="5" column="0">    </token>
        <token name="identifier" line="5" column="4">std</token>
        <token name="colon_colon" line="5" column="7">::</token>
        <token name="identifier" line="5" column="9">cout</token>
        <token name="space" line="5" column="13"> </token>
        <token name="shiftleft" line="5" column="14">&lt;&lt;</token>
        <token name="space" line="5" column="16"> </token>
        <token name="stringlit" line="5" column="17">"Hello World\n"</token>
        <token name="semicolon" line="5" column="32">;</token>
        <token name="newline" line="5" column="33"><![CDATA[
    ]]></token>
        <token name="rightbrace" line="6" column="0">}</token>
        <token name="newline" line="6" column="1"><![CDATA[
    ]]></token>
        <token name="eof" line="7" column="0"></token>
    </cpp-source>
    
    

Note: If the source code does not use line splicing, then the concatenation of all XML node values is equivalent to the original C++ code.

to_xml2 Transform C++ code into XML (another variant)

This transformation generates an XML tree whose nodes correspond to C++ source code tokens.

The difference between this version and the one named to_xml is that here the nodes are named after the token types, which can simplify further XML transformations.

For example, the following file (file.cpp):

  • #include <iostream>
    
    int main()
    {
        std::cout << "Hello World\n";
    }
    
    

will be transformed into a new file named file.cpp.xml:

  • <?xml version="1.0" encoding="ISO-8859-1"?>
    <cpp-source file-name="test.cpp">
        <pp_hheader line="1" column="0">#include <iostream></pp_hheader>
        <newline line="1" column="19">![CDATA[
    ]]</newline>
        <newline line="2" column="0">![CDATA[
    ]]</newline>
        <int line="3" column="0">int</int>
        <space line="3" column="3"> </space>
        <identifier line="3" column="4">main</identifier>
        <leftparen line="3" column="8">(</leftparen>
        <rightparen line="3" column="9">)</rightparen>
        <newline line="3" column="10">![CDATA[
    ]]</newline>
        <leftbrace line="4" column="0">{</leftbrace>
        <newline line="4" column="1">![CDATA[
    ]]</newline>
        <space line="5" column="0">    </space>
        <identifier line="5" column="4">std</identifier>
        <colon_colon line="5" column="7">::</colon_colon>
        <identifier line="5" column="9">cout</identifier>
        <space line="5" column="13"> </space>
        <shiftleft line="5" column="14"><<</shiftleft>
        <space line="5" column="16"> </space>
        <stringlit line="5" column="17">"Hello World\n"</stringlit>
        <semicolon line="5" column="32">;</semicolon>
        <newline line="5" column="33">![CDATA[
    ]]</newline>
        <rightbrace line="6" column="0">}</rightbrace>
        <newline line="6" column="1">![CDATA[
    ]]</newline>
        <eof line="7" column="0"></eof>
    </cpp-source>
    
    

Note: If the source code does not use line splicing, then the concatenation of all XML node values is equivalent to the original C++ code.

trim_right Remove trailing white space

This transformation removes the trailing whitespace from each line of code.

It can be treated as a quick remedy for problems reported by rule L001.

Note: The transformation is performed in place, which means that the source files are modified.
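
The kind of check that this transformation remedies can be sketched as a rule script built only from the Script API commands described in the next section. This is a sketch under stated assumptions, not the implementation of rule L001: the helper name hasTrailingWhitespace and the report message are hypothetical.

```tcl
# Hypothetical helper: true when the line ends with a space or a tab.
proc hasTrailingWhitespace {line} {
    return [regexp {[ \t]+$} $line]
}

# The Script API commands exist only inside Vera++, so the rule body is
# skipped when this file is sourced by a plain Tcl shell.
if {[llength [info commands getSourceFileNames]] > 0} {
    foreach f [getSourceFileNames] {
        set lineNumber 1
        foreach line [getAllLines $f] {
            if {[hasTrailingWhitespace $line]} {
                report $f $lineNumber "trailing whitespace"
            }
            incr lineNumber
        }
    }
}
```

Running such a rule first and trim_right afterwards leaves the decision to modify the source files with the user.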

Script API

The scripts (rules and transformations) are written in Tcl and are executed by the embedded interpreter, which has access to the relevant state of the program. A set of commands is provided to enable easy read-only access to the information that was gathered by parsing the given source files.

The following Tcl commands are provided:

\[bu]

getSourceFileNames - returns the list of file names that were provided to Vera++ as program parameters.

\[bu]

getLineCount fileName - returns the number of lines in the given source file.

\[bu]

getAllLines fileName - returns the list of lines, in their natural order, that form the given source file.

\[bu]

getLine fileName lineNumber - returns the selected line; line numbers are counted from 1.

\[bu]

getTokens fileName fromLine fromColumn toLine toColumn filter - returns the list of tokens from the given source file, in their natural order, that match the given selection criteria.

The meaning of arguments for selecting tokens is:

\[bu]

fromLine - the lowest line number (counted from 1), inclusive

\[bu]

fromColumn - the lowest column number (counted from 0), inclusive

\[bu]

toLine - the highest line number, inclusive; -1 means that the selected range spans to the end of the file

\[bu]

toColumn - the highest column number, exclusive; -1 means that the selected range spans to the end of the line defined by toLine.

\[bu]

filter - the list of selected token types (the recognized token types are listed below); if this list is empty, then all token types are allowed.

The getTokens command returns a list of lists - the nested lists have the following elements:

\[bu]

value - the literal text of the token

\[bu]

lineNumber - the line number (from 1) where the token appears

\[bu]

columnNumber - the column number (from 0) where the token appears

\[bu]

name - the name or type of the token; see below for the list of recognized token types

\[bu]

getParameter name defaultValue - returns the value of the given parameter or the provided default value if no such parameter is defined.

\[bu]

report fileName lineNumber message - registers a report for the given file and line; this report is printed at the end of the program execution, sorted by file and line number. Use this command to generate output that is compatible with the warning/error output format of popular compilers.

Examples:

To process all lines from all source files, use the following code pattern:

  • foreach fileName [getSourceFileNames] {
        foreach line [getAllLines $fileName] {
            # ...
        }
    }
    
    

To process all tokens from all source files, use:

  • foreach fileName [getSourceFileNames] {
        foreach token [getTokens $fileName 1 0 -1 -1 {}] {
            set tokenValue [lindex $token 0]
            set lineNumber [lindex $token 1]
            set columnNumber [lindex $token 2]
            set tokenType [lindex $token 3]
            # ...
        }
    }
    
    

To process only curly braces from the given source file, use:

  • foreach token [getTokens $fileName 1 0 -1 -1 {leftbrace rightbrace}] {
        # ...
    }
    
    

The following is the complete rule script that verifies that lines are no longer than some limit (the limit can be provided as a parameter, but the default value is defined by the script itself):

  • # Line cannot be too long
    
    set maxLength [getParameter "max-line-length" 100]
    
    foreach f [getSourceFileNames] {
        set lineNumber 1
        foreach line [getAllLines $f] {
            if {[string length $line] > $maxLength} {
                report $f $lineNumber "line is longer than ${maxLength} characters"
            }
            incr lineNumber
        }
    }
    
    

The above script is actually the implementation of rule L004.
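
The commands that the examples above do not use, getLineCount and getLine, as well as a getTokens call restricted to a single line, can be sketched as follows. The helper name checkFileStart, the parameter name "max-file-length", its default of 2000 and the report messages are assumptions made for this illustration only.

```tcl
# Hypothetical helper: checks the length and the first line of one file.
proc checkFileStart {f maxLines} {
    set lineCount [getLineCount $f]
    if {$lineCount > $maxLines} {
        report $f $lineCount "file has more than ${maxLines} lines"
    }
    if {$lineCount >= 1} {
        # Tokens of line 1 only: from line 1, column 0 to the end of line 1.
        set firstToken [lindex [getTokens $f 1 0 1 -1 {}] 0]
        set firstType [lindex $firstToken 3]
        if {$firstType ne "ccomment" && $firstType ne "cppcomment"} {
            report $f 1 "first line is not a comment: [getLine $f 1]"
        }
    }
}

# The Script API commands exist only inside Vera++, so the rule body is
# skipped when this file is sourced by a plain Tcl shell.
if {[llength [info commands getSourceFileNames]] > 0} {
    # "max-file-length" and its default value are assumptions of this sketch.
    set maxLines [getParameter "max-file-length" 2000]
    foreach f [getSourceFileNames] {
        checkFileStart $f $maxLines
    }
}
```

Passing 1 as toLine and -1 as toColumn limits the selection to the first line, because toLine is inclusive and -1 extends the column range to the end of that line.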

Notes about line splicing

As required by the C++ ISO standard, the line splicing (with the backslash at the end of the line) is performed before tokenizing. This means that the lists of tokens might not strictly fit the list of lines.

Due to the internal mechanisms of the parser, the line splicing freezes the line counter and forces the column counter to continue until the last line in the spliced block. This means that there might be physical non-empty lines that apparently don't have any tokens, as well as tokens that have column numbers not matching the physical source line lengths.
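
A script that relies on token positions may want to know where line splicing occurs. The following sketch reports every physical line that ends with a backslash; the helper name isSplicedLine and the report message are hypothetical, and the check deliberately ignores a backslash followed by trailing whitespace or a carriage return.

```tcl
# Hypothetical helper: true when the physical line ends with a backslash,
# that is, when it is spliced with the line that follows it.
proc isSplicedLine {line} {
    return [expr {[string index $line end] eq "\\"}]
}

# The Script API commands exist only inside Vera++, so the rule body is
# skipped when this file is sourced by a plain Tcl shell.
if {[llength [info commands getSourceFileNames]] > 0} {
    foreach f [getSourceFileNames] {
        set lineNumber 1
        foreach line [getAllLines $f] {
            if {[isSplicedLine $line]} {
                report $f $lineNumber "line splicing: token positions may not match this line"
            }
            incr lineNumber
        }
    }
}
```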

Recognized token types

The following token types are recognized by the parser and can be used for filter selection in the getTokens command (some of these token types are related to compiler extensions):

  • and
    andand
    andassign
    any
    arrow
    arrowstar
    asm
    assign
    auto
    bool
    break
    case
    catch
    ccomment
    char
    charlit
    class
    colon
    colon_colon
    comma
    compl
    const
    constcast
    continue
    contline
    cppcomment
    decimalint
    default
    delete
    divide
    divideassign
    do
    dot
    dotstar
    double
    dynamiccast
    ellipsis
    else
    enum
    eof
    eoi
    equal
    explicit
    export
    extern
    false
    float
    floatlit
    for
    friend
    goto
    greater
    greaterequal
    hexaint
    identifier
    if
    inline
    int
    intlit
    leftbrace
    leftbracket
    leftparen
    less
    lessequal
    long
    longintlit
    minus
    minusassign
    minusminus
    msext_asm
    msext_based
    msext_cdecl
    msext_declspec
    msext_endregion
    msext_except
    msext_fastcall
    msext_finally
    msext_inline
    msext_int16
    msext_int32
    msext_int64
    msext_int8
    msext_leave
    msext_region
    msext_stdcall
    msext_try
    mutable
    namespace
    new
    newline
    not
    notequal
    octalint
    operator
    or
    orassign
    oror
    percent
    percentassign
    plus
    plusassign
    plusplus
    pound
    pound_pound
    pp_define
    pp_elif
    pp_else
    pp_endif
    pp_error
    pp_hheader
    pp_if
    pp_ifdef
    pp_ifndef
    pp_include
    pp_line
    pp_number
    pp_pragma
    pp_qheader
    pp_undef
    pp_warning
    private
    protected
    public
    question_mark
    register
    reinterpretcast
    return
    rightbrace
    rightbracket
    rightparen
    semicolon
    shiftleft
    shiftleftassign
    shiftright
    shiftrightassign
    short
    signed
    sizeof
    space
    space2
    star
    starassign
    static
    staticcast
    stringlit
    struct
    switch
    template
    this
    throw
    true
    try
    typedef
    typeid
    typename
    union
    unsigned
    using
    virtual
    void
    volatile
    wchart
    while
    xor
    xorassign
    
    

Note

There is a predefined rule named DUMP that prints all tokens, with their types and positions, on the screen. This rule can be helpful as a guideline for creating custom filtering criteria:

  • vera++ --rule DUMP myfile.cpp
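
A comparable listing can be produced by a custom script that uses getTokens. The sketch below is not the implementation of DUMP and does not reproduce its output format; the helper name formatToken and the layout of each printed line are hypothetical.

```tcl
# Hypothetical helper: formats one element of the list returned by getTokens.
proc formatToken {fileName token} {
    set value        [lindex $token 0]
    set lineNumber   [lindex $token 1]
    set columnNumber [lindex $token 2]
    set type         [lindex $token 3]
    return "${fileName}:${lineNumber}:${columnNumber}: ${type} \"${value}\""
}

# The Script API commands exist only inside Vera++, so the rule body is
# skipped when this file is sourced by a plain Tcl shell.
if {[llength [info commands getSourceFileNames]] > 0} {
    foreach f [getSourceFileNames] {
        foreach token [getTokens $f 1 0 -1 -1 {}] {
            puts [formatToken $f $token]
        }
    }
}
```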
    
    

Changes

Vera++ 1.2.1

Vera++ 1.2.1 differs from 1.2.0 in the following ways:

\[bu]

BUGFIX: fix --inputs in order to be able to read the inputs from a file

Vera++ 1.2.0

Vera++ 1.2.0 differs from 1.1.2 in the following ways:

\[bu]

Full Tcl stack printed when a rule fails.

\[bu]

New command line interface that supports long and short options. The old style command line is still usable for backward compatibility.

\[bu]

Produce output to standard output by default so the output can easily be piped to another program. The options --warning and --error make vera++ produce its output on the error output.

\[bu]

CMake macros to easily run vera++ in any CMake project.

\[bu]

Easier integration in a test chain by returning an error code when at least one report is produced and the --error option is used. --quiet and --summary can also help to better integrate vera++ in the test chain.

\[bu]

The standard output format matches gcc's output format for a better integration in a build chain.

\[bu]

Can read the list of files to check from one or more files.

\[bu]

Can read the source code to check from the standard input.

\[bu]

Can write several reports in different formats and in different places.

\[bu]

Added --root option to point to the vera root directory from the command line and ease the usage of custom rules.

\[bu]

Reports can be produced in checkstyle (http://checkstyle.sourceforge.net/) XML format.

\[bu]

Vera++ no longer imposes the extension of the source files to check.

\[bu]

Several exclusion files can be used.

\[bu]

Several parameter files can be used.

\[bu]

Build system now uses CMake.

\[bu]

Builds with TCL 8.6.

\[bu]

Doesn't require Boost sources to build.

\[bu]

New documentation generation process to unify the wiki, the html doc and the manpage.

\[bu]

Binary packages for MS Windows and Mac OS X (and others).

\[bu]

Nightly tests to avoid regressions.

\[bu]

New website.

\[bu]

BUGFIX: the rule T019 now works properly with do ... while blocks.

Vera++ 1.1.2

Vera++ 1.1.2 differs from 1.1.1 in the following ways:

\[bu]

Added -xmlreport option.

Vera++ 1.1.1

Vera++ 1.1.1 differs from 1.1.0 in the following ways:

\[bu]

Added -help option.

\[bu]

Updated code for compatibility with newer versions of Boost. The reference version of the Boost library is now 1.35 or 1.36.

\[bu]

BUGFIX: Corrected handling of current directory when neither HOME nor VERA_ROOT is specified (this affects Windows users only).

Vera++ 1.1.0

Vera++ 1.1.0 differs from 1.0.0 in the following ways:

\[bu]

Updated rules:

\[bu]

T002: additionally recognizes redefinition (#define) of keywords

\[bu]

T009: recognizes comment adjacent to colon as an exception to the rule

\[bu]

Added rules:

\[bu]

F001: Source files should not use the \r (CR) character

\[bu]

F002: File names should be well-formed. Note: F002 is not part of the default profile.

\[bu]

T012: Negation operator should not be used in its short form

\[bu]

T013: Source files should contain the copyright notice

\[bu]

T014: Source files should refer to the Boost Software License. Note: T014 is not part of the default profile.

\[bu]

T015: HTML links in comments and string literals should be correct

\[bu]

T016: Calls to min/max should be protected against accidental macro substitution

\[bu]

T017: Unnamed namespaces are not allowed in header files

\[bu]

T018: Using namespace is not allowed in header files

\[bu]

T019: Control structures should have complete curly-braced block of code

\[bu]

Added predefined boost profile to emulate the original Boost inspect tool.

\[bu]

Added transformations:

\[bu]

move_namespace: Changes the given identifier, useful for moving the whole project from one namespace to another.

\[bu]

move_macros: Changes the given prefix in all identifiers, useful for moving the whole set of macros that have a common prefix.

\[bu]

move_includes: Changes the given part of #include "..." directives, useful for moving libraries and whole sets of header files.

\[bu]

Added documentation for all available transformations.

\[bu]

Makefiles modified to better support Windows make users.

\[bu]

Extension .ipp added to the list of recognized source file extensions.

\[bu]

New option -showrules includes the name of the rule in each report line.

\[bu]

Changed the profile definition to be an active Tcl script instead of a passive text file.

\[bu]

Added the possibility to define exclusions to rule checks.

\[bu]

BUGFIX: Corrected handling of newline tokens.

AUTHORS

Maciej Sobczak; Vincent Hobeïka; Gaëtan Lehmann.