BASE FUNCTIONALITY

The base module represents concepts of the bootstrapping process that tasks can interact with and handles the gathering, sorting and running of tasks.

Filesystem handling

Volume

class bootstrapvz.base.fs.volume.Volume(partition_map)

Represents an abstract volume. This class is a finite state machine and represents the state of the real volume.

_before_link_dm_node(e)

Links the volume using the device mapper. This allows us to create a 'window' into the volume that acts like a volume in itself. Mainly it is used to fool grub into thinking that it is working with a real volume, rather than a loopback device or a network block device.

Parameters

e (_e_obj) -- Event object containing arguments to create()

Keyword arguments to link_dm_node() are:

Parameters

  • logical_start_sector (int) -- The sector the volume should start at in the new volume

  • start_sector (int) -- The offset at which the volume should begin to be mapped in the new volume

  • sectors (int) -- The number of sectors that should be mapped

    Read more at: http://manpages.debian.org/cgi-bin/man.cgi?query=dmsetup&apropos=0&sektion=0&manpath=Debian+7.0+wheezy&format=html&locale=en

Raises VolumeError

When a free block device cannot be found.
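The three keyword arguments line up with the fields of a device-mapper linear target, as described in the dmsetup manual page linked above. The following is a minimal sketch of how such a table line could be assembled. The helper name dm_linear_table and the device path are assumptions made for illustration, and this is not necessarily how bootstrap-vz builds the table.

```python
def dm_linear_table(logical_start_sector, sectors, device_path, start_sector):
    """Build one dmsetup table line for a 'linear' target.

    Hypothetical helper, for illustration only.
    Line format: <logical_start_sector> <num_sectors> linear <device> <start_sector>
    """
    return '{log_start} {sectors} linear {device} {start}'.format(
        log_start=logical_start_sector, sectors=sectors,
        device=device_path, start=start_sector)


# Example: expose 2048 sectors of /dev/loop0 (an assumed device), beginning at
# its sector 0, at the very start of the new mapped device.
table = dm_linear_table(0, 2048, '/dev/loop0', 0)
print(table)
```

A line like this is what dmsetup expects as the table when creating a mapped device.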

_before_unlink_dm_node(e)

Unlinks the device mapping

_check_blocking(e)

Checks whether the volume is blocked

Raises VolumeError

When the volume is blocked from being detached
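The Volume class is described as a finite state machine whose _before_* methods run ahead of a transition and may refuse it by raising VolumeError. The toy class below sketches that pattern in plain Python. The state names, event names and the fire() method are assumptions chosen for illustration and do not mirror the real class.

```python
class VolumeError(Exception):
    pass


class TinyVolume(object):
    """Toy state machine; states and events are illustrative assumptions."""

    # event name -> (required source state, destination state)
    transitions = {
        'create': ('nonexistent', 'detached'),
        'attach': ('detached', 'attached'),
        'detach': ('attached', 'detached'),
    }

    def __init__(self):
        self.state = 'nonexistent'
        self.blocked = False

    def fire(self, event):
        src, dst = self.transitions[event]
        if self.state != src:
            raise VolumeError('cannot %s while %s' % (event, self.state))
        # Run the optional _before_<event> hook; it may veto by raising.
        hook = getattr(self, '_before_' + event, None)
        if hook is not None:
            hook()
        self.state = dst

    def _before_detach(self):
        # Comparable in spirit to _check_blocking(): refuse while blocked.
        if self.blocked:
            raise VolumeError('volume is blocked from being detached')
```

Because the hook raises before the state is updated, a refused transition leaves the object in its previous state.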

Partitionmaps

Abstract Partitionmap

class bootstrapvz.base.fs.partitionmaps.abstract.AbstractPartitionMap(bootloader)

Abstract representation of a partition map. This class is a finite state machine and represents the state of the real partition map.

_before_map(event)

Raises PartitionError

In case a partition could not be mapped.

_before_unmap(event)

Raises PartitionError

If a partition cannot be unmapped

create(volume)

Creates the partition map

Parameters

volume (Volume) -- The volume to create the partition map on

get_total_size()

Returns the total size the partitions occupy

Returns

The size of all partitions

Return type

Bytes

is_blocking()

Returns whether the partition map is blocking volume detach operations

Return type

bool

map(volume)

Maps the partition map to device nodes

Parameters

volume (Volume) -- The volume the partition map resides on

unmap(volume)

Unmaps the partition map

Parameters

volume (Volume) -- The volume to unmap the partition map from

GPT Partitionmap

class bootstrapvz.base.fs.partitionmaps.gpt.GPTPartitionMap(data, bootloader)

Represents a GPT partition map

_before_create(event)

Creates the partition map

MS-DOS Partitionmap

class bootstrapvz.base.fs.partitionmaps.msdos.MSDOSPartitionMap(data, bootloader)

Represents an MS-DOS partition map. Sometimes also called MBR (but that confuses the hell out of me, so ms-dos it is).

No Partitionmap

class bootstrapvz.base.fs.partitionmaps.none.NoPartitions(data, bootloader)

Represents a virtual 'NoPartitions' partitionmap. This virtual partition map exists because it is easier for tasks to simply always deal with partition maps and then let the base abstract that away.

get_total_size()

Returns the total size the partitions occupy

Returns

The size of all the partitions

Return type

Bytes

is_blocking()

Returns whether the partition map is blocking volume detach operations

Return type

bool

Partitions

Abstract partition

class bootstrapvz.base.fs.partitions.abstract.AbstractPartition(size, filesystem, format_command)

Abstract representation of a partition. This class is a finite state machine and represents the state of the real partition.

class Mount(source, destination, opts)

Represents a mount into the partition

mount(prefix)

Performs the mount operation or forwards it to another partition

Parameters

prefix (str) -- Path prefix of the mountpoint

unmount()

Performs the unmount operation or asks the partition to unmount itself

AbstractPartition._after_mount(e)

Mount any mounts associated with this partition

AbstractPartition._before_format(e)

Formats the partition

AbstractPartition._before_mount(e)

Mount the partition

AbstractPartition._before_unmount(e)

Unmount any mounts associated with this partition

AbstractPartition.add_mount(source, destination, opts=[])

Associate a mount with this partition. Automatically mounts it.

Parameters

  • source (str,AbstractPartition) -- The source of the mount

  • destination (str) -- The path to the mountpoint

  • opts (list) -- Any options that should be passed to the mount command

AbstractPartition.get_end()

Gets the end of the partition

Returns

The end of the partition

Return type

Bytes

AbstractPartition.get_uuid()

Gets the UUID of the partition

Returns

The UUID of the partition

Return type

str

AbstractPartition.remove_mount(destination)

Remove a mount from this partition. Automatically unmounts it.

Parameters

destination (str) -- The mountpoint path of the mount that should be removed

Base partition

class bootstrapvz.base.fs.partitions.base.BasePartition(size, filesystem, format_command, previous)

Represents a partition that is actually a partition (and not a virtual one like 'Single')

_before_create(e)

Creates the partition

create(volume)

Creates the partition

Parameters

volume (Volume) -- The volume to create the partition on

get_index()

Gets the index of this partition in the partition map

Returns

The index of the partition in the partition map

Return type

int

get_start()

Gets the starting byte of this partition

Returns

The starting byte of this partition

Return type

Bytes

map(device_path)

Maps the partition to a device_path

Parameters

device_path (str) -- The device path this partition should be mapped to

GPT partition

class bootstrapvz.base.fs.partitions.gpt.GPTPartition(size, filesystem, format_command, name, previous)

Represents a GPT partition

GPT swap partition

class bootstrapvz.base.fs.partitions.gpt_swap.GPTSwapPartition(size, previous)

Represents a GPT swap partition

MS-DOS partition

class bootstrapvz.base.fs.partitions.msdos.MSDOSPartition(size, filesystem, format_command, previous)

Represents an MS-DOS partition

MS-DOS swap partition

class bootstrapvz.base.fs.partitions.msdos_swap.MSDOSSwapPartition(size, previous)

Represents an MS-DOS swap partition

Single

class bootstrapvz.base.fs.partitions.single.SinglePartition(size, filesystem, format_command)

Represents a single virtual partition on an unpartitioned volume

get_start()

Gets the starting byte of this partition

Returns

The starting byte of this partition

Return type

Bytes

Unformatted partition

class bootstrapvz.base.fs.partitions.unformatted.UnformattedPartition(size, previous)

Represents an unformatted partition. It cannot be mounted.

Exceptions

exception bootstrapvz.base.fs.exceptions.PartitionError

Raised when an error occurs while interacting with the partitions on the volume

exception bootstrapvz.base.fs.exceptions.VolumeError

Raised when an error occurs while interacting with the volume

Package handling

Package list

class bootstrapvz.base.pkg.packagelist.PackageList(manifest_vars, source_lists)

Represents a list of packages

class Local(path)

A local package

class PackageList.Remote(name, target)

A remote package with an optional target

PackageList.add(name, target=None)

Adds a package to the install list

Parameters

  • name (str) -- The name of the package to install, may contain manifest vars references

  • target (str) -- The name of the target release for the package, may contain manifest vars references

Raises

  • PackageError -- When a package of the same name but with a different target has already been added.

  • PackageError -- When the specified target release could not be found.
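The two error cases documented for add() can be modelled with a small stand-in class. This is a sketch under stated assumptions: TinyPackageList and its known_targets argument are hypothetical, and str.format-style placeholders are assumed for the manifest vars references.

```python
class PackageError(Exception):
    pass


class TinyPackageList(object):
    """Toy model of the documented add() behaviour (not the real class)."""

    def __init__(self, manifest_vars, known_targets):
        self.manifest_vars = manifest_vars
        self.known_targets = set(known_targets)
        self.install = {}  # package name -> target release (or None)

    def add(self, name, target=None):
        # Assumption: references use str.format-style placeholders.
        name = name.format(**self.manifest_vars)
        if target is not None:
            target = target.format(**self.manifest_vars)
            if target not in self.known_targets:
                raise PackageError('target release %s not found' % target)
        if name in self.install and self.install[name] != target:
            raise PackageError('%s already added with another target' % name)
        self.install[name] = target


packages = TinyPackageList({'release': 'wheezy'}, ['wheezy-backports'])
packages.add('linux-image', '{release}-backports')
```

Adding the same package again with a different target, or with a target that is not among the known releases, raises PackageError in this model.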

PackageList.add_local(package_path)

Adds a local package to the installation list

Parameters

package_path (str) -- Path to the local package, may contain manifest vars references

Sources list

class bootstrapvz.base.pkg.sourceslist.Source(line)

Represents a single source line

class bootstrapvz.base.pkg.sourceslist.SourceLists(manifest_vars)

Represents a list of sources lists for apt

add(name, line)

Adds a source to the apt sources list

Parameters

  • name (str) -- Name of the file in sources.list.d, may contain manifest vars references

  • line (str) -- The line for the source file, may contain manifest vars references

target_exists(target)

Checks whether the target exists in the sources list

Parameters

target (str) -- Name of the target to check for, may contain manifest vars references

Returns

Whether the target exists

Return type

bool
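To make the idea of a source line and a target check concrete, here is a minimal sketch that parses one-line apt source entries and looks for a distribution name. The regular expression, the function names and the example mirror URL are assumptions for illustration. The real Source and SourceLists classes may parse and store lines differently.

```python
import re

# One-line apt format: type [options] uri distribution [components...]
SOURCE_LINE = re.compile(
    r'^(?P<type>deb|deb-src)\s+'
    r'(\[(?P<options>[^\]]*)\]\s+)?'
    r'(?P<uri>\S+)\s+'
    r'(?P<distribution>\S+)'
    r'(\s+(?P<components>.+))?$')


def parse_source(line):
    """Split a source line into its named fields (hypothetical helper)."""
    match = SOURCE_LINE.match(line.strip())
    if match is None:
        raise ValueError('not a valid source line: ' + line)
    fields = match.groupdict()
    fields['components'] = (fields['components'] or '').split()
    return fields


def target_exists(lines, target):
    """Return True if any source line names `target` as its distribution."""
    return any(parse_source(line)['distribution'] == target for line in lines)


lines = [
    'deb http://http.debian.net/debian wheezy main contrib',
    'deb-src [arch=amd64] http://http.debian.net/debian wheezy-backports main',
]
```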

Preferences list

class bootstrapvz.base.pkg.preferenceslist.Preference(preference)

Represents a single preference

class bootstrapvz.base.pkg.preferenceslist.PreferenceLists(manifest_vars)

Represents a list of preferences lists for apt

add(name, preferences)

Adds a preference to the apt preferences list

Parameters

  • name (str) -- Name of the file in preferences.list.d, may contain manifest vars references

  • preferences (object) -- The preferences

Exceptions

exception bootstrapvz.base.pkg.exceptions.PackageError

Raised when an error occurs while handling the packageslist

exception bootstrapvz.base.pkg.exceptions.SourceError

Raised when an error occurs while handling the sourceslist

Bootstrap information

class bootstrapvz.base.bootstrapinfo.BootstrapInformation(manifest=None, debug=False)

The BootstrapInformation class holds all information about the bootstrapping process. The nature of the attributes of this class is rather diverse. Tasks may set their own attributes on this class for later retrieval by another task. Information that becomes invalid (e.g. a path to a file that has been deleted) must be removed.

_BootstrapInformation__create_manifest_vars(manifest, additional_vars={})

Creates the manifest variables dictionary, based on the manifest contents and additional data.

Parameters

  • manifest (Manifest) -- The Manifest

  • additional_vars (dict) -- Additional values (they will take precedence and overwrite anything else)

Returns

The manifest_vars dictionary

Return type

dict
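The precedence rule for additional_vars can be shown in a few lines. This sketch only illustrates that rule. It takes a plain dict instead of a Manifest object, which is an assumption, and it leaves out whatever else the real method derives from the manifest.

```python
def create_manifest_vars(manifest_data, additional_vars=None):
    """Merge manifest values with extras; extras win on key collisions."""
    # Avoid a mutable default argument; treat None as "no extras".
    additional_vars = additional_vars or {}
    manifest_vars = dict(manifest_data)    # shallow copy of manifest values
    manifest_vars.update(additional_vars)  # extras overwrite anything else
    return manifest_vars


merged = create_manifest_vars({'provider': 'ec2', 'name': 'image'},
                              {'name': 'override'})
```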

Manifest

The Manifest module contains the manifest that providers and plugins use to determine which tasks should be added to the tasklist, what arguments various invocations should have, etc.

class bootstrapvz.base.manifest.Manifest(path)

This class holds all the information that providers and plugins need to perform the bootstrapping process. All actions that are taken originate from here. The manifest shall not be modified after it has been loaded. Currently, immutability is not enforced, and enforcing it would require a fair amount of code. Instead, we rely on tasks behaving properly.

load()

Loads the manifest. This function not only reads the manifest but also loads the specified provider and plugins. Once they are loaded, the initialize() function is called on each of them (if it exists). The provider must have an initialize function.

parse()

Parses the manifest. Well... "parsing" is a big word. The function really just sets up some convenient attributes so that tasks don't have to access information with info.manifest.data['section'] but can do it with info.manifest.section.
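A minimal sketch of that convenience, assuming a stand-in class and illustrative section names: every top-level key of the loaded data is exposed as an attribute of the same name. The real parse() may pick sections individually rather than looping.

```python
class TinyManifest(object):
    """Toy stand-in; the section names used below are illustrative only."""

    def __init__(self, data):
        self.data = data

    def parse(self):
        # Expose every top-level section as an attribute of the same name,
        # so manifest.data['volume'] becomes reachable as manifest.volume.
        for section, value in self.data.items():
            setattr(self, section, value)


manifest = TinyManifest({'volume': {'backing': 'raw'}, 'plugins': {}})
manifest.parse()
```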

schema_validator(data, schema_path)

This convenience function is passed around to all the validation functions so that they may run a json-schema validation by giving it the data and a path to the schema.

Parameters

  • data (dict) -- Data to validate (normally the manifest data)

  • schema_path (str) -- Path to the json-schema to use for validation

validate()

Validates the manifest using the base, provider and plugin validation functions. Plugins are not required to have a validate_manifest function.

validation_error(message, json_path=None)

This function is passed to all validation functions so that they may raise a validation error because a custom validation of the manifest failed.

Parameters

  • message (str) -- Message to user about the error

  • json_path (list) -- A path to the location in the manifest where the error occurred

Raises ManifestError

With absolute certainty
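The following sketch shows how a custom check might use such an error callback. Everything in it is a simplified assumption: the ManifestError stand-in, the validate_manifest signature and the allowed backing values are hypothetical and only illustrate the always-raises contract.

```python
class ManifestError(Exception):
    """Stand-in exception that remembers where in the manifest it occurred."""

    def __init__(self, message, json_path=None):
        super(ManifestError, self).__init__(message)
        self.message = message
        self.json_path = json_path

    def __str__(self):
        if self.json_path is not None:
            path = '.'.join(map(str, self.json_path))
            return '{msg} (at {path})'.format(msg=self.message, path=path)
        return self.message


def validation_error(message, json_path=None):
    # Always raises; callers never get a return value.
    raise ManifestError(message, json_path)


def validate_manifest(data, error):
    # Hypothetical custom check that a plugin might perform.
    if data['volume']['backing'] not in ('raw', 'vdi'):
        error('unsupported backing', ['volume', 'backing'])
```

Passing the callback in, rather than importing the exception, keeps the check independent of how errors are reported.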

Tasklist

The tasklist module contains the TaskList class.

class bootstrapvz.base.tasklist.TaskList(tasks)

The tasklist class aggregates all tasks that should be run and orders them according to their dependencies.

run(info, dry_run=False)

Converts the taskgraph into a list and runs all tasks in that list

Parameters

  • info (dict) -- The bootstrap information object

  • dry_run (bool) -- Whether to actually run the tasks or simply step through them
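The effect of dry_run can be sketched as a loop that steps through the ordered tasks and only calls run() when it is not a dry run. The function run_tasks and the Hello task are hypothetical stand-ins, and a plain list takes the place of the bootstrap information object.

```python
class Hello(object):
    """Hypothetical task that records its own name when run."""

    @classmethod
    def run(cls, info):
        info.append(cls.__name__)


def run_tasks(task_list, info, dry_run=False):
    """Step through tasks in order; call run() only when not a dry run."""
    stepped = []
    for task in task_list:
        if not dry_run:
            task.run(info)
        stepped.append(task)
    return stepped
```

In a dry run the order of the tasks is still visible through the returned list, while none of them performs any work.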

bootstrapvz.base.tasklist.check_ordering(task)

Checks the ordering of a task in relation to other tasks and their phases.

This function checks for a subset of what the strongly connected components algorithm does, but can deliver a more precise error message, namely that there is a conflict between what a task has specified as its predecessors or successors and in which phase it is placed.

Parameters

task (Task) -- The task to check the ordering for

Raises TaskListError

If there is a conflict between task precedence and phase precedence
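A conflict of this kind arises, for example, when a task names a predecessor that sits in a later phase. The sketch below checks for exactly that. It assumes phases are given as an ordered list of names, which differs from the real signature, and the task classes and the second phase name are hypothetical.

```python
class TaskListError(Exception):
    pass


def check_ordering(task, phase_order):
    """Raise if a predecessor or successor contradicts the phase order.

    phase_order lists phase names, earliest first (an assumption).
    """
    own_pos = phase_order.index(task.phase)
    for pred in getattr(task, 'predecessors', []):
        if phase_order.index(pred.phase) > own_pos:
            raise TaskListError(
                '{task} lists {other} as a predecessor, but {other} is in a '
                'later phase'.format(task=task.__name__, other=pred.__name__))
    for succ in getattr(task, 'successors', []):
        if phase_order.index(succ.phase) < own_pos:
            raise TaskListError(
                '{task} lists {other} as a successor, but {other} is in an '
                'earlier phase'.format(task=task.__name__, other=succ.__name__))


PHASES = ['volume_preparation', 'package_installation']


class InstallPackages(object):
    phase = 'package_installation'


class MapPartitions(object):
    phase = 'volume_preparation'
    predecessors = [InstallPackages]  # contradicts the phase order
```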

bootstrapvz.base.tasklist.create_list(subset)

Creates a list of all the tasks that should be run.

bootstrapvz.base.tasklist.get_all_classes(path=None, prefix='')

Given a path to a package, this function retrieves all the classes in it

Parameters

  • path (str) -- Path to the package

  • prefix (str) -- Name of the package followed by a dot

Returns

A generator that yields classes

Return type

generator

Raises Exception

If a module cannot be inspected.
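One way to write such a generator uses pkgutil.walk_packages and inspect from the standard library. Whether bootstrap-vz uses these exact calls is an assumption, so treat this as a sketch of the idea rather than the actual implementation.

```python
import importlib
import inspect
import pkgutil


def get_all_classes(path=None, prefix=''):
    """Yield every class found in the modules of the package at `path`.

    `path` is a list of directories (e.g. a package's __path__) and
    `prefix` is the package name followed by a dot.
    """
    for _finder, module_name, _is_pkg in pkgutil.walk_packages(path, prefix):
        # Import errors propagate to the caller.
        module = importlib.import_module(module_name)
        for _name, obj in inspect.getmembers(module, inspect.isclass):
            yield obj
```

For example, calling it with email.mime.__path__ and the prefix 'email.mime.' yields the classes visible in the modules of that standard library package. A class imported into several modules is yielded once per module.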

bootstrapvz.base.tasklist.get_all_tasks()

Gets a list of all task classes in the package

Returns

A list of all tasks in the package

Return type

list

bootstrapvz.base.tasklist.load_tasks(function, manifest, *args)

Calls function on the provider and all plugins that have been loaded by the manifest. Any additional arguments are passed directly to function. The function that is called shall accept the taskset as its first argument and the manifest as its second argument.

Parameters

  • function (str) -- Name of the function to call

  • manifest (Manifest) -- The manifest

  • args (list) -- Additional arguments that should be passed to the function that is called

bootstrapvz.base.tasklist.strongly_connected_components(graph)

Find the strongly connected components in a graph using Tarjan's algorithm.

Source: http://www.logarithmic.net/pfh-files/blog/01208083168/sort.py

Parameters

graph (dict) -- mapping of tasks to lists of successor tasks

Returns

List of tuples that are strongly connected components

Return type

list
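For readers unfamiliar with Tarjan's algorithm, here is a compact recursive sketch that takes the same kind of input (a dict mapping each node to a list of successors) and returns a list of tuples. It is written from the textbook description, not copied from the linked source, and the example graph is made up.

```python
def strongly_connected_components(graph):
    """Return the strongly connected components of `graph` as tuples."""
    counter = [0]      # next discovery index, boxed so visit() can update it
    index = {}         # node -> discovery index
    lowlink = {}       # node -> lowest index reachable from its subtree
    stack = []
    on_stack = set()
    result = []

    def visit(node):
        index[node] = lowlink[node] = counter[0]
        counter[0] += 1
        stack.append(node)
        on_stack.add(node)
        for succ in graph.get(node, []):
            if succ not in index:
                visit(succ)
                lowlink[node] = min(lowlink[node], lowlink[succ])
            elif succ in on_stack:
                lowlink[node] = min(lowlink[node], index[succ])
        # A node whose lowlink equals its own index is the root of a component.
        if lowlink[node] == index[node]:
            component = []
            while True:
                member = stack.pop()
                on_stack.discard(member)
                component.append(member)
                if member == node:
                    break
            result.append(tuple(component))

    for node in graph:
        if node not in index:
            visit(node)
    return result


graph = {'a': ['b'], 'b': ['c'], 'c': ['a'], 'd': ['a']}
components = strongly_connected_components(graph)
```

Any component with more than one member indicates a dependency cycle. Being recursive, this sketch is limited by Python's recursion depth on very deep graphs.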

bootstrapvz.base.tasklist.topological_sort(graph)

Runs a topological sort on a graph.

Source: http://www.logarithmic.net/pfh-files/blog/01208083168/sort.py

Parameters

graph (dict) -- mapping of tasks to lists of successor tasks

Returns

A list of all tasks in the graph sorted according to their dependencies

Return type

list
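A topological sort over the same graph shape can be sketched with Kahn's algorithm. Note that this is a swapped-in technique chosen for brevity and may differ from the approach in the linked source. The node names in the example are made up.

```python
from collections import deque


def topological_sort(graph):
    """Order nodes so that every node precedes all of its successors."""
    # Count incoming edges for every node, including pure successors.
    in_degree = {node: 0 for node in graph}
    for successors in graph.values():
        for succ in successors:
            in_degree[succ] = in_degree.get(succ, 0) + 1

    # Start with the nodes nothing depends on being run first.
    ready = deque(node for node, degree in in_degree.items() if degree == 0)
    ordered = []
    while ready:
        node = ready.popleft()
        ordered.append(node)
        for succ in graph.get(node, []):
            in_degree[succ] -= 1
            if in_degree[succ] == 0:
                ready.append(succ)

    if len(ordered) != len(in_degree):
        raise ValueError('graph contains a cycle')
    return ordered


graph = {'partition': ['map'], 'map': ['format'], 'format': [],
         'create': ['partition']}
order = topological_sort(graph)
```

The cycle check at the end is why a strongly connected components pass is useful beforehand: it can name the tasks involved instead of merely reporting that a cycle exists.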

Logging

This module holds functions and classes responsible for formatting the log output both to a file and to the console.

class bootstrapvz.base.log.ConsoleFormatter(fmt=None, datefmt=None)

Formats log statements for the console

class bootstrapvz.base.log.FileFormatter(fmt=None, datefmt=None)

Formats log statements for output to file. Currently this is just a stub.

bootstrapvz.base.log.get_log_filename(manifest_path)

Returns the path to a logfile given a manifest. The logfile name is constructed from the current timestamp and the basename of the manifest.

Parameters

manifest_path (str) -- The path to the manifest

Returns

The path to the logfile

Return type

str
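A sketch of how such a name could be built from a timestamp and the manifest basename. The timestamp format %Y%m%d%H%M%S is the one this document gives for logfiles. The underscore separator, the .log suffix, the extra now parameter and the example manifest path are assumptions, and the directory part of the path is left out.

```python
import os.path
from datetime import datetime


def get_log_filename(manifest_path, now=None):
    """Build a logfile name from a timestamp and the manifest basename."""
    now = now or datetime.now()
    # Strip the directory and the extension from the manifest path.
    basename, _ext = os.path.splitext(os.path.basename(manifest_path))
    timestamp = now.strftime('%Y%m%d%H%M%S')
    # The underscore separator and the ".log" suffix are assumptions.
    return '{timestamp}_{name}.log'.format(timestamp=timestamp, name=basename)


name = get_log_filename('manifests/virtualbox.json',
                        now=datetime(2014, 3, 1, 12, 30, 5))
```

Accepting the time as an optional argument keeps the function easy to test.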

bootstrapvz.base.log.setup_logger(logfile=None, debug=False)

Sets up the python logger to log to both a file and the console

Parameters

  • logfile (str) -- Path to a logfile

  • debug (bool) -- Whether to log debug output to the console
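Logging to a file and to the console at different verbosity levels can be done with two handlers on the root logger. The sketch below uses only the standard logging module. Returning the list of handlers is an addition made for illustration, and the custom formatters are omitted.

```python
import logging


def setup_logger(logfile=None, debug=False):
    """Attach a file handler and a console handler to the root logger."""
    root = logging.getLogger()
    root.setLevel(logging.DEBUG)  # let the handlers do the filtering
    added = []

    if logfile is not None:
        file_handler = logging.FileHandler(logfile)
        file_handler.setLevel(logging.DEBUG)  # the file always gets debug output
        root.addHandler(file_handler)
        added.append(file_handler)

    console_handler = logging.StreamHandler()  # writes to stderr by default
    console_handler.setLevel(logging.DEBUG if debug else logging.INFO)
    root.addHandler(console_handler)
    added.append(console_handler)
    return added
```

With this arrangement the debug flag only affects what reaches the console, while the logfile receives debug statements either way.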

Task

class bootstrapvz.base.task.Task

The task class represents a task that can be run. It is merely a wrapper for the run function and should never be instantiated.

classmethod run(info)

The run function. All work is done inside this function.

Parameters

info (BootstrapInformation) -- The bootstrap info object.

Phase

class bootstrapvz.base.phase.Phase(name, description)

The Phase class represents a phase a task may be in. It has no function other than to act as an anchor in the task graph. All phases are instantiated in common.phases.

pos()

Gets the position of the phase

Returns

The positional index of the phase in relation to the other phases

Return type

int

COMMON

The common module contains features that are common to multiple providers and plugins. It holds both a large set of shared tasks and also various tools that are used by both the base module and tasks.

Volume representations

Shared tasks

PLUGINS

PROVIDERS

DEVELOPMENT GUIDELINES

The following guidelines should serve as general advice when developing providers or plugins for bootstrap-vz. Keep in mind that these guidelines are not rules; they are advice on how to better add value to the bootstrap-vz codebase.

  • The manifest should always fully describe the resulting image. The outcome of a bootstrapping process should never depend on settings specified elsewhere.

    This allows others to easily reproduce any setup other people are running and makes it possible to share manifests. The official Debian EC2 images, for example, can be reproduced using the manifests available in the manifest directory of bootstrap-vz.

  • The bootstrapper should always be able to run fully unattended.

    For end users, this guideline minimizes the risk of errors. Any required input would also be in direct conflict with the previous guideline that the manifest should always fully describe the resulting image.

    Additionally, developers may have to run the bootstrap process multiple times. Any prompts in the middle of that process may significantly slow down development.

  • The bootstrapper should only need as much setup as the manifest requires.

    Having to shuffle specific paths on the host into place (e.g. /target has to be created manually) to get the bootstrapper running is going to increase the rate of errors made by users. Aim for minimal setup.

    Exceptions are of course things such as the path to the VirtualBox Guest Additions ISO or tools like parted that need to be installed on the host.

  • Roll complexity into which tasks are added to the tasklist.

    If a run() function checks whether it should do any work or simply be skipped, consider doing that check in resolve_tasks() instead and avoid adding that task altogether. This allows people looking at the tasklist in the logfile to determine what work has been performed. If a task says it will modify a file but then bails, a developer may get confused when looking at that file after bootstrapping. They could conclude that the file has either been overwritten or that the search & replace does not work correctly.

  • Control flow should be directed from the task graph.

    Avoid creating complicated run() functions. If necessary, split up a function into two semantically separate tasks.

    This allows other tasks to interleave with the control-flow and add extended functionality (e.g. because volume creation and mounting are two separate tasks, the prebootstrapped plugin can replace the volume creation task with a task of its own that creates a volume from a snapshot instead, but still reuse the mount task).

  • Task classes should be treated as decorated run() functions. They should not have any state.

    That's what the BootstrapInformation object is for.

  • Only add stuff to the BootstrapInformation object when really necessary.

    This is mainly to avoid clutter.

  • Use a json-schema to check for allowed settings.

    The json-schema may be verbose, but it keeps the bulk of the checking work outside the python code, which is a big plus when it comes to readability. This of course only applies as long as the checks are simple. You can fall back to doing the check in python when that solution is considerably less complex.

  • When invoking external programs, use long options whenever possible

    This makes the commands a lot easier to understand, since the option names usually hint at what they do.

  • When invoking external programs, don't use full paths, rely on $PATH

    This increases robustness when executable locations change. Example: Use log_call(['wget', ...]) instead of log_call(['/usr/bin/wget', ...]).
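The guideline about rolling complexity into the tasklist can be sketched as follows. The resolve_tasks() signature, with the taskset first and the manifest second, follows the description of load_tasks in this document. The AddSwap task and the 'swap' manifest key are hypothetical, and a plain dict stands in for the manifest.

```python
class AddSwap(object):
    """Hypothetical task, used only to illustrate the guideline."""
    description = 'Adding a swap entry'

    @classmethod
    def run(cls, info):
        # No "should I do anything?" check here: if the task is in the
        # tasklist, it always does its work.
        info['swap_added'] = True


def resolve_tasks(taskset, manifest):
    # Decide up front whether the task is needed at all, so the tasklist
    # in the logfile reflects the work that was actually performed.
    if manifest.get('swap', False):
        taskset.add(AddSwap)
```

With the decision made in resolve_tasks(), a reader of the logfile can tell from the absence of the task that no swap entry was added.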

Coding style

bootstrap-vz is coded to comply closely with the PEP8 style guidelines. There are, however, a few exceptions:

  • Max line length is 110 chars, not 80.

  • Multiple assignments may be aligned with spaces so that the = signs line up vertically.

  • Ignore E101: Indent with tabs and align with spaces

  • Ignore E221 & E241: Alignment of assignments

  • Ignore E501: The max line length is not 80 characters

  • Ignore W191: Indent with tabs not spaces

    The codebase can be checked for any violations quite easily, since those rules are already specified in the tox configuration file.

    tox -e flake8
    

TASKOVERVIEW

HOW BOOTSTRAP-VZ WORKS

Tasks

At its core, bootstrap-vz is based on tasks that perform units of work. By keeping those tasks small and building a solid structure around them, a high degree of flexibility can be achieved. To ensure that tasks are executed in the right order, each task is placed in a dependency graph where directed edges dictate precedence. Each task is a simple class that defines its predecessor tasks and successor tasks via attributes. Here is an example:

class MapPartitions(Task):
    description = 'Mapping volume partitions'
    phase = phases.volume_preparation
    predecessors = [PartitionVolume]
    successors = [filesystem.Format]

    @classmethod
    def run(cls, info):
        # Map the partitions of the volume to device nodes.
        info.volume.partition_map.map(info.volume)

In this case the attributes define that the task at hand should run after the PartitionVolume task, i.e. after the volume has been partitioned (predecessors), but before formatting each partition (successors). It is also placed in the volume_preparation phase. Phases are ordered and group tasks together. All tasks in a phase are run before proceeding with the tasks in the next phase. They are a way of avoiding the need to list 50 different tasks as predecessors and successors.

The final task list that will be executed is computed by enumerating all tasks in the package, placing them in the graph and sorting them topologically. Subsequently the list returned is filtered to contain only the tasks the provider and the plugins added to the taskset.
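The sort-then-filter step can be sketched with graphlib from the standard library (Python 3.9 or later). This is an illustration under assumptions: the function build_task_list and the three stand-in task classes are hypothetical, phases are ignored, and bootstrap-vz ships its own sorting code instead of using graphlib.

```python
from graphlib import TopologicalSorter  # Python 3.9+


class PartitionVolume(object):
    pass


class Format(object):
    pass


class MapPartitions(object):
    predecessors = [PartitionVolume]
    successors = [Format]


def build_task_list(all_tasks, taskset):
    """Sort every known task, then keep only the requested ones."""
    # Map each task to the set of tasks that must run before it.
    before = {task: set() for task in all_tasks}
    for task in all_tasks:
        for pred in getattr(task, 'predecessors', []):
            before[task].add(pred)
        for succ in getattr(task, 'successors', []):
            before.setdefault(succ, set()).add(task)
    ordered = TopologicalSorter(before).static_order()
    # Keep only what the provider and the plugins asked for.
    return [task for task in ordered if task in taskset]


tasks = build_task_list([Format, MapPartitions, PartitionVolume],
                        {PartitionVolume, Format})
```

Sorting all tasks before filtering means that ordering constraints expressed through a task that is not in the taskset, here MapPartitions, still determine the relative order of the tasks that remain.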

System abstractions

There are several abstractions in bootstrap-vz that make it possible to generalize things like volume creation, partitioning, mounting and package installation. As a rule these abstractions are located in the base/ folder, where the manifest parsing and task ordering algorithm are placed as well.

COMMANDLINE SWITCHES

Several commandline switches are available that can make a developer's life a lot easier.

  • --debug: Enables debug output in the console. This includes output from all commands that are invoked during bootstrapping.

  • --pause-on-error: Pauses the execution when an exception occurs before rolling back. This allows you to debug by inspecting the volume at the time the error occurred.

  • --dry-run: Prevents the run() function from being called on all tasks. This is useful if you want to see whether the task order is correct.

LOGFILE

Every run creates a new logfile in the logs/ directory. The filename for each run consists of a timestamp (%Y%m%d%H%M%S) and the basename of the manifest used. The log also contains debugging statements regardless of whether the --debug switch was used.

AUTHOR

Anders Ingemann

COPYRIGHT

2014, Anders Ingemann