Program to control the kernel hardware latency detection module
hwlatdetect [ --duration=<time> ] [ --threshold=<usecs> ] [ --window=<time interval> ] [ --width=<time interval> ] [ --report=<path> ] [ --cleanup ] [ --debug ] [ --quiet ]
hwlatdetect is a program that controls the kernel hardware latency detector module (hwlat_detector.ko). The module is a special-purpose kernel module used to detect large system latencies induced by the behavior of certain underlying hardware or firmware, independent of Linux itself. The code was originally developed to detect SMIs (System Management Interrupts) on x86 systems; however, there is nothing x86-specific about this patchset. It was originally written for use by the "RT" patch, since the Real Time kernel is highly latency sensitive.
SMIs are usually not serviced by the Linux kernel, which typically does not even know that they are occurring. Instead, SMIs are set up and serviced by BIOS code, usually for "critical" events such as management of thermal sensors and fans. Sometimes, though, SMIs are used for other tasks, and those tasks can spend an inordinate amount of time in the handler (sometimes measured in milliseconds). This is a problem if you are trying to keep event service latencies down in the microsecond range.
The hardware latency detector module works by hogging all of the CPUs for configurable amounts of time (by calling stop_machine()), polling the CPU Time Stamp Counter (TSC) for some period, then looking for gaps in the TSC data. Any gap indicates a time when the polling was interrupted, and since the machine is stopped and interrupts are turned off, the only thing that could do that would be an SMI.
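The gap search described above can be illustrated in user space. The following is a minimal sketch, not the module's code: find_gaps and the sample readings are hypothetical, and the readings are plain integers standing in for counter values already converted to microseconds.

```python
def find_gaps(samples, threshold):
    """Return (earlier, later, delta) for each pair of consecutive
    samples that are more than `threshold` apart."""
    gaps = []
    for earlier, later in zip(samples, samples[1:]):
        delta = later - earlier
        if delta > threshold:
            gaps.append((earlier, later, delta))
    return gaps


# Hypothetical readings in microseconds. The jump from 103 to 250
# stands in for a period when the polling loop was interrupted.
readings = [100, 101, 102, 103, 250, 251, 252]
print(find_gaps(readings, threshold=10))  # prints [(103, 250, 147)]
```

A tight polling loop produces closely spaced readings, so any pair separated by more than the threshold marks a window in which the loop was not running.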
The hwlatdetect script manages the mounting and unmounting of debugfs as well as the loading and unloading of the hwlat_detector module. If debugfs is already mounted, hwlatdetect will not unmount it after a run. Likewise, if the hwlat_detector module is already loaded, it will not be unloaded after a run.
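The "leave the system as it was found" rule above can be sketched as follows. This is an illustration under stated assumptions, not the script's actual implementation: the helper names are hypothetical, and checking /proc/mounts-style and /proc/modules-style text is only one plausible way to learn the starting state. The order of the teardown steps is also an assumption.

```python
def is_debugfs_mounted(mounts_text):
    """True if any line of /proc/mounts-style text has type 'debugfs'."""
    for line in mounts_text.splitlines():
        fields = line.split()
        # Fields: device, mount point, filesystem type, options, dump, pass
        if len(fields) >= 3 and fields[2] == "debugfs":
            return True
    return False


def is_module_loaded(modules_text, name="hwlat_detector"):
    """True if /proc/modules-style text lists a module called `name`."""
    for line in modules_text.splitlines():
        fields = line.split()
        if fields and fields[0] == name:
            return True
    return False


def teardown_plan(mounted_before, loaded_before):
    """Undo only what this run set up itself."""
    steps = []
    if not loaded_before:
        steps.append("unload hwlat_detector")
    if not mounted_before:
        steps.append("unmount debugfs")
    return steps


# Hypothetical starting state: debugfs mounted, module not yet loaded.
mounts = "proc /proc proc rw 0 0\ndebugfs /sys/kernel/debug debugfs rw,relatime 0 0\n"
modules = "ext4 737280 1 - Live 0x0000000000000000\n"
print(teardown_plan(is_debugfs_mounted(mounts), is_module_loaded(modules)))
# prints ['unload hwlat_detector']
```

Recording the state before the run, rather than checking it afterwards, is what lets the teardown distinguish resources the run created from those that were already present.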
--duration=<time>
Run the detector logic for the specified duration. The duration is a base-10 integer, interpreted as seconds by default. An optional suffix may be specified to indicate minutes, hours or days.
--threshold=<usecs>
Specify the TSC gap used to detect an SMI. Any gap value greater than the threshold is considered to be the result of an SMI occurring.
--window=<time interval>
Specify the size of the sample window. Converted to microseconds when passed to the kernel module.
--width=<time interval>
Specify the amount of time within the sample window during which the detector is actually sampling. Must be less than the --window value.
--report=<path>
Specify the output filename of the detector report. The default behavior is to print to standard output.
--cleanup
Force unloading of hwlat_detector.ko and unmounting of the debugfs filesystem.
--debug
Turn on debug prints.
--quiet
Turn off all informational prints.
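As an illustration of how the duration value and the window/width constraint might be handled, here is a minimal sketch. It is not the script's own parser: parse_duration and check_window are hypothetical names, and the suffix letters m, h and d are an assumption, since the text only says that minutes, hours or days may be indicated.

```python
import re

# Assumed suffix letters; an empty suffix means seconds.
SECONDS_PER_UNIT = {"": 1, "m": 60, "h": 3600, "d": 86400}


def parse_duration(text):
    """Convert a string such as '90' or '2m' into a number of seconds."""
    match = re.fullmatch(r"(\d+)([mhd]?)", text.strip())
    if match is None:
        raise ValueError(f"unrecognized duration: {text!r}")
    value, suffix = match.groups()
    return int(value) * SECONDS_PER_UNIT[suffix]


def check_window(window_us, width_us):
    """Reject a sampling width that is not smaller than its window.
    Both arguments are assumed to be in microseconds already."""
    if width_us >= window_us:
        raise ValueError("width must be less than window")
    return window_us, width_us


print(parse_duration("90"))  # prints 90
print(parse_duration("2m"))  # prints 120
print(check_window(1_000_000, 500_000))  # prints (1000000, 500000)
```

Validating the width against the window before anything is passed to the kernel module gives the user an immediate error instead of a run with an unusable configuration.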