SYNOPSIS

sacct [OPTIONS...]

DESCRIPTION

Accounting information for jobs invoked with SLURM is either logged in the job accounting log file or saved to the SLURM database.

The sacct command displays job accounting data stored in the job accounting log file or SLURM database in a variety of forms for your analysis. The sacct command displays information on jobs, job steps, status, and exitcodes by default. You can tailor the output with the use of the --format= option to specify the fields to be shown.

For the root user, the sacct command displays job accounting data for all users, although there are options to filter the output to report only the jobs from a specified user or group.

For the non-root user, the sacct command limits the display of job accounting data to jobs that were launched with their own user identifier (UID) by default. Data for other users can be displayed with the --allusers, --user, or --uid options.

Note: If designated, the slurmdbd.conf option PrivateData may further restrict the accounting data visible to users who are not SlurmUser, root, or a user with AdminLevel=Admin. See the slurmdbd.conf man page for additional details on restricting access to accounting data.

Note: If the AccountingStorageType is set to "accounting_storage/filetxt", space characters embedded within account names, job names, and step names will be replaced by underscores. If account names with embedded spaces are needed, it is recommended that a database type of accounting storage be configured.

Note: The contents of SLURM's database are maintained in lower case. This may result in some sacct output differing from that of other SLURM commands.

Note: Much of the data reported by sacct has been generated by the wait3() and getrusage() system calls. Some systems gather and report incomplete information for these calls; sacct reports values of 0 for this missing data. See your system's getrusage (3) man page for information about which data are actually available on your system.

  • Elapsed time fields are presented as [days-]hours:minutes:seconds[.microseconds]. Only 'CPU' fields will ever have microseconds.

  • The default input file is the file named in the AccountingStorageLoc parameter in slurm.conf.

OPTIONS

-a, --allusers

Displays all users' jobs when run by user root or if PrivateData is not configured to restrict jobs. Otherwise, displays only the current user's jobs.

-A account_list , --accounts=account_list

Displays jobs that ran under the accounts given in account_list, a comma-separated list of accounts.

-b, --brief

Displays a brief listing, which includes the following data:

jobid

status

exitcode

-c, --completion

Use job completion data instead of job accounting data. The JobCompType parameter in the slurm.conf file must be set to a value other than none.

-D, --duplicates

If SLURM job ids are reset, some job numbers will probably appear more than once in the accounting log file but refer to different jobs. Such jobs can be distinguished by the "submit" time stamp in the data records.

  • When data for specific jobs are requested with the --jobs option, sacct returns the most recent job with that number. This behavior can be overridden by specifying --duplicates, in which case all records that match the selection criteria will be returned.
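The selection rule described above can be sketched in Python. This is only an illustration of the documented behavior, not sacct's implementation: the function name select_jobs and the record layout (dictionaries with "jobid", "submit" and "jobname" keys) are assumptions made for the example.

```python
from datetime import datetime

def select_jobs(records, jobid, duplicates=False):
    """Return the records matching jobid.

    With duplicates=True every match is returned; otherwise only the
    record with the most recent submit time stamp is kept.
    """
    matches = [r for r in records if r["jobid"] == jobid]
    if duplicates or not matches:
        return matches
    return [max(matches, key=lambda r: r["submit"])]

# Hypothetical records: job id 7 was reused after the job ids were reset.
records = [
    {"jobid": 7, "submit": datetime(2014, 1, 5, 9, 0, 0), "jobname": "old"},
    {"jobid": 7, "submit": datetime(2014, 7, 3, 11, 30, 0), "jobname": "new"},
    {"jobid": 8, "submit": datetime(2014, 7, 3, 11, 31, 0), "jobname": "other"},
]
```

Calling select_jobs(records, 7) keeps only the later submission, while passing duplicates=True returns both records for job 7.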

-e, --helpformat

  • Print a list of fields that can be specified with the --format option.

  • Fields available:
    
    AllocCPUS       Account        AssocID          AveCPU
    AveCPUFreq      AveDiskRead    AveDiskWrite     AvePages
    AveRSS          AveVMSize      BlockID          Cluster
    Comment         ConsumedEnergy CPUTime          CPUTimeRAW
    DerivedExitCode Elapsed        Eligible         End
    ExitCode        GID            Group            JobID
    JobName         Layout         MaxDiskRead      MaxDiskReadNode
    MaxDiskReadTask MaxDiskWrite   MaxDiskWriteNode MaxDiskWriteTask
    MaxPages        MaxPagesNode   MaxPagesTask     MaxRSS
    MaxRSSNode      MaxRSSTask     MaxVMSize        MaxVMSizeNode
    MaxVMSizeTask   MinCPU         MinCPUNode       MinCPUTask
    NCPUS           NNodes         NodeList         NTasks
    Priority        Partition      QOSRAW           ReqCPUFreq
    ReqCPUs         ReqMem         Reserved         ResvCPU
    ResvCPURAW      Start          State            Submit
    Suspended       SystemCPU      Timelimit        TotalCPU
    UID             User           UserCPU          WCKey
    WCKeyID
    
    
  • The section titled "Job Accounting Fields" describes these fields.

-E end_time, --endtime=end_time

  • Select jobs in any state before the specified time. If states are given with the -s option, return jobs in this state before this period.

    Valid time formats are...

    HH:MM[:SS] [AM|PM]

    MMDD[YY] or MM/DD[/YY] or MM.DD[.YY]

    MM/DD[/YY]-HH:MM[:SS]

    YYYY-MM-DD[THH:MM[:SS]]

-f file, --file=file

Causes the sacct command to read job accounting data from the named file instead of the current SLURM job accounting log file. Only applicable when running the filetxt plugin.

-g gid_list, --gid=gid_list, --group=group_list

Displays the statistics only for the jobs started with the GID or the GROUP specified by the gid_list or the group_list operand, which is a comma-separated list. Space characters are not allowed. Default is no restriction.

-h, --help

Displays a general help message.

-j job(.step) , --jobs=job(.step)

Displays information about the specified job(.step) or list of job(.step)s.

  • The job(.step) parameter is a comma-separated list of jobs. Space characters are not permitted in this list. NOTE: A step id of 'batch' will display the information about the batch step. The batch step information is only available after the batch job is complete, unlike regular steps, which are available when they start.

  • The default is to display information on all jobs.

-k, --timelimit-min

Only send data about jobs with this timelimit. If used with --timelimit-max, this will be the minimum timelimit of the range. Default is no restriction.

-K, --timelimit-max

Ignored by itself, but if --timelimit-min is set, this will be the maximum timelimit of the range. Default is no restriction.

-l, --long

Equivalent to specifying:

  • --format=jobid,jobname,partition,maxvmsize,maxvmsizenode,maxvmsizetask,avevmsize,maxrss,maxrssnode,maxrsstask,averss,maxpages,maxpagesnode,maxpagestask,avepages,mincpu,mincpunode,mincputask,avecpu,ntasks,alloccpus,elapsed,state,exitcode,maxdiskread,maxdiskreadnode,maxdiskreadtask,avediskread,maxdiskwrite,maxdiskwritenode,maxdiskwritetask,avediskwrite

-L, --allclusters

Display jobs run on all clusters. By default, only jobs run on the cluster from which sacct is called are displayed.

-M cluster_list, --clusters=cluster_list

Displays the statistics only for the jobs started on the clusters specified by the cluster_list operand, which is a comma-separated list of clusters. Space characters are not allowed in the cluster_list. Use -1 for all clusters. The default is the cluster on which the sacct command is executed.

-n, --noheader

No heading will be added to the output. The default action is to display a header.

-N node_list, --nodelist=node_list

Display jobs that ran on any of these node(s). node_list can be a ranged string.

--name=jobname_list

Display jobs that have any of these name(s).

-o, --format

Comma-separated list of fields (use "--helpformat" for a list of available fields).

NOTE: When using the format option for listing various fields you can put a %NUMBER afterwards to specify how many characters should be printed.

e.g. format=name%30 will print 30 characters of field name right justified. A %-30 will print 30 characters left justified.
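The effect of the %NUMBER modifier can be mimicked with a few lines of Python. This is a rough sketch of the justification rule described in the note above, not sacct's own code; the helper name render_field is hypothetical, and cutting long values to the first NUMBER characters is an assumption about how the width limit is applied.

```python
def render_field(value, spec):
    """Mimic the %NUMBER modifier; spec is e.g. "name%30" or "name%-30"."""
    name, _, width_text = spec.partition("%")
    if not width_text:
        return value              # no modifier: value left unchanged in this sketch
    width = int(width_text)
    text = value[:abs(width)]     # print at most |width| characters
    # Positive widths right-justify, negative widths left-justify.
    return text.rjust(width) if width > 0 else text.ljust(-width)
```

For instance, render_field("script01", "name%30") pads the value on the left to 30 characters, and "name%-30" pads it on the right.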

When set, the SACCT_FORMAT environment variable will override the default format. For example:

SACCT_FORMAT="jobid,user,account,cluster"

-p, --parsable

Output will be '|' delimited, with a '|' at the end of each line.

-P, --parsable2

Output will be '|' delimited, without a '|' at the end of each line.
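Delimited output of this kind is straightforward to consume from a script. The following Python sketch assumes the first line is the header row and uses a made-up sample string rather than real sacct output; the function name parse_parsable2 is hypothetical.

```python
def parse_parsable2(text):
    """Turn '|'-delimited output (header line first) into a list of dicts."""
    lines = [line for line in text.splitlines() if line]
    if not lines:
        return []
    header = lines[0].split("|")
    return [dict(zip(header, line.split("|"))) for line in lines[1:]]

# Made-up sample in the shape described for --parsable2.
sample = "JobID|State|ExitCode\n2|RUNNING|0:0\n4.0|COMPLETED|0:0\n"
rows = parse_parsable2(sample)
```

With --parsable instead, each line would end in an extra '|', so a plain split would yield one additional empty trailing field that the caller has to discard.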

-q, --qos

Only send data about jobs using these qos. Default is all.

-r, --partition

Comma separated list of partitions to select jobs and job steps from. The default is all partitions.

-s state_list , --state=state_list

Selects jobs based on their state during the time period given. Unless otherwise specified, the start and end time will be the current time when the --state option is specified and only currently running jobs can be displayed. A start and/or end time must be specified to view information about jobs not currently running. The following state designators are valid and multiple state names may be specified using comma separators. Either the short or long form of the state name may be used (e.g. CA or CANCELLED) and the name is case insensitive (e.g. ca and CA both work).

BF BOOT_FAIL

Job terminated due to launch failure, typically due to a hardware failure (e.g. unable to boot the node or block and the job can not be requeued).

CA CANCELLED

Job was explicitly cancelled by the user or system administrator. The job may or may not have been initiated.

CD COMPLETED

Job has terminated all processes on all nodes.

CF CONFIGURING

Job has been allocated resources, but is waiting for them to become ready for use (e.g. booting).

CG COMPLETING

Job is in the process of completing. Some processes on some nodes may still be active.

F FAILED

Job terminated with non-zero exit code or other failure condition.

NF NODE_FAIL

Job terminated due to failure of one or more allocated nodes.

PD PENDING

Job is awaiting resource allocation. Note: for a job to be selected in this state, it must have "EligibleTime" in the requested time interval or different from "Unknown". The "EligibleTime" is displayed by the "scontrol show job" command. For example, jobs submitted with the "--hold" option will have "EligibleTime=Unknown" as they are pending indefinitely.

PR PREEMPTED

Job terminated due to preemption.

R RUNNING

Job currently has an allocation.

RS RESIZING

Job is about to change size.

S SUSPENDED

Job has an allocation, but execution has been suspended.

TO TIMEOUT

Job terminated upon reaching its time limit.

  • The state_list operand is a comma-separated list of these state designators. Space characters are not allowed in the state_list. NOTE: When specifying states and no start time is given, the default starttime is 'now'.
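The short and long state designators listed above, and the case-insensitive matching, can be illustrated with a small Python lookup. This is a sketch for scripts that want to validate a state_list before calling sacct; the names STATES and normalize_states are hypothetical and not part of SLURM.

```python
# Short designator -> long state name, as listed in this section.
STATES = {
    "BF": "BOOT_FAIL", "CA": "CANCELLED", "CD": "COMPLETED",
    "CF": "CONFIGURING", "CG": "COMPLETING", "F": "FAILED",
    "NF": "NODE_FAIL", "PD": "PENDING", "PR": "PREEMPTED",
    "R": "RUNNING", "RS": "RESIZING", "S": "SUSPENDED", "TO": "TIMEOUT",
}

def normalize_states(state_list):
    """Expand a comma-separated state_list to long state names."""
    names = []
    for item in state_list.split(","):
        key = item.upper()            # designators are case insensitive
        if key in STATES:
            names.append(STATES[key])
        elif key in STATES.values():
            names.append(key)
        else:
            raise ValueError("unknown state designator: " + item)
    return names
```

For example, normalize_states("ca,COMPLETED,r") yields the long names CANCELLED, COMPLETED and RUNNING.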

-S, --starttime

Select jobs in any state after the specified time. Default is 00:00:00 of the current day, unless '-s' is set, in which case the default is 'now'. If states are given with the '-s' option, then only jobs in this state at this time will be returned.

Valid time formats are...

HH:MM[:SS] [AM|PM]

MMDD[YY] or MM/DD[/YY] or MM.DD[.YY]

MM/DD[/YY]-HH:MM[:SS]

YYYY-MM-DD[THH:MM[:SS]]
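When sacct is driven from a script, the YYYY-MM-DDTHH:MM:SS form is the easiest of these to generate. The Python sketch below builds a pair of -S and -E arguments; the helper names sacct_time and window_args are hypothetical, and the code only constructs the argument list without running sacct.

```python
from datetime import datetime, timedelta

def sacct_time(dt):
    """Format a datetime as YYYY-MM-DDTHH:MM:SS, one of the listed forms."""
    return dt.strftime("%Y-%m-%dT%H:%M:%S")

def window_args(end, hours):
    """Build -S/-E arguments covering the given number of hours before end."""
    start = end - timedelta(hours=hours)
    return ["-S", sacct_time(start), "-E", sacct_time(end)]
```

The resulting list could be appended to a command line such as ["sacct", "-X"] before handing it to a process launcher.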

-T, --truncate

Truncate time. If a job started before --starttime, its start time is truncated to --starttime. The same applies to the end time and --endtime.

-u uid_list, --uid=uid_list, --user=user_list

Use this comma separated list of uids or user names to select jobs to display. By default, the running user's uid is used.

--usage

Display a command usage summary.

-v, --verbose

Primarily for debugging purposes, report the state of various variables during processing.

-V, --version

Print version.

-W wckey_list, --wckeys=wckey_list

Displays the statistics only for the jobs started on the wckeys specified by the wckey_list operand, which is a comma-separated list of wckey names. Space characters are not allowed in the wckey_list. Default is all wckeys.

-x associd_list, --associations=assoc_list

Displays the statistics only for the jobs running under the association ids specified by the assoc_list operand, which is a comma-separated list of association ids. Space characters are not allowed in the assoc_list. Default is all associations.

-X, --allocations

Only show cumulative statistics for each job, not the intermediate steps.

Job Accounting Fields

The following describes each job accounting field:

ALL

Print all fields listed below.

AllocCPUs

Count of allocated CPUs. Equivalent to NCPUs.

account

Account the job ran under.

associd

Reference to the association of user, account and cluster.

AveCPU

Average (system + user) CPU time of all tasks in job.

AveCPUFreq

Average weighted CPU frequency of all tasks in job, in kHz.

AveDiskRead

Average number of bytes read by all tasks in job.

AveDiskWrite

Average number of bytes written by all tasks in job.

AvePages

Average number of page faults of all tasks in job.

AveRSS

Average resident set size of all tasks in job.

AveVMSize

Average Virtual Memory size of all tasks in job.

blockid

Block ID, applicable to BlueGene computers only.

cluster

Cluster name.

Comment

The job's comment string when the AccountingStoreJobComment parameter in the slurm.conf file is set (or defaults) to YES. The Comment string can be modified by invoking sacctmgr modify job or the specialized sjobexitmod command.

ConsumedEnergy

Total energy consumed by all tasks in job, in joules. Note: This value reflects the job's real energy consumption only in the case of an exclusive job allocation.

CPUTime

Formatted (Elapsed time * CPU) count used by a job or step.

CPUTimeRaw

Unformatted (Elapsed time * CPU) count for a job or step, unlike CPUTime above. Units are cpu-seconds.

DerivedExitCode

The highest exit code returned by the job's job steps (srun invocations). Following the colon is the signal that caused the process to terminate if it was terminated by a signal. The DerivedExitCode can be modified by invoking sacctmgr modify job or the specialized sjobexitmod command.

elapsed

The job's elapsed time.

  • The format of this field's output is as follows:

    [DD-[hh:]]mm:ss

  • as defined by the following:

    DD

    days

    hh

    hours

    mm

    minutes

    ss

    seconds
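A script that post-processes sacct output often needs this value as a number of seconds. The following Python sketch converts the format above, and also tolerates a fractional seconds part such as the CPU time fields may carry. The function name elapsed_to_seconds is hypothetical and the parsing rules are inferred from the format description only.

```python
def elapsed_to_seconds(text):
    """Convert [DD-[hh:]]mm:ss (optionally with .microseconds) to seconds."""
    days = 0
    if "-" in text:
        day_text, text = text.split("-", 1)
        days = int(day_text)
    parts = text.split(":")
    seconds = float(parts[-1])
    minutes = int(parts[-2]) if len(parts) >= 2 else 0
    hours = int(parts[-3]) if len(parts) >= 3 else 0
    return days * 86400 + hours * 3600 + minutes * 60 + seconds
```

Multiplying the result by the number of allocated CPUs gives a figure comparable to CPUTimeRAW, which is described as (Elapsed time * CPU) in cpu-seconds.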

eligible

When the job became eligible to run.

end

Termination time of the job. The output format is YYYY-MM-DDTHH:MM:SS, unless changed through the SLURM_TIME_FORMAT environment variable.

exitcode

The exit code returned by the job script or salloc, typically as set by the exit() function. Following the colon is the signal that caused the process to terminate if it was terminated by a signal.
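The "code:signal" layout described above can be split with a short Python helper. This is a sketch under the assumption that both parts are decimal integers; the function name parse_exit_code is hypothetical, and a value without a colon is treated as having no signal.

```python
def parse_exit_code(text):
    """Split "code:signal" into two integers; a missing signal counts as 0."""
    code_text, _, signal_text = text.partition(":")
    return int(code_text), int(signal_text or 0)
```

A result such as (0, 9) would then indicate a process that was terminated by signal 9 rather than by calling exit().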

gid

The group identifier of the user who ran the job.

group

The group name of the user who ran the job.

JobID

The number of the job or job step. It is in the form job.jobstep.

jobname

The name of the job or job step. The slurm_accounting.log file is space delimited, so any space in a job name is replaced with an underscore before the record is written to the accounting file. As a result, sacct displays such a job name with underscores in place of the spaces.

layout

What the layout of a step was when it was running. This can be used to give you an idea of which node ran which rank in your job.

MaxDiskRead

Maximum number of bytes read by all tasks in job.

MaxDiskReadNode

The node on which the maxdiskread occurred.

MaxDiskReadTask

The task ID where the maxdiskread occurred.

MaxDiskWrite

Maximum number of bytes written by all tasks in job.

MaxDiskWriteNode

The node on which the maxdiskwrite occurred.

MaxDiskWriteTask

The task ID where the maxdiskwrite occurred.

MaxPages

Maximum number of page faults of all tasks in job.

MaxPagesNode

The node on which the maxpages occurred.

MaxPagesTask

The task ID where the maxpages occurred.

MaxRSS

Maximum resident set size of all tasks in job.

MaxRSSNode

The node on which the maxrss occurred.

MaxRSSTask

The task ID where the maxrss occurred.

MaxVMSize

Maximum Virtual Memory size of all tasks in job.

MaxVMSizeNode

The node on which the maxvmsize occurred.

MaxVMSizeTask

The task ID where the maxvmsize occurred.

MinCPU

Minimum (system + user) CPU time of all tasks in job.

MinCPUNode

The node on which the mincpu occurred.

MinCPUTask

The task ID where the mincpu occurred.

ncpus

Count of allocated CPUs. Equivalent to AllocCPUs.

Total number of CPUs allocated to the job.

nodelist

List of nodes in job/step.

nnodes

Number of nodes in a job or step.

NTasks

Total number of tasks in a job or step.

priority

Slurm priority.

partition

Identifies the partition on which the job ran.

qos

Name of Quality of Service.

qosraw

Id of Quality of Service.

ReqCPUFreq

Requested CPU frequency for the step, in kHz. Note: This value applies only to a job step. No value is reported for the job.

reqcpus

Required CPUs.

ReqMem

Minimum required memory for the job, in MB. A 'c' at the end of number represents Memory Per CPU, a 'n' represents Memory Per Node. Note: This value is only from the job allocation, not the step.

reserved

How much wall clock time was used as reserved time for this job. This is derived from how long a job was waiting from eligible time to when it actually started.

resvcpu

Formatted time for how long (cpu secs) a job was reserved for.

resvcpuraw

Reserved CPU time in seconds, not formatted.

start

Initiation time of the job in the same format as end.

state

Displays the job status, or state.

Output can be RUNNING, RESIZING, SUSPENDED, COMPLETED, CANCELLED, FAILED, TIMEOUT, PREEMPTED, BOOT_FAIL or NODE_FAIL. If more information is available on the job state than will fit into the current field width (for example, the uid that CANCELLED a job) the state will be followed by a "+". You can increase the size of the displayed state using the "%NUMBER" format modifier described earlier.

submit

The time and date stamp (in Universal Time Coordinated, UTC) the job was submitted. The format of the output is identical to that of the end field.

NOTE: If a job is requeued, the submit time is reset. To obtain the original submit time, it is necessary to use the -D or --duplicates option to display all duplicate entries for a job.

suspended

How long the job was suspended for.

SystemCPU

The amount of system CPU time used by the job or job step. The format of the output is identical to that of the elapsed field.

NOTE: SystemCPU provides a measure of the task's parent process and does not include CPU time of child processes.

timelimit

What the timelimit was/is for the job.

TotalCPU

The sum of the SystemCPU and UserCPU time used by the job or job step. The total CPU time of the job may exceed the job's elapsed time for jobs that include multiple job steps. The format of the output is identical to that of the elapsed field.

NOTE: TotalCPU provides a measure of the task's parent process and does not include CPU time of child processes.

uid

The user identifier of the user who ran the job.

user

The user name of the user who ran the job.

UserCPU

The amount of user CPU time used by the job or job step. The format of the output is identical to that of the elapsed field.

NOTE: UserCPU provides a measure of the task's parent process and does not include CPU time of child processes.

wckey

Workload Characterization Key. Arbitrary string for grouping orthogonal accounts together.

wckeyid

Reference to the wckey.

ENVIRONMENT VARIABLES

Some sacct options may be set via environment variables. These environment variables, along with their corresponding options, are listed below. (Note: Command-line options will always override these settings.)

SLURM_TIME_FORMAT

Specify the format used to report time stamps. A value of standard, the default value, generates output in the form "year-month-dateThour:minute:second". A value of relative returns only "hour:minute:second" if the current day. For other dates in the current year it prints the "hour:minute" preceded by "Tomorr" (tomorrow), "Ystday" (yesterday), the name of the day for the coming week (e.g. "Mon", "Tue", etc.), otherwise the date (e.g. "25 Apr"). For other years it returns a date month and year without a time (e.g. "6 Jun 2012"). All of the time stamps use a 24 hour format.

A valid strftime() format can also be specified. For example, a value of "%a %T" will report the day of the week and a time stamp (e.g. "Mon 12:34:56").
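The effect of such a format string can be previewed with Python's strftime, which uses the same conversion codes as the C function. The date below is an arbitrary example and the variable names are only for illustration; whether sacct renders a given code identically on every platform is not verified here.

```python
from datetime import datetime

stamp = datetime(2012, 6, 4, 12, 34, 56)   # arbitrary example time, a Monday

# "%T" is shorthand for "%H:%M:%S" in C strftime(); the long form is used
# here because Python hands format codes to the platform library.
custom = stamp.strftime("%a %H:%M:%S")
standard = stamp.strftime("%Y-%m-%dT%H:%M:%S")
```

Here custom has the same shape as the "%a %T" example above, and standard corresponds to the default "standard" value.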

EXAMPLES

This example illustrates the default invocation of the sacct command:

# sacct
Jobid      Jobname    Partition    Account AllocCPUS State     ExitCode
---------- ---------- ---------- ---------- ---------- ---------- --------
2          script01   srun       acct1               1 RUNNING           0
3          script02   srun       acct1               1 RUNNING           0
4          endscript  srun       acct1               1 RUNNING           0
4.0                   srun       acct1               1 COMPLETED         0

This example shows the same job accounting information with the brief option.

# sacct --brief
     Jobid     State  ExitCode
---------- ---------- --------
2          RUNNING           0
3          RUNNING           0
4          RUNNING           0
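4.0        COMPLETED         0

This example uses the --allocations option to show only the cumulative statistics for each job, without the intermediate job steps.

# sacct --allocations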
Jobid      Jobname    Partition Account    AllocCPUS  State     ExitCode
---------- ---------- ---------- ---------- ------- ---------- --------
3          sja_init   andy       acct1            1 COMPLETED         0
4          sjaload    andy       acct1            2 COMPLETED         0
5          sja_scr1   andy       acct1            1 COMPLETED         0
6          sja_scr2   andy       acct1           18 COMPLETED         2
7          sja_scr3   andy       acct1           18 COMPLETED         0
8          sja_scr5   andy       acct1            2 COMPLETED         0
9          sja_scr7   andy       acct1           90 COMPLETED         1
10         endscript  andy       acct1          186 COMPLETED         0

This example demonstrates the ability to customize the output of the sacct command. The fields are displayed in the order designated on the command line.

# sacct --format=jobid,elapsed,ncpus,ntasks,state
     Jobid    Elapsed      Ncpus   Ntasks     State
---------- ---------- ---------- -------- ----------
3            00:01:30          2        1 COMPLETED
3.0          00:01:30          2        1 COMPLETED
4            00:00:00          2        2 COMPLETED
4.0          00:00:01          2        2 COMPLETED
5            00:01:23          2        1 COMPLETED
5.0          00:01:31          2        1 COMPLETED

This example demonstrates the use of the -T (--truncate) option when used with -S (--starttime) and -E (--endtime). When the -T option is used, the start time of the job will be the specified -S value if the job was started before the specified time, otherwise the time will be the job's start time. The end time will be the specified -E option if the job ends after the specified time, otherwise it will be the job's end time.

NOTE: If no -s (--state) option is given, sacct will display jobs that ran during the specified time, otherwise it returns jobs that were in the state requested during that period of time.

Without -T (normal operation) sacct output would be like this.

# sacct -S2014-07-03-11:40 -E2014-07-03-12:00 -X -ojobid,start,end,state
    JobID                 Start                  End        State
--------- --------------------- -------------------- ------------
2         2014-07-03T11:33:16   2014-07-03T11:59:01   COMPLETED
3         2014-07-03T11:35:21   Unknown               RUNNING
4         2014-07-03T11:35:21   2014-07-03T11:45:21   COMPLETED
5         2014-07-03T11:41:01   Unknown               RUNNING

By adding the -T option the job's start and end times are truncated to reflect only the time requested. If a job started after the start time requested or finished before the end time requested those times are not altered. The -T option is useful when determining exact run times during any given period.

# sacct -T -S2014-07-03-11:40 -E2014-07-03-12:00 -X -ojobid,start,end,state
    JobID                 Start                  End        State
--------- --------------------- -------------------- ------------
2         2014-07-03T11:40:00   2014-07-03T11:59:01   COMPLETED
3         2014-07-03T11:40:00   2014-07-03T12:00:00   RUNNING
4         2014-07-03T11:40:00   2014-07-03T11:45:21   COMPLETED
5         2014-07-03T11:41:01   2014-07-03T12:00:00   RUNNING
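The truncation rule shown in the two listings can be expressed as a simple clamp. The Python sketch below is an illustration of the described behavior rather than sacct's code; the function name truncate is hypothetical, and representing a still-running job's "Unknown" end time as None is an assumption of this example.

```python
from datetime import datetime

def truncate(start, end, window_start, window_end):
    """Clamp a job's start/end to the requested window, as -T is described.

    end is None for a job that is still running (shown as Unknown).
    """
    shown_start = max(start, window_start)
    shown_end = window_end if end is None else min(end, window_end)
    return shown_start, shown_end

S = datetime(2014, 7, 3, 11, 40)   # value given to -S in the example
E = datetime(2014, 7, 3, 12, 0)    # value given to -E in the example
```

Applied to job 2 from the listing (started 11:33:16, ended 11:59:01), the start is moved to 11:40:00 and the end is left alone, matching the truncated output above.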


COPYING

Copyright (C) 2005-2007 Hewlett-Packard Development Company L.P.

Copyright (C) 2008-2010 Lawrence Livermore National Security. Produced at Lawrence Livermore National Laboratory (cf, DISCLAIMER).

Copyright (C) 2010-2014 SchedMD LLC.

This file is part of SLURM, a resource management program. For details, see <http://slurm.schedmd.com/>.

SLURM is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version.

SLURM is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

FILES

/etc/slurm.conf

Entries to this file enable job accounting and designate the job accounting log file that collects system job accounting.

/var/log/slurm_accounting.log

The default job accounting log file. By default, this file is set to read and write permission for root only.

RELATED TO sacct…

sstat(1), ps(1), srun(1), squeue(1), getrusage(2), time(2)