Resource configuration

The resource configuration (resconfig) describes all resources used in WFEngine: the computing resources provided by a batch system such as Slurm, the Launchpad databases (MongoDB), and the runtime environments. The resconfig includes a list of workers and a default worker. Each worker is mapped to a computing cluster or computer used by WFEngine. If the cluster or computer runs Slurm, its queues and computing resources are configured as well.

Location of resource configuration file

The default location of the resource configuration is $HOME/.fireworks/res_config.yaml. This path can be overridden by setting the environment variable RESCONFIG_LOC. The actual path is returned by the function resconfig.get_resconfig_loc().
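The lookup order described above can be illustrated with a small stand-alone sketch. The function resolve_resconfig_loc() is a hypothetical stand-in written for this example, not the library's get_resconfig_loc(). It assumes only that the environment variable, when set, takes precedence over the default path.

```python
import os


def resolve_resconfig_loc(environ=None):
    """Return the resconfig path: RESCONFIG_LOC if set, else the default.

    Hypothetical stand-in for illustration; not the library implementation.
    """
    environ = os.environ if environ is None else environ
    default = os.path.join(os.path.expanduser('~'), '.fireworks', 'res_config.yaml')
    return environ.get('RESCONFIG_LOC', default)


# The override wins when the variable is present.
custom = resolve_resconfig_loc({'RESCONFIG_LOC': '/tmp/my_res_config.yaml'})
# Without the variable, the default under the home directory is used.
fallback = resolve_resconfig_loc({})
```

Passing the environment as an argument keeps the sketch easy to try out without modifying the real process environment.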

Create a resource configuration from scratch

Whether a resconfig file already exists can be checked as follows:

from os import path
from virtmat.middleware.resconfig import get_resconfig_loc
path.exists(get_resconfig_loc()) # False

If, as in the example, a resconfig does not yet exist, a new one can be created and written to the default resconfig location:

from virtmat.middleware.resconfig import ResConfig, set_defaults_from_guess
from virtmat.middleware.resconfig import get_resconfig_loc

cfg = ResConfig.from_scratch()
set_defaults_from_guess(cfg.default_worker)
cfg.to_file(get_resconfig_loc())

This creates one worker configuration and sets it as the default.

Add a worker to an existing resconfig

If a resconfig with configured workers already exists, one can add a worker for the current system:

from os import path
from virtmat.middleware.resconfig import ResConfig, get_resconfig_loc

path.exists(get_resconfig_loc()) # True
cfg = ResConfig.from_file(get_resconfig_loc())
cfg.add_worker_from_scratch(default=False)
cfg.to_file(get_resconfig_loc())

To make the new worker the default worker, call add_worker_from_scratch() with default=True.

Worker configuration

Every computer or computing cluster is represented by a worker. The worker has a name, a list of queues and a list of group names that can be used for accounting when a job is submitted to some queue. A worker configuration can be created using the WorkerConfig class:

from virtmat.middleware.resconfig import WorkerConfig
wcfg = WorkerConfig(name='w_name')

In this case, the queues and the accounts have to be set manually. A quicker way to configure a worker is to use the class method from_scratch():

wcfg = WorkerConfig.from_scratch()

In both cases, one has to set the default queue and the default account:

# set the first queue as default:
wcfg.set_default_queue()
# set the current group (guid) as default:
wcfg.set_default_account()

One can also set a queue and a group explicitly:

wcfg.default_queue = qcfg
wcfg.default_account = 'group_name'

If qcfg is not in the list queues, or group_name is not in the list accounts, a ResourceConfigurationError is raised.
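The check on explicit defaults can be pictured as a property setter that refuses values missing from the corresponding list. The class WorkerSketch and the exception defined below are hypothetical stand-ins written for illustration. The real WorkerConfig may implement the check differently.

```python
class ResourceConfigurationError(Exception):
    """Stand-in for the library's exception of the same name."""


class WorkerSketch:
    """Hypothetical, simplified worker holding accounts and a default."""

    def __init__(self, name, accounts):
        self.name = name
        self.accounts = list(accounts)
        self._default_account = None

    @property
    def default_account(self):
        return self._default_account

    @default_account.setter
    def default_account(self, account):
        # Only accounts already known to the worker may become the default.
        if account not in self.accounts:
            raise ResourceConfigurationError(f'unknown account: {account}')
        self._default_account = account


worker = WorkerSketch('w_name', accounts=['group_a', 'group_b'])
worker.default_account = 'group_a'      # accepted
try:
    worker.default_account = 'group_x'  # not in worker.accounts
except ResourceConfigurationError as err:
    rejected = str(err)
```

After the failed assignment the previous default ('group_a') is still in place, because the setter raises before changing anything.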

The from_scratch() method also sets up a list of the environment modules provided on the computing cluster. This list is accessible via the modules attribute.

Queue configuration

The queue configuration accommodates computing resource configurations (attribute resources). Further attributes are name, public, accounts_allow, accounts_deny, and groups_allow. A queue configuration is created using the QueueConfig class:

qcfg = QueueConfig()

With the get_resource() method a specific resource configuration can be retrieved:

print(qcfg.get_resource('time'))

A resource can be set with the set_resource() method, e.g. to set the default walltime to 5 minutes:

qcfg.set_resource('time', 'default', 5)

If the resource does not exist, it is created and added to the list of resources in the queue. The second argument, the resource type, must be one of minimum, maximum or default. If any other keyword is used, a ValueError is raised.
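The behaviour just described (create on first use, reject unknown resource types) can be sketched without the library. The function below is a hypothetical illustration that stores resources as plain dictionaries. It is not the QueueConfig implementation, and the error message is invented for this example.

```python
RESOURCE_TYPES = ('minimum', 'maximum', 'default')


def set_resource(resources, name, res_type, value):
    """Set one limit of a named resource, creating the resource if needed.

    `resources` is a plain list of dicts, used here instead of the
    library's resource objects.
    """
    if res_type not in RESOURCE_TYPES:
        raise ValueError(f'resource type must be one of {RESOURCE_TYPES}')
    for res in resources:
        if res['name'] == name:
            break
    else:
        # No resource with this name yet: create and register it.
        res = {'name': name, 'minimum': None, 'maximum': None, 'default': None}
        resources.append(res)
    res[res_type] = value


resources = []
set_resource(resources, 'time', 'default', 5)    # creates the 'time' entry
set_resource(resources, 'time', 'maximum', 100)  # updates the same entry
```

The type check runs before the lookup, so an invalid keyword never creates an empty resource as a side effect.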

With the validate_resource() method, a resource value can be validated, e.g. to check whether a time of 4000 minutes may be requested in this queue:

qcfg.validate_resource('time', 4000)

If the resource value exceeds the limits, a ResourceConfigurationError is raised.
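A limit check of this kind can be sketched as follows. The function and the exception class are stand-ins defined for this example only. Treating a limit of None as unbounded, and treating both limits as inclusive, are assumptions of the sketch and not documented behaviour of the library.

```python
class ResourceConfigurationError(Exception):
    """Stand-in for the library's exception of the same name."""


def validate_resource(value, minimum=None, maximum=None):
    """Raise if `value` lies outside the closed interval [minimum, maximum].

    A limit of None means that side is unbounded (an assumption of this sketch).
    """
    if minimum is not None and value < minimum:
        raise ResourceConfigurationError(f'{value} is below the minimum {minimum}')
    if maximum is not None and value > maximum:
        raise ResourceConfigurationError(f'{value} is above the maximum {maximum}')


validate_resource(60, minimum=1, maximum=100)  # within limits, nothing happens
try:
    validate_resource(4000, minimum=1, maximum=100)
except ResourceConfigurationError as err:
    message = str(err)
```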

Computing resource configuration

This configuration describes a single resource. The public attributes are name, minimum, maximum and default. Here is an example of creating a resource and adding it to the list of resources of a queue:

from virtmat.middleware.resconfig import ResourceConfig
time = ResourceConfig('time')
time.minimum = 1
time.maximum = 100
time.default = 5
qcfg.resources.append(time)

Configuration of environment modules

Environment modules are commonly used on computing clusters to set up the run-time environment of various software. Thus they can be used as a discovery source (registry) for software deployed on the cluster. The attributes of the ModuleConfig class are:

  • prefix (str): module prefix, if module name has prefix, else None

  • name (str): module name, may not be None

  • versions ([str]): available module versions, empty list if no versions specified

  • path (str): path to module file, if different from default modules path, else None

Example: The module files chem/gromacs/2022.6 and chem/gromacs/2022.5, both located in the default modules path, are represented in resconfig as the object mcfg = ModuleConfig(prefix='chem', name='gromacs', versions=['2022.6', '2022.5']). The command module load chem/gromacs/2022.6 can be retrieved with mcfg.get_command('gromacs', '==2022.6'). If no version matches, None is returned, e.g. by mcfg.get_command('gromacs', '>2022.6').
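How a version specifier such as '==2022.6' could be matched against the stored versions is sketched below. This is a simplified, hypothetical re-implementation with its own signature, not the code behind ModuleConfig.get_command(). In particular, the set of supported operators, the numeric comparison of dotted versions, and the choice of the newest version when several match are assumptions of this sketch.

```python
import operator
import re

OPERATORS = {'==': operator.eq, '>=': operator.ge, '<=': operator.le,
             '>': operator.gt, '<': operator.lt}


def version_key(version):
    """Turn '2022.6' into (2022, 6) so versions compare numerically."""
    return tuple(int(part) for part in version.split('.'))


def get_command(prefix, name, versions, specifier):
    """Return a 'module load' command for the best match, or None."""
    match = re.fullmatch(r'(==|>=|<=|>|<)\s*(\d+(?:\.\d+)*)', specifier)
    if match is None:
        raise ValueError(f'unsupported specifier: {specifier}')
    compare, wanted = OPERATORS[match.group(1)], version_key(match.group(2))
    matching = [v for v in versions if compare(version_key(v), wanted)]
    if not matching:
        return None
    best = max(matching, key=version_key)  # assumption: prefer the newest
    path = '/'.join(part for part in (prefix, name, best) if part)
    return f'module load {path}'


versions = ['2022.6', '2022.5']
exact = get_command('chem', 'gromacs', versions, '==2022.6')
newer = get_command('chem', 'gromacs', versions, '>2022.6')
at_least = get_command('chem', 'gromacs', versions, '>=2022.5')
```

Comparing versions as integer tuples rather than as strings avoids ordering mistakes such as '2022.10' sorting before '2022.6'.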