Overview
This guide covers how to integrate the CAD Job Manager with common job schedulers. After following these steps, running a simulation on a cluster is as seamless as running it on your local computer. When you click “Run”, the job is automatically submitted to the job scheduler's queue (the file is transferred if necessary), the Lumerical Job Manager periodically updates with the simulation's current state and progress, and, upon completion, the results are automatically loaded into your current CAD session.
See also: Job scheduler submission scripts (SGE, Slurm, Torque) for additional details and sample submission scripts for the supported job schedulers.
Known Limitations
- The job manager options Quit & Don't Save and Quit & Save are currently not supported. This applies both to single simulations and to parameter sweeps that have a job scheduler as a resource.
- When Force Quit is clicked, the submitted jobs are not canceled and need to be canceled manually.
- When using the file upload and download over SSH functionality, transfer status is not reflected in the Job Manager status view. The Job Manager window will remain open while the file is being downloaded in the background and close once this process is completed.
Requirements
- Lumerical Products 2020a R3 (or newer)
- Ansys Lumerical 2023 R2.2 (or newer) for user-based configuration
- Cluster with one of the following job schedulers: Slurm, Torque, LSF, or SGE. The cluster also needs an X11 display and Lumerical products installed with a license server configured. If using AWS ParallelCluster, see the AWS ParallelCluster documentation for details.
Resource Configuration
To configure a cluster resource, follow these steps:
- Open the resource configuration window, and add a new resource.
- Set the command, submission script, and any additional environment variables. See the notes on these settings in the section below. Typically, you need to modify the number of nodes, the processes per node, and the setup script so that they are appropriate for your cluster setup, simulation environment, and simulation needs.
- Press “OK” to confirm the changes.
Resource configuration for a cluster is now complete, and you can proceed with simulation.
Notes on advanced settings for clusters
Keep the following points in mind when filling out the fields in the resource configuration advanced options window:
- The selected job scheduler preset populates fields with example settings and might not apply to your cluster.
- You can use the macros {NUM_NODES}, {NUM_PROC}, and {NUM_THREADS} to refer to values in the resource configuration window. The macros correspond to the following values:
  - {NUM_NODES} – The “IP/Hostname/Nodes” column in the resource configuration window. Numbers and strings are supported.
  - {NUM_PROC} – The “Processes” column in the resource configuration window. Only numbers are supported.
  - {NUM_THREADS} – The “Threads” column in the resource configuration window. Only numbers are supported.
- You can use the {PROJECT_FILE_PATH} macro to refer to the current project file path.
- When submitting jobs using Slurm, please ensure flags to request resources such as CPU, GPU, and Memory on the compute cluster are set (e.g., through --cpus-per-task, --gpus-per-node, and --mem). Otherwise, the simulation may fail or may not use all available hardware. For more information regarding flags to request resources, please refer to Slurm documentation of sbatch.
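As an illustration of how the macros above might expand, the following Python sketch substitutes values into a Slurm submission command. The `expand_macros` helper, the command layout, the job script name `run_fdtd.sh`, and the project path are all hypothetical and are not part of Lumerical. Only the macro names come from this guide, and `--nodes`, `--ntasks-per-node`, and `--cpus-per-task` are standard sbatch flags. Lumerical performs its own substitution internally, so treat this as a way to preview what a command could look like, not as the actual mechanism.

```python
def expand_macros(template: str, values: dict) -> str:
    """Replace each {MACRO} placeholder in the template with its value.

    Hypothetical helper for illustration only; it does not reproduce
    Lumerical's internal substitution logic.
    """
    result = template
    for name, value in values.items():
        result = result.replace("{" + name + "}", str(value))
    return result


# Assumed command layout; adapt the flags to what your cluster requires.
template = (
    "sbatch --nodes={NUM_NODES} --ntasks-per-node={NUM_PROC} "
    "--cpus-per-task={NUM_THREADS} run_fdtd.sh {PROJECT_FILE_PATH}"
)

command = expand_macros(
    template,
    {
        "NUM_NODES": 2,       # "IP/Hostname/Nodes" column
        "NUM_PROC": 8,        # "Processes" column
        "NUM_THREADS": 1,     # "Threads" column
        "PROJECT_FILE_PATH": "/home/user/sim/project.fsp",  # placeholder path
    },
)
print(command)
```

Previewing the expanded command this way can help confirm that the resource flags your cluster needs are present before you submit a job.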
Submitting jobs to your job scheduler from your local computer
If you would like to launch jobs from your local computer for a more seamless experience, configure Lumerical on your local computer following the same steps as above and enable SSH by setting up the file "job_scheduler_input.json". Configuration with a JSON file was introduced in the Lumerical 2023 R2.2 release. With earlier versions, you have to modify the .py file corresponding to your job scheduler.
JSON file template
The template (job_scheduler_input.json) can be found in the Lumerical installation folder:
- Windows (default install path): C:\Program Files\Lumerical\[[verpath]]\scripts\job_schedulers
- Linux (default install path): /opt/lumerical/[[verpath]]/scripts/job_schedulers
Contents of "job_scheduler_input.json":
{
  "user_name":"{your_username}",
  "use_ssh":1,
  "use_scp":1,
  "cluster_cwd":"/{path/to/simulation/files/on/the/cluster}/",
  "master_node_ip":"{head-node-ip}",
  "ssh_key":"~/.ssh/id_rsa",
  "path_translation": ["",""]
}
Using the JSON file
To use the JSON file, copy it to the Lumerical configuration folder in your user's home directory and edit it. The folder locations are shown below. Lumerical products automatically read from this file when launching a cluster job.
Note: Lumerical products look for the exact file name, so do not rename the JSON file after moving it.
Linux:
~/.config/Lumerical
Windows:
%APPDATA%\Lumerical
Fields of the JSON file
The table below explains each field of the JSON file.
| Parameter | Description |
|---|---|
| user_name | Your user name on the master node. If left empty, the user name will be dynamically assigned using |
| use_ssh | If set to 1, ssh* is used to run the submission command on the master node. |
| use_scp | If set to 1, the simulation files are copied to the master node using scp*. |
| cluster_cwd | Shared folder to which the simulation files are copied when use_scp=1. |
| master_node_ip | IP address or hostname of the master node used for the remote connection and job submission. |
| ssh_key | Location of your private ssh key for passwordless connection. |
| path_translation | Translates paths when a file system is shared between Windows and a Linux cluster. Use Unix-style path delimiters '/': |
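Before launching a job, it can help to confirm that the edited file still parses and contains every field. The Python sketch below is one way to do that. The `check_config` helper and the sample values (`jdoe`, `/shared/simulations/`, `192.0.2.10`) are hypothetical; the key names follow the template shown in this guide, and the type expectations are inferred from that template rather than from any published schema.

```python
import json

# Expected keys and value types, inferred from the template in this guide.
EXPECTED_FIELDS = {
    "user_name": str,
    "use_ssh": int,
    "use_scp": int,
    "cluster_cwd": str,
    "master_node_ip": str,
    "ssh_key": str,
    "path_translation": list,
}


def check_config(config: dict) -> list:
    """Return a list of human-readable problems found in the config."""
    problems = []
    for key, expected_type in EXPECTED_FIELDS.items():
        if key not in config:
            problems.append(f"missing field: {key}")
        elif not isinstance(config[key], expected_type):
            problems.append(f"{key} should be of type {expected_type.__name__}")
    # Copying files with scp needs a destination folder on the cluster.
    if config.get("use_scp") == 1 and not config.get("cluster_cwd"):
        problems.append("use_scp is 1 but cluster_cwd is empty")
    return problems


# Sample content with placeholder values (not a real cluster).
sample = json.loads("""
{
  "user_name": "jdoe",
  "use_ssh": 1,
  "use_scp": 1,
  "cluster_cwd": "/shared/simulations/",
  "master_node_ip": "192.0.2.10",
  "ssh_key": "~/.ssh/id_rsa",
  "path_translation": ["", ""]
}
""")

print(check_config(sample))
```

To check your own file, replace the inline sample with `json.load(open(path))`, where `path` points to job_scheduler_input.json in the configuration folder for your platform.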
Important
To submit jobs from your local computer to a Linux cluster, set use_ssh and use_scp to 1, and set master_node_ip to the head node's IP address. This method copies your simulation file to the remote server using scp*, launches the job with ssh*, and copies all the generated files back to your local computer at the end.
Note: On Windows, you need to ensure that SSH and SCP are added to the system's PATH. Depending on your Windows version, you can install Git Bash or OpenSSH for Windows.
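A quick way to confirm that both tools are reachable is to look them up on the PATH. The following Python sketch uses the standard library's `shutil.which`; the `find_missing_tools` helper is a hypothetical name introduced here for illustration and is not part of Lumerical.

```python
import shutil


def find_missing_tools(tools=("ssh", "scp")):
    """Return the tools from the given list that are not found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


missing = find_missing_tools()
if missing:
    print("Not found on PATH:", ", ".join(missing))
else:
    print("ssh and scp are available on PATH")
```

Run the check from the same environment that launches Lumerical, since the PATH seen by a different shell may not match.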
Results
You can now run any Lumerical simulation (single, sweeps, optimizations, etc.) directly from the CAD's Job Manager.