Overview
This guide covers how to integrate the CAD Job Manager with some common job schedulers. After following these steps, running a simulation on a cluster is just as seamless as running it on your local computer: you click "Run", the job is automatically submitted to the job scheduler's queue (the simulation file is transferred if necessary), the Lumerical Job Manager periodically updates with the simulation's current state and progress, and upon completion the results are automatically loaded into your current CAD session.
See also: Job scheduler submission scripts (SGE, Slurm, Torque) for additional details and sample submission scripts for the supported job schedulers.
Known Limitations
- The job manager options Quit, Quit & Save, and Force Quit are currently not supported.
- When using file upload and download over SSH, the transfer status is not reflected in the Job Manager status view. The Job Manager window remains open while files are transferred in the background and closes once the transfer completes.
Requirements
- Lumerical Products 2020a R3 (or newer)
- Ansys Lumerical 2023 R2.2 (or newer) for user-based configuration
- A cluster with one of the supported job schedulers (Slurm, Torque, LSF, or SGE), an X11 display, and Lumerical products installed with a configured license server. If using AWS ParallelCluster, see the AWS ParallelCluster documentation for details.
Resource Configuration
- Add or edit a resource, and set the capacity either to 0 (unlimited) if the queue is already configured to handle the number of available licenses, or to the maximum number of available engine licenses on the license server.
- Edit the advanced resource settings, and select your job scheduler from the "Job Launching Presets".
Notes
- The preset is automatically populated with example settings and might not apply to your cluster.
- Please modify the submission command with the proper number of nodes and processes per node, and update the submission script with a Lumerical compute environment appropriate for your cluster and simulation needs.
- When submitting jobs using Slurm, please ensure that flags requesting resources such as CPU, GPU, and memory on the compute cluster are set (e.g., through --cpus-per-task, --gpus-per-node, and --mem); otherwise, the simulation may fail or may not use all available hardware. For more information on resource-request flags, refer to the Slurm documentation for sbatch. A minimal example follows this list.
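For illustration only, a Slurm submission with explicit resource requests might look like the sketch below; the node counts, memory size, and script name (run_simulation.sh) are placeholders and not part of the Lumerical preset:

# Request 2 nodes, 8 tasks per node, 1 CPU per task, and 32 GB of memory per node (placeholder values)
sbatch --nodes=2 --ntasks-per-node=8 --cpus-per-task=1 --mem=32G run_simulation.sh
# Optional: request one GPU per node for GPU-enabled solvers (placeholder values)
sbatch --nodes=1 --gpus-per-node=1 --mem=32G run_simulation.sh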
Submitting jobs to your job scheduler from your local computer
Notes
- If you would like to launch jobs from your local computer for a more seamless experience, configure Lumerical on your local computer following the same steps as above and enable SSH by setting up the file "job_scheduler_input.json".
- Configuration via a JSON file was introduced with the Lumerical 2023 R2.2 release.
- With previous versions, you have to modify the .py file corresponding to your job scheduler instead.
The template (job_scheduler_input.json) can be found in the Lumerical installation folder:
- Windows: (default install path)
C:\Program Files\Lumerical\[[verpath]]\scripts\job_schedulers
- Linux: (default install path)
/opt/lumerical/[[verpath]]/scripts/job_schedulers
Contents of "job_scheduler_input.json":
{
    "user_name":"",
    "use_ssh":0,
    "use_scp":0,
    "cluster_cwd":"",
    "master_node_ip":"<master-node-ip>",
    "ssh_key":"~/.ssh/<private-key>.pem",
    "path_translation": ["",""]
}
Important
- Copy the file, "job_scheduler_input.json" in your user's "home" folder:
- Linux:
~/.config/Lumerical
- Windows:
%APPDATA%\Lumerical
- Linux:
- Then edit the file with your job scheduler settings:
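On Linux, putting the template in place might look like the following sketch, assuming the default install path listed above (replace [[verpath]] with your installed version folder):

mkdir -p ~/.config/Lumerical
cp /opt/lumerical/[[verpath]]/scripts/job_schedulers/job_scheduler_input.json ~/.config/Lumerical/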
Contents of "job_scheduler_input.json":
{
"user_name":"",
"use_ssh":0,
"use_scp":0,
"cluster_cwd":"",
"master_node_ip":"<master-node-ip>",
"ssh_key":"~/.ssh/<private-key>.pem",
"path_translation": ["",""]
}
| Parameter | Description |
|---|---|
| user_name | Your user name on the master node. If left empty, the user name is determined automatically (via Python's getpass). |
| use_ssh | If set to 1, ssh is used to run the submission command on the master node. |
| use_scp | If set to 1, the simulation files are copied to the master node using scp. |
| cluster_cwd | Shared folder on the cluster where the simulation files are copied when use_scp=1, e.g. "/cluster_path/of_simulationfile/". |
| master_node_ip | IP address or hostname of the master node used for the remote connection and job submission. |
| ssh_key | Location of your private SSH key for passwordless connection. |
| path_translation | Translates paths when there is a shared file system between Windows and a Linux cluster. Use Unix-style path delimiters '/': "path_translation": ["local_path/of_simulationfile/", "/cluster_path/of_simulationfile/"] |
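If passwordless login is not yet configured, a typical key setup from a Linux or macOS machine might look like the sketch below; the key file name, user, and host are placeholders, and your cluster may require a specific key (for example, an AWS .pem key):

# Generate a key pair (placeholder file name)
ssh-keygen -t ed25519 -f ~/.ssh/lumerical_cluster
# Copy the public key to the master node (placeholder user and host)
ssh-copy-id -i ~/.ssh/lumerical_cluster.pub <user>@<master-node-ip>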
To submit jobs from your local computer to a Linux cluster, set use_ssh and use_scp to 1, and set master_node_ip to the IP address or hostname of the master node. This method copies your simulation file to the remote server using scp, launches the job with ssh, and copies all generated files back to your local computer when the job completes.
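For illustration, a filled-in configuration for remote submission over SSH might look like the sketch below; the user name, host name, key file, and cluster path are placeholders, not values from your cluster:

{
    "user_name":"jdoe",
    "use_ssh":1,
    "use_scp":1,
    "cluster_cwd":"/home/jdoe/simulations/",
    "master_node_ip":"head-node.example.com",
    "ssh_key":"~/.ssh/cluster-key.pem",
    "path_translation": ["",""]
}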
Note: On Windows, you need to ensure that ssh and scp are available on the system's PATH. Depending on the Windows version, you can install Git Bash or OpenSSH for Windows. A quick check is shown below.
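From a Windows command prompt, the following commands simply report where the ssh and scp executables are found on the PATH, if anywhere:

where ssh
where scp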
Results
You can now run any Lumerical simulation (single, sweeps, optimizations, etc.) directly from the CAD's Job Manager.