Notes:
- Install Lumerical simulation suite on your cluster, preferably on a shared filesystem.
- Running multiple simulations on several computers simultaneously (concurrent computing) requires as many licenses as the number of computers running the simulations at the same time, e.g. #licenses = #nodes running jobs at the same time.
- Concurrent computing is currently supported by all products.
- Distributed computing is only available for FDTD and varFDTD.
Configure your firewall
- Many Linux clusters communicate across a private network and therefore firewall security may not be required. If no firewall is in use in your network, this step may be skipped.
- The MPI processes communicate using a range of ports. It's easiest to simply disable the firewall on all nodes. An alternate solution is to configure MPI to use a specific range of ports, then create exceptions for those ports. See your MPI's documentation for details.
- If you want to leave the firewall turned on, two additional firewall exceptions are required:
- In some configurations, MPI requires the use of the SSH programs to start remote processes on the compute nodes during parallel execution. Ensure that SSH port 22 is allowed to accept incoming TCP/IP connections on all of your compute nodes.
- Open/allow access to the TCP ports used by the license manager.
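For example, on nodes running firewalld, the exceptions might be created as shown below. The license manager port (27011) and the MPI port range (50000-51000) are placeholders only; substitute the TCP ports your license manager actually uses and a range that matches your MPI configuration (with Intel MPI, the range can be pinned using the I_MPI_PORT_RANGE environment variable).
$ sudo firewall-cmd --permanent --add-service=ssh
$ sudo firewall-cmd --permanent --add-port=27011/tcp
$ sudo firewall-cmd --permanent --add-port=50000-51000/tcp
$ sudo firewall-cmd --reload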
Shared network storage
When running a distributed job with MPI, the MPI process with Rank 0 is responsible for reading the project file from disk. As long as the local host that has access to the simulation file appears first in the host list, the Rank 0 process should be launched on that node. When running concurrent jobs (e.g. sweeps), the files are created on the local machine but each job (Rank 0) is launched on a remote node, so those remote nodes need access to the simulation file. The simple solution to both problems is to set up shared network storage (or, when using AWS, S3 storage). This shared storage must be accessible to all nodes under the same path or drive mapping, so that your simulation files can be reached from any node using the same path or Windows UNC pathname.
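For illustration, if node01 holds the simulation file, a hypothetical MPI host file for a three-node distributed run would list node01 first so that Rank 0 is launched there (the node names are placeholders, and host file syntax varies between MPI distributions):
$ cat hosts.txt
node01
node02
node03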
For example, suppose you have the following network storage location:
\\server\public\sims
- On Windows, map this network path to the same drive letter on every computer, e.g. drive X:\
- On Linux, mount the network path under the same location on every node, e.g. /mnt/shared/lumerical
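As a sketch only, the commands below make the example location above available under a consistent path; the NFS export path on the server is an assumption, so adjust the server name, share, and mount point to match your environment.
On Windows (from a command prompt):
> net use X: \\server\public\sims /persistent:yes
On Linux (assuming the share is exported over NFS):
$ sudo mkdir -p /mnt/shared/lumerical
$ sudo mount -t nfs server:/public/sims /mnt/shared/lumerical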
On many Linux clusters/networks, each user's home directory is a networked file system that is common to all nodes. If this is the case, you may use your home directory to store your simulation files. For more information on creating a network file system, see your operating system's documentation.
Configure login credentials
Windows
- Your user account should have a unique username and a password.
- We do not recommend using the default Administrator account on Windows.
- When using Intel MPI, register your user credentials with Intel MPI (an example is shown below).
On Windows, use Intel MPI as the Job launching preset. Microsoft MPI or Local Computer are used when running only on the local machine.
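As an example of the credential registration step, Intel MPI releases for Windows let you store the account name and password with the Hydra process manager from a command prompt; run this as the user who will launch the jobs, and consult the Intel MPI documentation if your version provides a different mechanism:
> mpiexec -register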
Linux
Configure your compute nodes to allow remote login without a password, as the version of MPICH2 included with the installation package uses SSH to start remote jobs. If this is not configured, the user will have to type their password each time MPICH2 is called to run the simulation.
Creating a passwordless SSH login
- On your primary computer, enter the following command to create a set of SSH keys.
$ ssh-keygen -t rsa
- Press enter several times to accept all the defaults and an empty passphrase.
- This creates your public/private keys and saves them in your home directory
$HOME/.ssh
- Add the public key to your authorized_keys file:
$ cd ~/.ssh
$ cat id_rsa.pub >> authorized_keys
- Next, place your public key in the text file $HOME/.ssh/authorized_keys on each compute node:
$ ssh <node name> "mkdir -p ~/.ssh; chmod 700 ~/.ssh"
$ cat ~/.ssh/id_rsa.pub | ssh <node name> "cat >> ~/.ssh/authorized_keys"
$ ssh <node name> "chmod 700 ~/.ssh/authorized_keys"
- Verify that you can now log in to each node without a password prompt:
$ ssh <node name>
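If the ssh-copy-id utility is available on your primary computer, it can replace the manual copy commands above; the node names in this sketch are placeholders for your own compute nodes:
$ for node in node01 node02 node03; do ssh-copy-id $node; done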
On a shared network file system
If your home directory is shared across all compute nodes, the keys and authorized_keys file created above are automatically visible on every node, so these steps only need to be run once. Once you have completed this step, you should be able to log in to any of the compute nodes without entering a password.
Install Lumerical on a shared filesystem
See Shared filesystem installation on Linux for details.
Configure license
Refer to the "Configuring the License.ini file" section in this guide for details on setting up the global or system-wide license server configuration.
Configure resources
If you have a GUI connection to the cluster, you can run Lumerical simulations from the CAD/GUI and configure your resources depending on your use case.
- Open Lumerical CAD and open "Resources" to configure your resources.
- If you have a Job Scheduler installed on your cluster, see Job scheduler integration - resource configuration.
- Add or edit each resource as needed.