This article shows how to run parameter or optimization sweeps using the Lumerical-AWS integration Python module.
Prerequisites
- AWS account. If you do not have one, create and activate an AWS account.
- Lumerical-AWS integration configuration.
- AWS S3 bucket with the simulation files you want to run. See Using S3 bucket for details.
- Your license server and compute nodes are up and running. See Managing instances for details.
Using S3
To run simulation jobs using the Lumerical-AWS Python integration module, simulation files have to be copied to S3.
Copying to S3 from the local machine can be done in two ways:
- Using the AWS management console to add an object to your bucket. Files and folders are called objects in S3.
- Using the AWS CLI, as shown here.
Example:
- Simulation file: "sweep_AR_coating_example.fsp"
- S3 bucket: "s3://lum-aws-demo-bucket/"
cd <directory of simulation file>
aws s3 cp sweep_AR_coating_example.fsp s3://lum-aws-demo-bucket/
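If you prefer to stage the file from Python rather than the AWS CLI, the same copy can be scripted with boto3 (the AWS SDK for Python). This is only a sketch and is not part of the Lumerical-AWS integration module; it assumes boto3 is installed and that your AWS credentials are already configured, and it reuses the bucket and filename from the example above.
>>> import boto3
>>> s3 = boto3.client('s3')  # picks up credentials from your AWS configuration
>>> s3.upload_file('sweep_AR_coating_example.fsp', 'lum-aws-demo-bucket', 'sweep_AR_coating_example.fsp')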
Launch nodes
The nodes will be created based on your settings from the Configure your compute instance section.
Command: lumerical.aws.start_compute_instances
>>> import lumerical.aws
>>> lumerical.aws.start_compute_instances(name='NAME', num_instances=NODE_COUNT, vnc_password='VNC_PASSWORD')
Where:
- NAME, the VPC name created in the previous step.
- NODE_COUNT, the number of compute nodes to launch.
- VNC_PASSWORD, optional; provides the VNC password for individual access to the nodes. If left blank, the Workgroup_ID is used as the VNC password.
- Each compute node is configured to check out a license from your license server.
- Each node will have the resource manager configured to launch jobs, referencing the IPs of all other nodes in the Workgroup created by the start_compute_instances command.
- All the nodes created in this launch share a unique Workgroup_ID tag.
Example:
Launching 5 compute nodes for the VPC "aws-demo" and setting the VNC password to "myP@ssw0rd":
>>> import lumerical.aws
>>> lumerical.aws.start_compute_instances(name='aws-demo', num_instances=5, vnc_password='myP@ssw0rd')
After creating and launching the nodes, the following information will be generated.
- Time stamp, the time at which the nodes were launched.
- Workgroup_ID, the shared tag for all the nodes for the current 'Launch'.
- Number of nodes created.
- VNC_password, the password for the compute nodes for access via VNC.
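If you want to confirm from Python which nodes were launched, one option is to query EC2 with boto3 and filter on the Workgroup_ID tag. This sketch is not part of the Lumerical-AWS module; the exact tag key 'Workgroup_ID' and the example workgroup value are assumptions based on the description above, so substitute the Workgroup_ID reported for your own launch.
>>> import boto3
>>> ec2 = boto3.client('ec2')
>>> resp = ec2.describe_instances(
...     Filters=[{'Name': 'tag:Workgroup_ID', 'Values': ['aws-demo-20180223-170903']}])
>>> for reservation in resp['Reservations']:
...     for instance in reservation['Instances']:
...         print(instance['InstanceId'], instance['State']['Name'])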
Run sweeps
Running simulation jobs is similar to running them on your local machine. The difference is that you are now remotely logged in to on-demand EC2 instances, and the project or simulation files are stored in your S3 bucket rather than on the local hard drive.
Command: lumerical.aws.run_parameter_sweep
>>> import lumerical.aws
>>> lumerical.aws.run_parameter_sweep(name='NAME', workgroup_id='WORKGROUP_ID', s3_uri='S3_PATH_FILE', sweep_name='SWEEP_NAME')
Where:
- NAME, the VPC name created in the previous step.
- WORKGROUP_ID, the Workgroup_ID tag created when launching your compute nodes.
- S3_PATH_FILE, the full S3 URI, including the bucket, any folder, and the simulation (.fsp) filename.
- SWEEP_NAME, the name of the parameter sweep in your simulation.
Example:
Run an optimization sweep.
- VPC Name: aws-demo
- WORKGROUP_ID: aws-demo-20180223-170903
- S3 bucket URI: s3://lum-aws-demo-bucket/sweep_AR_coating_example.fsp
- Sweep name: thickness_optimization
*This is the "sweep name" you defined in your simulation.
>>> import lumerical.aws
>>> lumerical.aws.run_parameter_sweep(name='aws-demo', workgroup_id='aws-demo-20180223-170903', s3_uri='s3://lum-aws-demo-bucket/sweep_AR_coating_example.fsp', sweep_name='thickness_optimization')
NOTE: After the simulation job is done, the script will automatically 'Terminate' all the nodes to avoid charges. The simulation log file will indicate when the job is done.
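Putting the steps together, a typical session that launches the nodes and runs the sweep looks like the sketch below. It only uses the lumerical.aws calls documented on this page, with the example names from this article; replace them with your own VPC name, Workgroup_ID (reported when the nodes are launched), bucket, and sweep name.
>>> import lumerical.aws
>>> # launch the compute nodes for the 'aws-demo' VPC
>>> lumerical.aws.start_compute_instances(name='aws-demo', num_instances=5, vnc_password='myP@ssw0rd')
>>> # run the sweep; the nodes are terminated automatically when the job completes
>>> lumerical.aws.run_parameter_sweep(name='aws-demo',
...                                   workgroup_id='aws-demo-20180223-170903',
...                                   s3_uri='s3://lum-aws-demo-bucket/sweep_AR_coating_example.fsp',
...                                   sweep_name='thickness_optimization')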
Retrieve results
There are three ways to retrieve the results and copy them to your local computer.
- You can retrieve the files from your S3 bucket using the AWS CLI.
cd <your working directory>
aws s3 cp --recursive s3://lum-aws-demo-bucket/ ./
- Another method is to use the 'load' command from the script prompt within the product.
> load('s3://bucketname/filename.fsp');
- Lastly, you can use the S3 management console to download the files to your local machine.
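The retrieval can also be scripted from Python with boto3, equivalent to the CLI copy above. Again, this is a sketch outside the Lumerical-AWS module and assumes boto3 with configured credentials; the bucket and filename are taken from the earlier example.
>>> import boto3
>>> s3 = boto3.client('s3')
>>> s3.download_file('lum-aws-demo-bucket', 'sweep_AR_coating_example.fsp', 'sweep_AR_coating_example.fsp')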
Managing jobs and nodes
The nodes should be terminated automatically after the job completes. If for some reason they are not, you can terminate them manually once you have retrieved and saved your results to your local machine. There are two ways to remove or terminate the nodes so they do not accrue charges while not in use.
Killing simulation jobs
This will terminate all nodes with the corresponding 'Workgroup_ID'. This action does not save any changes or results to your simulation file. This is similar to the 'Force Quit' button in the CAD/GUI resource manager. Nodes running under a different 'Workgroup_ID' will not be affected.
Command: lumerical.aws.kill_job
>>> import lumerical.aws
>>> lumerical.aws.kill_job(name='NAME', workgroup_id='WORKGROUP_ID')
Terminating nodes
The command "terminate_all_instances" will terminate ALL of your compute instances in the specified VPC, regardless of Workgroup_ID. This will not save any changes or results to your simulation file.
Command: lumerical.aws.terminate_all_instances
>>> import lumerical.aws
>>> lumerical.aws.terminate_all_instances(name='NAME')
Where:
- NAME, the VPC name created in the previous step.
- WORKGROUP_ID, the shared Workgroup_ID tag for your nodes (used by kill_job).
Example:
- Killing the simulation job on the VPC 'aws-demo' that has the Workgroup_ID 'aws-demo-20180223-170903'.
- VPC Name: aws-demo
- WORKGROUP_ID: aws-demo-20180223-170903
>>> import lumerical.aws
>>> lumerical.aws.kill_job(name='aws-demo', workgroup_id='aws-demo-20180223-170903')
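To instead terminate every compute instance in the VPC, regardless of Workgroup_ID, the corresponding call with the same example VPC name is:
>>> import lumerical.aws
>>> lumerical.aws.terminate_all_instances(name='aws-demo')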
See also:
Accessing individual nodes with VNC