Computational chemistry stuff
1. Running Firefly on Amazon Web Services EC2 with StarCluster (could possibly also be adapted for GAMESS US)
Disclaimer: this is not a step-by-step guide, but rather a collection of hints on how to avoid the many technical issues I personally stumbled upon while trying to run Firefly on an AWS cluster. To begin with, I strongly advise looking through a couple of excellent videos on StarCluster and its documentation.
AWS EC2 together with the StarCluster software allows one to launch and run very cheap clusters of Linux machines using AWS spot instances. Spot instances are spare EC2 capacity that Amazon offers at a discount while nobody needs it at the regular price. Accordingly, typical spot prices are about five times lower than on-demand prices. The drawback is that a spot instance can be reclaimed by Amazon at any time. Even so, if your calculations need only a few hours to complete, you can usually rely on spot instances.
By default, Amazon lets you run 20 on-demand instances or 100 spot instances, but these limits can be increased upon request.
A general overview of my cluster setup is shown in the picture below:
Ubuntu 12 StarCluster images (AMIs) come in three flavors: ami-7c5c3915 (32-bit), ami-765b3e1f (64-bit), and ami-52a0c53b (64-bit with HVM), all supplied with the corresponding version of OpenMPI. Since Firefly is still a 32-bit program, we are limited to the first one, ami-7c5c3915, which can only be launched on the less powerful instance types (m1.small, m1.medium, or c1.medium).
All AWS instances are launched from the same customized image: I took ami-7c5c3915, added Firefly, Midnight Commander, and AWS SNS (Simple Notification Service), and saved it as my own custom AMI.
The limitation on instance types could be avoided if someone managed to install a 32-bit MPI on a 64-bit StarCluster image without interfering with the preinstalled 64-bit OpenMPI. I'm not geeky enough to do that, but somebody else might be up for the task.
Important! Firefly input files must additionally contain the following fragment:
$SMP LOAD=0 CALL64=0 $END
The necessity and meaning of these options were discussed on the Firefly forum, and a technical description of the $SMP block can be found in the Firefly manual.
And, just to make the system complete, the final part to add is AWS SNS notification upon job completion. SNS messages can be sent as e-mail or SMS to the subscribed recipients, prompting them to take appropriate action, e.g. to shut down the cluster and stop paying for it. I'm not going to discuss the installation of SNS in detail, as it's pretty well described in the link above. An example of the SNS messaging command is given in the job.sh script used to start Firefly jobs.
In my case, the launch script for Firefly (e.g. job.sh) looks as follows:
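Since a forgotten $SMP group is easy to overlook, a small pre-flight check can help before a job is submitted. The following is only a sketch: the function name missing_smp is hypothetical, the match is assumed to be case-insensitive, and the check only looks for the group name without validating the LOAD and CALL64 values.

```shell
# Print the names of Firefly input files that do not mention a $SMP group.
# Hypothetical helper: it only searches for the literal text "$SMP"
# (case-insensitively) and does not check the options inside the group.
missing_smp() {
    for inp in "$@"; do
        if ! grep -qiF '$SMP' "$inp"; then
            echo "$inp"
        fi
    done
}
```

For example, `missing_smp /home/ff8/ff8/jobs/*.inp` would list the inputs that still need the fragment; no output means every file contains it.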
source /home/ubuntu/.profile #Needed for SNS to run
mpirun -hostfile hosts /home/ff8/ff8/firefly8 -r -f -p -stdext -i /home/ff8/ff8/jobs/121.inp -o /home/ff8/ff8/jobs/121.out -ex /home/ff8/ff8 -t /home/ff8/ff8/temp
sns-publish arn:aws:sns:us-east-1:388785348569:FF8 --message "Test 121 completed." #Message sent by SNS
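The script above sends the same message whether or not the run succeeded. A possible variant, given here only as a sketch, takes the job name as an argument and includes the exit status of mpirun in the notification. The function name run_firefly_job and the MPIRUN and SNS_PUBLISH override variables are my own illustrative additions (the overrides allow a dry run); the paths and the topic ARN are copied from the script above. A zero exit status from mpirun is not proof that the calculation converged, so the output file still has to be inspected.

```shell
# Sketch of a parameterized job launcher; not a drop-in replacement.
FF_HOME="${FF_HOME:-/home/ff8/ff8}"
TOPIC_ARN="${TOPIC_ARN:-arn:aws:sns:us-east-1:388785348569:FF8}"

run_firefly_job() {
    job="$1"    # job name without extension, e.g. 121
    # MPIRUN and SNS_PUBLISH are hypothetical overrides for dry runs;
    # by default the real mpirun and sns-publish commands are used.
    "${MPIRUN:-mpirun}" -hostfile hosts "$FF_HOME/firefly8" -r -f -p -stdext \
        -i "$FF_HOME/jobs/$job.inp" -o "$FF_HOME/jobs/$job.out" \
        -ex "$FF_HOME" -t "$FF_HOME/temp"
    status=$?
    if [ "$status" -eq 0 ]; then
        msg="Job $job completed."
    else
        msg="Job $job FAILED (mpirun exit status $status)."
    fi
    # Notify the subscribers, then hand the mpirun status back to the caller.
    "${SNS_PUBLISH:-sns-publish}" "$TOPIC_ARN" --message "$msg"
    return "$status"
}
```

After sourcing /home/ubuntu/.profile as before, the job from the original script would be started with `run_firefly_job 121`.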
The "hosts" file consists of a list of nodes and the number of computing slots on each. E.g., for a 4-node cluster with two processors each, it contains four lines, one per node, each assigning two slots.
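Such a file can be generated rather than typed by hand. The sketch below assumes the usual OpenMPI hostfile syntax ("name slots=N") and StarCluster's default node aliases (master, node001, node002, ...); if your cluster uses other names, the function has to be adjusted. The function name make_hostfile is hypothetical.

```shell
# Write an OpenMPI hostfile to standard output: one "name slots=N" line per node.
# Assumes StarCluster's default aliases: master, node001, node002, ...
make_hostfile() {
    nodes="$1"    # total number of nodes, including the master
    slots="$2"    # computing slots (cores) per node
    echo "master slots=$slots"
    i=1
    while [ "$i" -lt "$nodes" ]; do
        printf 'node%03d slots=%s\n' "$i" "$slots"
        i=$((i + 1))
    done
}
```

For the 4-node, two-processor example, `make_hostfile 4 2 > hosts` writes a line for master followed by node001, node002, and node003, each with slots=2.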
All other options are described in the Firefly documentation. A separate EBS volume is used for permanent storage of your data, and is attached to the cluster through the StarCluster config file.
2. Cube2dx: Windows utility to convert Firefly cube files with electrostatic potential into the dx format used by APBS.
Usage: cube2dx input_file
Input_file is a punch file containing Firefly cube data of ESP calculations.
Output: input_file.dx (APBS dx format for ESP) and input_file.xyz (simple xyz format for molecular structure). Both files can be opened in Chimera for further ESP visualization (see example below).
Example results of Firefly ESP calculations, converted to dx format, visualized in Chimera by Volume Viewer.
3. Sph2bild: Windows utility to convert sph files from DOCK sphgen to a bunch of spheres in Chimera bild format.
Even though Chimera can open and visualize sph files on its own, the spheres cannot be made transparent. I find it is often more informative to show the spheres of the binding site transparent and in a definite color, as in the example below.
Usage: sph2bild input.sph <color> <transparency>
The optional color and transparency are to be set according to the bild format.
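For readers without Windows, the idea of the conversion can be sketched with awk. This is not the sph2bild utility itself, only a hypothetical stand-in: it assumes that sphere records follow sphgen's fixed-width layout (atom number in columns 1-5, x, y, z in three 10-character fields, radius in the next 8 characters) and it writes the bild directives .color, .transparency, and .sphere. The function name sph_to_bild and the default color and transparency are my own choices.

```shell
# Hypothetical stand-in for sph2bild, NOT the author's utility.
# Usage: sph_to_bild input.sph [color] [transparency] > output.bild
sph_to_bild() {
    awk -v color="${2:-red}" -v transp="${3:-0.5}" '
        BEGIN {
            # Color and transparency apply to all spheres that follow.
            print ".color " color
            print ".transparency " transp
        }
        # Sphere records carry an integer in columns 1-5; header lines
        # such as "DOCK ..." or "cluster ..." do not match and are skipped.
        substr($0, 1, 5) ~ /^ *[0-9]+$/ {
            x = substr($0, 6, 10) + 0
            y = substr($0, 16, 10) + 0
            z = substr($0, 26, 10) + 0
            r = substr($0, 36, 8) + 0
            printf ".sphere %.3f %.3f %.3f %.3f\n", x, y, z, r
        }
    ' "$1"
}
```

The resulting file can then be opened in Chimera like any other bild file, e.g. after `sph_to_bild site.sph yellow 0.6 > site.bild` (site.sph being a placeholder name).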
Example of binding site visualization in Chimera, with transparent spheres (the bound ligand is visible). The beige dots are the molecular surface of the receptor (hidden).