The ICM-GB Project
The Interactive Chromatin Model (ICM) is developed by Dr. Tom Bishop at LATech. ICM is an interactive tool that lets users rapidly assess nucleosome stability and fold DNA sequences into putative chromatin templates. We recently collaborated on an NIH proposal to transform ICM into a more advanced, high-performance epigenetic analysis application. We propose to integrate existing bioinformatics tools (genome browsers) and existing physics-based 3D models of chromatin (ICM-Tk) with distributed cyberinfrastructure through science gateway technology.
One specific aim of this proposal is ICM-GB: the web application interface that connects ICM, the genome browser (GB), and high-performance computing (HPC) resources.
ICM-Tk will have no inherent ability to retrieve data from isolated data sources that use different data exchange protocols, nor to manage large amounts of data. To bridge the gap between the generalized genome browser, which has outstanding data access capability, and ICM-Tk, which will not, we will implement a web application gateway that connects the embedded genome browser to ICM-Tk for structure generation and uses Jmol for 3D molecular display.
Planning for interface design
- Two layers of genome browser: the first is a normal browser for choosing a segment of the genome (0 to 10,000); the second is ICM-specific and includes 6 tracks of translation/orientation data from icm.par, 1 track of energy data from E.dat, and 1 track of folding position data from position.dat
- buttons: the ‘make default’ button starts with the default parameters and creates a default position track. A corresponding position widget is then created for users to manipulate, and the ‘make custom’ button sends the new position data to the server, which folds with these positions. For the interface, Tom wants the user to be able to choose a nuc dat file for each position, which means the length of each nuc can differ (right now it is 147). This choice will not be used to fold the chromatin yet; it is only for future use. Right now only the position (start, end) can be applied.
- default nuc length: 147. default minimum separation between nucs (linker?): 20. This can be changed in a global configuration panel later
- nuc widget: a slider. Choosing a nuc dat file can change its length. Widgets can be moved around but cannot overlap each other, and must maintain the minimum separation. Nuc widgets can also be added or deleted
- Temperature and Occupancy:
  - Temperature: used for free DNA; temperature = 0 means very straight DNA
  - Occupancy: determines how many nucs to place in the sequence, evenly distributed. If Occupancy is set to 0, the nucs are packed as densely as possible, leaving only the minimum separation (20) between adjacent nucs (147 each).
- computation on the server:
  - run-icm.tcsh : uses the default parameters to create XYZ
  - run-fold.tcsh : uses the input positions to create XYZ
  - mkBigWigs.tcsh icm.par : creates bigWig data from the par data
  - run-minimize.tcsh : minimize....
- files (click ‘get all data’ to show the files):
  - icm.occ.xyz : folded XYZ
  - E.dat : energies
  - seqin.txt : sequence
  - icm.par : folded helical parameter data
  - position.dat : nuc positions. This file does not exist yet, probably because the nuc positions are currently just evenly distributed, controlled by parameters such as occupancy.
  - icm.dat : check the third column; if it is 0, the DNA is free, otherwise it is folded.
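
The nuc placement rules in the notes above (default length 147, minimum separation 20, widgets that slide without crossing, and dense packing when Occupancy is 0) can be sketched in a few lines of Python. This is only an illustration and not the ICM-Tk implementation: the function names `pack_positions`, `is_valid_layout`, and `move_nuc` are hypothetical, and the coordinate convention (0-based start, exclusive end) is an assumption, since the notes do not define one.

```python
from typing import List, Tuple

NUC_LEN = 147  # default nuc length from the notes
MIN_SEP = 20   # default minimum separation from the notes

# Assumption: a position is (start, end), 0-based, with end exclusive.
Position = Tuple[int, int]


def pack_positions(seq_len: int, nuc_len: int = NUC_LEN,
                   min_sep: int = MIN_SEP) -> List[Position]:
    """Occupancy = 0 case: fit as many nucs as possible, min_sep apart."""
    positions: List[Position] = []
    start = 0
    while start + nuc_len <= seq_len:
        positions.append((start, start + nuc_len))
        start += nuc_len + min_sep
    return positions


def is_valid_layout(positions: List[Position], seq_len: int,
                    min_sep: int = MIN_SEP) -> bool:
    """Check bounds, no overlap, and minimum separation.

    Positions are expected in left-to-right order.
    """
    prev_end = None
    for start, end in positions:
        if not (0 <= start < end <= seq_len):
            return False
        if prev_end is not None and start - prev_end < min_sep:
            return False
        prev_end = end
    return True


def move_nuc(positions: List[Position], index: int, new_start: int,
             seq_len: int, min_sep: int = MIN_SEP) -> List[Position]:
    """Slide one nuc toward new_start, clamped so it cannot cross a neighbour.

    Assumes the input layout is already valid; returns a new list.
    """
    start, end = positions[index]
    length = end - start
    # Leftmost allowed start: just past the left neighbour plus the separation.
    lo = positions[index - 1][1] + min_sep if index > 0 else 0
    # Rightmost allowed end: the right neighbour minus the separation.
    if index + 1 < len(positions):
        right_limit = positions[index + 1][0] - min_sep
    else:
        right_limit = seq_len
    hi = right_limit - length
    clamped = max(lo, min(new_start, hi))
    moved = list(positions)
    moved[index] = (clamped, clamped + length)
    return moved
```

Under these assumptions, `pack_positions(10000)` places 60 nucs on the 0 to 10,000 segment, since 60 × 147 + 59 × 20 = 10,000. A slider widget could call `move_nuc` on every drag event and send the resulting (start, end) pairs to the server only when ‘make custom’ is clicked.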