Most of the available crystal structures of cryptochromes from various organisms are incomplete, and in many cases only the amino acid sequence is available. The resolved part of the currently known cryptochrome structures includes the Photolyase Homology Region (PHR), but rarely the flexible C-terminal part (CCT). This is a serious limitation, as the CCT is known to play a significant role in the functioning of cryptochromes. Additionally, the available structures come from crystallized proteins, which have been removed from their natural environment; this could drastically affect their three-dimensional structure.
In order to compensate for this, we employ a variety of methods to predict the true structure of the proteins. The main method used is homology modelling, which we have used to construct a model of avian cryptochrome from just the amino acid sequence.
Homology models of proteins obtained directly from web servers can rarely be used as reliable structures for further analysis, since these models are usually not stabilized. The stability of a homology model can be established and probed through molecular dynamics (MD) simulations by monitoring the root mean square deviation (RMSD) of the protein backbone during the simulation.
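The RMSD monitoring described above can be sketched in a few lines. This is a minimal illustration, not the analysis pipeline used for the results on this page: it assumes the backbone coordinates of each frame are already available as NumPy arrays, and the function names `kabsch_rmsd` and `rmsd_trajectory` are hypothetical. Each frame is superimposed on the reference with the Kabsch algorithm before the deviation is computed, so that overall translation and rotation of the protein do not contribute.

```python
import numpy as np


def kabsch_rmsd(coords, reference):
    """RMSD between two (N, 3) coordinate sets after optimal superposition."""
    # Remove overall translation by centring both sets on their centroids.
    p = coords - coords.mean(axis=0)
    q = reference - reference.mean(axis=0)
    # The SVD of the covariance matrix gives the optimal rotation (Kabsch).
    u, _, vt = np.linalg.svd(p.T @ q)
    # Flip one axis if needed so the result is a proper rotation,
    # not a reflection.
    d = np.sign(np.linalg.det(u @ vt))
    rotation = u @ np.diag([1.0, 1.0, d]) @ vt
    diff = p @ rotation - q
    return float(np.sqrt((diff ** 2).sum() / len(p)))


def rmsd_trajectory(frames, reference):
    """RMSD of every frame in a (T, N, 3) array relative to one reference."""
    return np.array([kabsch_rmsd(frame, reference) for frame in frames])
```

Calling `rmsd_trajectory` once with the initial model as `reference` and once with a post-equilibration frame gives the two kinds of curves discussed here; a curve that levels off suggests that the model has settled into a stable conformation.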
An example of the RMSD for a homology model of European robin cryptochrome 4 can be seen above. Here the RMSD stabilises both with respect to the initial structure and with respect to the equilibrated structure, indicating that the model is stable. The inset in the figure shows that the RMSD increase is smaller when calculated relative to the post-equilibration structure, which is a further indication of protein stability. The small peak in the RMSD of the equilibrated structure at 0.39 ms is due to the highly flexible terminal of the PHR domain, which starts as a helix, then unfolds, and refolds into an alpha-helix during the simulation.
The model of European robin cryptochrome 4 is based on the template of mouse cryptochrome 1, whose amino acid sequence is 91% identical to that of European robin cryptochrome 4. The equilibrated model can be seen below, with the FAD cofactor highlighted in red and the conserved tryptophan triad shown in purple.
This shows that a stable homology model can be obtained. However, it is still only a model of the PHR domain.
Constructing the CCT
The C-terminal part of cryptochrome, the CCT, is usually not included in the crystal structures and cannot be added in homology models, since no template is available for this part of cryptochrome. To include the CCT in our simulations we therefore need to construct it, and we are currently working on this problem for Arabidopsis thaliana cryptochrome 1. In this case the missing CCT consists of 219 amino acids, and constructing this part of the protein is a matter of folding it.
Protein folding is a well-known problem that is very hard to solve computationally: a protein has so many possible conformations that it is impossible to test them all in order to find the best one, unless the system consists of very few amino acids. We have therefore generated a number of "starting structures" more or less randomly, and through extended MD simulations these structures have been converging towards more optimal ones. Since we end up with many structures, we can determine the best of our CCT structures through a variety of analyses (e.g. we would expect an "optimal" structure to have minimal energy).
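To give a feeling for why exhaustive testing is out of the question, the following back-of-the-envelope sketch estimates the size of the conformational space for a 219-residue chain. The numbers are illustrative assumptions and not values from our simulations: three backbone states per residue and a sampling rate of 10^15 conformations per second are both hypothetical, and the function names are made up for this example. The calculation works with base-10 logarithms to keep the numbers manageable.

```python
import math


def log10_conformations(n_residues, states_per_residue=3):
    """log10 of the number of conformations, assuming independent residues."""
    # states_per_residue = 3 is an illustrative assumption.
    return n_residues * math.log10(states_per_residue)


def log10_years_to_enumerate(n_residues, states_per_residue=3,
                             rate_per_second=1e15):
    """log10 of the years needed to visit every conformation exactly once."""
    # rate_per_second = 1e15 is a hypothetical sampling rate.
    seconds_per_year = 365.25 * 24 * 3600
    return (log10_conformations(n_residues, states_per_residue)
            - math.log10(rate_per_second)
            - math.log10(seconds_per_year))


if __name__ == "__main__":
    n = 219  # length of the missing CCT discussed above
    print(f"conformations: about 10^{log10_conformations(n):.0f}")
    print(f"years to enumerate: about 10^{log10_years_to_enumerate(n):.0f}")
```

Under these assumptions the 219-residue CCT has on the order of 10^104 conformations, which is why the search starts from a limited set of structures that are then relaxed with MD rather than enumerated.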
CCT structures constructed this way do not necessarily resemble the "real" structure, and may not even be close to it. They will, however, most likely result in a much more realistic model of Arabidopsis thaliana cryptochrome than one that is missing about 32% of its amino acids, such as the currently known crystal structure.