Most of the available crystal structures of cryptochromes from various organisms are incomplete, and in many cases only the amino acid sequence is available. The known cryptochrome structures include the Photolyase Homology Region (PHR), but rarely the flexible C-terminal part (CCT). This is a serious limitation, as the CCT is known to play a significant role in the functioning of cryptochromes. Additionally, the available structures have been crystallized, and thus removed from their natural environment, which could drastically affect their three-dimensional structure.
In order to compensate for the missing protein structures, we employ a variety of methods to predict the true structure of the proteins. The main method used is homology modelling, which we have used to construct models of avian cryptochrome from just the amino acid sequence.
Since the crystal structures of some cryptochromes, such as those from the European robin, are not known, the only way to study those proteins is to construct such models computationally using homology modelling. Once a homology model is constructed, it needs to be equilibrated in an aqueous environment in order to obtain a stable structure that can be used for further investigations.
The most common way to evaluate the stability of a protein in a dynamical simulation is to look at the Root Mean Square Deviation (RMSD) of the protein backbone. The RMSD measures the change in structure with respect to the initial configuration: a large RMSD means that the protein has undergone large structural changes, while a small RMSD indicates smaller changes. An RMSD of around 3 Å is usually to be expected for a protein of the size of cryptochrome. An example of a small and a large RMSD can be seen below.
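As an illustration of how such an RMSD value can be obtained, the following is a minimal sketch that superimposes one frame onto a reference structure with the Kabsch algorithm and then computes the RMSD. It assumes the backbone coordinates are already available as NumPy arrays in Å. The function names `kabsch_rmsd` and `rmsd_series` are hypothetical, and this is not necessarily the tool used to produce the figures on this page.

```python
import numpy as np


def kabsch_rmsd(coords, reference):
    """RMSD between two (n_atoms, 3) coordinate sets after optimal
    rigid-body superposition (Kabsch algorithm)."""
    P = np.asarray(coords, dtype=float)
    Q = np.asarray(reference, dtype=float)
    if P.shape != Q.shape or P.ndim != 2 or P.shape[1] != 3:
        raise ValueError("both inputs must have the same (n_atoms, 3) shape")

    # Remove the translational part by centring both structures.
    P = P - P.mean(axis=0)
    Q = Q - Q.mean(axis=0)

    # Optimal rotation from the SVD of the covariance matrix.
    H = P.T @ Q
    U, _, Vt = np.linalg.svd(H)
    # Guard against an improper rotation (reflection).
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0.0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T

    diff = P @ R.T - Q
    return float(np.sqrt((diff ** 2).sum() / len(P)))


def rmsd_series(frames, reference):
    """RMSD of every frame of a trajectory relative to one reference."""
    return [kabsch_rmsd(frame, reference) for frame in frames]
```

Calling `rmsd_series` once with the initial structure and once with the post-equilibration structure as `reference` would give the two kinds of curves discussed in this section.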
For a homology model of European robin cryptochrome 1, the RMSD has been calculated over a 1 microsecond simulation, see below. It stabilises around 3 Å, which indicates that the model does not undergo any major rearrangements and that it becomes stable after around 500 ns of simulation.
An example of the RMSD for a homology model of European robin cryptochrome 4 can be seen below. Here the RMSD stabilises both with respect to the initial structure and with respect to the equilibrated structure, indicating that the model is stable. In the figure, the inset shows that the RMSD increase is smaller if calculated relative to the post-equilibration structure, which is indicative of protein stability. The small peak in the RMSD of the equilibrated structure at 0.39 ms is due to the highly flexible terminus of the PHR domain, which starts as a helix but then unfolds and refolds into an alpha-helix during the simulation.
The model of European robin cryptochrome 4 is based on the template of mouse cryptochrome 1, whose amino acid sequence is 91% identical to that of European robin cryptochrome 4. The equilibrated model can be seen below, with the FAD cofactor highlighted in red and the conserved tryptophan triad shown in purple.
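A sequence identity such as the one quoted above is computed from a pairwise alignment of the target and template sequences. The sketch below shows one common convention, assuming the two sequences have already been aligned and padded with `-` for gaps. The function name `percent_identity` and the short toy sequences are hypothetical and are not real cryptochrome sequences. Other conventions use a different denominator (for example the length of the shorter sequence) and give slightly different numbers.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percentage of identical residues over all alignment columns
    in which neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")

    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip columns containing a gap
        compared += 1
        if a == b:
            matches += 1

    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * matches / compared


# Toy example (hypothetical sequences): 5 of 6 gap-free columns match.
identity = percent_identity("MKT-AYI", "MKTQAFI")
```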
Once a stable model has been obtained, more advanced properties can be investigated; some specific analyses are outlined below.
Constructing the CCT
Usually the C-terminal part of cryptochrome, the CCT, is not included in the crystal structures and cannot be added in homology models, since no template is available for this part of the protein. In order to include the CCT in our simulations, we therefore need to construct it, and we are currently working on this problem for cryptochromes from different species. Constructing this part of the protein is essentially a matter of folding it.
Protein folding is a well-known problem that is very hard to solve computationally: there are so many possible conformations of the protein that it is impossible to test them all in order to find the best one (unless the system consists of very few amino acids). We have therefore generated a number of "starting structures" more or less randomly, and through extended MD simulations these structures have been converging towards more optimal ones. Since we end up with many structures, we can determine the best of our CCT structures through a variety of analyses (e.g. we would expect an "optimal" structure to have minimal energy).
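One simple way to apply the minimal-energy criterion is to rank the candidate structures by their average potential energy over the final part of each simulation. The following is a minimal sketch of that idea, assuming each candidate comes with a list of energies sampled along its trajectory. The function name `rank_by_tail_energy`, the candidate labels and the energy values are hypothetical, and in practice this would be only one of several analyses.

```python
from statistics import mean


def rank_by_tail_energy(energy_traces, tail_fraction=0.2):
    """Rank candidates by mean energy over the last part of each trace.

    energy_traces maps a candidate label to a list of energies sampled
    along its simulation; the lowest mean energy comes first.
    """
    ranked = []
    for name, trace in energy_traces.items():
        if not trace:
            raise ValueError(f"empty energy trace for {name!r}")
        # Average only over the tail, where the structure has relaxed.
        n_tail = max(1, int(len(trace) * tail_fraction))
        ranked.append((mean(trace[-n_tail:]), name))

    ranked.sort()
    return [(name, energy) for energy, name in ranked]


# Hypothetical energies (arbitrary units) for two candidate structures.
traces = {
    "cct_a": [-100.0, -110.0, -120.0, -125.0, -126.0],
    "cct_b": [-90.0, -95.0, -140.0, -150.0, -152.0],
}
ranking = rank_by_tail_energy(traces, tail_fraction=0.5)
```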
The constructed CCT fragments are not expected to necessarily resemble the "real" structures, and may not even be close to them. Including them will nevertheless most likely result in a much more realistic model of cryptochrome than one missing a significant part of its constituent amino acids.
Studying cryptochrome activation through molecular dynamics
Using Molecular Dynamics (MD), the time evolution of biomolecular systems, such as cryptochrome proteins, can be studied on time scales up to microseconds. The time evolution of proteins gives insight into properties such as activation mechanisms, binding of ligands, stability of structure predictions and other dynamic processes that occur on the microsecond time scale.
Investigation of the activation mechanism of cryptochromes
Once a stable homology model of cryptochrome 4 from European robin was obtained, a long molecular dynamics simulation of the structure was used to investigate the activation mechanism of the protein. The trajectory was analysed using both principal component analysis and advanced graph-theoretical methods. These analyses revealed a network of a few important amino acids in the protein structure that facilitate the structural rearrangements behind the conformational changes caused by activation of the protein.
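To make the principal component analysis step more concrete, the sketch below extracts the dominant collective motions from a set of trajectory frames. It assumes the frames have already been superimposed onto a common reference and are given as an array of shape (n_frames, n_atoms, 3). The function name `trajectory_pca` is hypothetical, and this is a generic illustration rather than the exact analysis pipeline used for the results described above.

```python
import numpy as np


def trajectory_pca(frames, n_components=2):
    """PCA of pre-aligned trajectory frames.

    Returns the principal axes, the fraction of variance each one
    explains, and the projection of every frame onto those axes.
    """
    X = np.asarray(frames, dtype=float)
    if X.ndim != 3 or X.shape[2] != 3 or X.shape[0] < 2:
        raise ValueError("expected shape (n_frames >= 2, n_atoms, 3)")

    n_frames = X.shape[0]
    # Flatten each frame to one row and subtract the mean structure.
    X = X.reshape(n_frames, -1)
    X = X - X.mean(axis=0)

    # SVD of the centred data: rows of Vt are the principal axes.
    _, S, Vt = np.linalg.svd(X, full_matrices=False)
    variances = S ** 2 / (n_frames - 1)
    total = variances.sum()
    if total == 0.0:
        raise ValueError("all frames are identical; no variance to analyse")

    axes = Vt[:n_components]
    explained = variances[:n_components] / total
    projections = X @ axes.T
    return axes, explained, projections
```

Plotting the first two columns of `projections` against each other is a common way to see whether the simulation visits distinct conformational states.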
FAD binding in cryptochromes
Inspired by the work of Kutta et al. [Kutta et al., Sci. Rep. 2017, 7, 44906], it was noticed that some amino acids are crucial for the binding of FAD. In particular, a double mutation of arginine at position 298 (R298E) and glutamine at position 311 (Q311E) in fruit fly (Drosophila melanogaster) cryptochrome reduced binding of the FAD cofactor to the protein by 50%. In order to investigate the binding of the FAD cofactor in fruit fly cryptochrome, a series of mutated structures were created and simulated computationally [Sjulstok E. and Solov'yov I., J. Phys. Chem. Lett. 2020, 11, 3866−387]. When these two amino acid residues are mutated to glutamic acid (R298E-Q311E mutant), the positioning of residue 298 relative to the FAD cofactor changes. This indicates that the double R298E-Q311E mutant protein inhabits a range of conformations in which residue 298 lies at the protein surface, which makes non-bonded interactions with FAD less favourable.