†University Institute of Pharmaceutical Sciences, Panjab University, Chandigarh 160014, India
Received: 22 Jul 2016 Revised and Accepted: 14 Feb 2017
Molecular simulation is increasingly used by medicinal chemists in the process and product development. Reliable computational predictions are of great value not only for the design of an active pharmaceutical ingredient with novel properties but also for the avoidance of an undesirable change of form in the late stages of development of an industrially important molecule. In the pharmaceutical industry, drug polymorphism can be a critical problem and is the subject of various regulatory considerations. This contribution tried to review the fuzzy frontiers between the chemical structure of the molecule and its crystal energy landscape with a particular focus on the crystal structure prediction (CSP) methodology to complement polymorph screening. A detailed application of CSP in the pharmaceutical industry is illustrated on ciprofloxacin; describing its putative polymorphs. This approach successfully identifies the known crystal form within this class, as well as a large number of other low-energy structures. The performance of the approach is discussed in terms of both the quality of the results and computational aspects. CSP methods are now being used as part of the interdisciplinary range of studies to establish the range of solid forms of a molecule. Moreover, further methodological improvements aimed at increasing the accuracy of the predictions and at broadening the range of molecules i.e. co-crystals, salts and solvates.
Keywords: Computational, Crystal energy landscape, Crystal structure prediction, Polymorph, Ciprofloxacin
© 2017 The Authors. Published by Innovare Academic Sciences Pvt Ltd. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4. 0/)
The prospect of understanding and designing the solid forms of organic molecules completely in silico is a tantalizing one. For over 25 y, crystallographers and computational chemists have faced the challenge of trying to predict organic crystal structures. However recent advances in molecular mechanics had made it feasible to predict the crystal structure from the chemical structure. The prediction of crystalline structure starting from nothing more than the connectivity of atoms within a molecule has been a great challenge for the modeling community  and the pursuit of such a computational procedure and a promise of its useful applications in the field of industrial pharmaceutical development has been a driving force for the development of increasingly accurate CSP techniques [2-5]. CSP programs were designed to find the crystal structure of an organic molecule, starting from the chemical diagram. However the existence of polymorphism, [6, 17] the appearance of different crystal structures containing only the same molecule, immediately shows that some crystal structures are not the thermodynamically most stable. Molecules often they crystallize first in a metastable polymorph, [1c, 18] which may appear to be stable because of difficulty in transforming to the more stable form. The phenomena of ‘‘late appearing’’ and ‘‘disappearing’’ polymorphs are probably linked to changes in crystallization conditions, such as new impurities, triggering the first nucleation of a new solid form and then the effect of seeds of this form on subsequent crystallisations . It was also suggested that approximately one-third of organic compounds and about 80% of marketed pharmaceuticals exhibit polymorphism under experimentally accessible conditions .
Thus CSP studies are useful in estimating the range of thermodynamically favored crystal packings. Moreover, it does not depend upon the kinetics of crystallization which can further vary on a wide range of controllable and less easily controlled crystallization conditions. CSP is already beginning to play an important role in understanding the organic solid-form landscape, as seen in some recent industrial examples from glaxo smithkline , eli lilly , pfizer  and others [13-15]. Pharmaceutical crystals can exist as single molecular entities or as molecular adducts . As single molecular entities, organic solids can show polymorphism.
Table 1: Different properties shown by various polymorphs
Polymorph property differences
•Molar volume and density
•Melting and sublimation
Molecular adducts include solvates and hydrates, which can be stoichiometric or non-stoichiometric in nature . The complexity of the organic solid state is being changed due to extensive research on polymorphism. Report has shown that there are only crystal structures of polymorphs for about 5% of the molecules in the Cambridge structural database (csd),  this reflects the difficulty of growing single crystals suitable for X-ray analysis rather than the incidence of polymorphism. Table 1 shows a list of different properties that can be shown by various polymorphs  that are very well documented in the literature.
A survey  from one of the polymorph screening companies of 245 molecules that they had screened, only 50% exhibit polymorphism whereas 90% had multiple crystalline and non-crystalline solid forms. This has led the use of CSP methods for calculating and interpreting the crystal energy landscape of a molecule . Thus CSP serves as a complement and direct stable form screening in order to assess the likelihood of missing a stable format early stages of drug development . Once a particular solid form is selected from the predicted ones, the crystallization and formulation conditions have to be controlled to avoid unwanted polymorphs. While practicizing this, researchers need to fully understand the structural aspects of each polymorph . Moreover, this knowledge is also important for patenting and registration purposes.
The most common method for determining a crystal structure is to grow quality crystals for single crystal X-ray diffraction. Growing single crystals of appropriate size, however, is often difficult or even impossible. Furthermore, one cannot be certain that all possible polymorphs have been discovered experimentally. Thus, methods that help to predict potential stable and metastable crystal packing arrangements from the knowledge of just the contents of the asymmetric unit along with other crystallographic parameters [24a] would be extremely valuable (fig. 1) [24b].
One of the methodologies, as coined by bayer et al., was presented in fig. 2. This contribution reviews the importance of CSP in early stages of drug development and the methodology involve in it.
Fig. 1: CSP using various pathways
In particular, a focus is given on concisely defining the basic crystallographic parameters and how they are helpful in computational polymorph screening from a learner’s perspective. Moreover, for comprehensive understanding, how to computationally screen various polymorphs, a study exploring the potential polymorphs of ciprofloxacin was also presented over here.
Fig. 2: CSP methodology coined by Bayer et al
Crystallographic parameters to be considered in CSP
The asymmetric unit
A CSP study also requires decisions on the extent of crystallographic space that needs to be covered for all the conformations. An important crystallographic property to consider in CSP is the number of molecules in the asymmetric unit, Z’ which can be below unity means that some of the crystal symmetry coincides with internal molecular symmetry.
Moreover, it is also greater than unity that is for a cocrystal, salt or solvate. In that case, the search space is increased by the number of variables (usually 6 for 2 independent molecules) need to define the relative position of the independent molecules within the unit cell. For single component organic crystals, it is usually restricted to Z’=1. This will reduces the number of search variables to those required by the space group. For a rigid molecule where no conformational variables in the most common space-group for organic molecules, P21/c, which has 4 molecules in the monoclinic unit cell, the only search variables are the 3 cell lengths, one cell angle β and six parameters defining the orientation and position of the molecule relative to the cell axes. But the restriction to Z’=1 would miss many polymorphs. So most of the crystallographers appreciated the importance of Z’>1 structures . Although some Z’>1 polymorphs are closely related to simpler Z’= 1 structures and are either trapped ‘‘crystals on the way’’ or may be lower temperature polymorphs, as in the case of polymorph of 7-fluoroisatin . Allowing for the possibility of variation in Z’ constitutes a problem in the design of prediction algorithms; many software codes do not allow cases of Z’≠1, and allowing for Z’>1 leads to a dramatic increase in a number of calculations required. 92% of all organic structures in the csd are reported  to have Z’=1, although, for example, alcohols are extra prone to crystallize with more than one molecule in the asymmetric unit. For the structures of polymorphic substances in the csd, gavezzotti and filippini  have found that 18% have Z’>1. The growth in both the incidence and the interest in high Z’ structures has been stimulated by improving instrumentation and complementary information from a range of techniques including single crystal and powder X-ray and neutron diffraction, solid-state NMR spectroscopy, computational energy calculations, DSC, and hot stage microscopy. A Recent review on high Z’ crystal structures  present a holistic overview of the recent startling progress in scientific understanding of crystal packing in general, and attempt to place the phenomenon of crystal structures with Z’>1 firmly into context within the contemporary view of crystal packing phenomena.
Force fields parameterization and evaluation
The importance of classical force field methods for calculations at the atomic level lies in the very fact that they are much faster than quantum mechanical (qm) methods. This is essential if either a large number of calculations have to be done or the system under study is very large. In addition to calculation speed, most applications require a high degree of accuracy of the calculated energies and structures. Most important example is polymorph prediction (pp)  that requires speed as well as accuracy. Thousands of trial crystal packing’s are predicted using pp algorithms whose energy must be minimised. Candidates for possible metastable polymorphs are lies within a narrow band of only a few kcal mol-1 above the global energy minimum. Therefore molecular mechanics is the only technique which today is sufficiently fast for modeling large and/or crystallographically complex systems. This involves a set of parameterized equations, a force field describing the potential energy of a model system. CSP can then be effected by generating a large number of starting structures, making sure to obtain a sufficient sampling of all the possible packing arrangements, and then optimising these with respect to the potential energy. The basic assumption in molecular mechanics is that the lattice energy is sufficient to consider as a descriptor of crystal structure stability. According to this the vibrational contributions to the internal energy which the zero point energy, entropy and the contribution to enthalpy from differences in density are all neglected. However, these contributions to the relative stability of organic polymorphs are found to be significant . These parameters of a force field have to be tuned so that the force field potential energy corresponds as closely as possible to the real case, for those types of molecules that the force field is designed to describe; a process called parameterization. Force fields can be more or less generic. With access to experimental data it is possible to develop tailor-made, ad-hoc force fields, specially parameterized for one or a few molecules, which can be remarkably successful, but limited in use . For most calculations, however, there are generic force fields available. These are often parameterized for large groups of chemically related molecules. The quality of a force field can be tested. With a perfect force field, the experimental structure (disregarding temperature effects) should correspond to a local minimum of the force field, but this is never completely the case. If the force field is applied to minimize the energy of an experimentally determined structure, the resulting structure is referred to as being relaxed. The difference, energetical as well as structural, between an experimental and the corresponding relaxed structure can be used as an indicator of the quality of the force field for a particular class of molecules. The various force fields that are most commonly used in the different studies include dreiding , COMPASS , CVFF , PCFF  and the universal  force fields. As confirmed by the outcome of the fourth, fifth and sixth cambridge crystallographic data centre (ccdc) blind test, the methodology for the parametrization of force fields on a molecule-by-molecule basis was a major breakthrough in the field of CSP [38, 39]. A successful intermediate between conventional force fields and periodic electronic structure calculations is based on the separation of inter and intramolecular energies and using distributed multipoles rather than point charges to evaluate the intermolecular lattice energy . Other, less empirical models based on the theory of intermolecular forces, with anisotropic atom–atom repulsion potentials, are being used successfully for CSP and property studies for rigid molecules .
Point charges centered on atom nuclei is an easy, simple way to obtain a description of the electrostatics of a molecule, intuitively fitting our chemical perception of molecules being composed of atoms. However, quantum mechanics teaches us that the molecular electrons are distributed around the nuclei according to the wave function, Ψ, creating an electrostatic field around the molecule. Hence, the point charge representation can be no more than a crude approximation. Furthermore, the columbic energy term has the nasty property of being a comparatively small sum of many large terms. This necessitates point charges of high accuracy. However, today the large majority of modeling software comes with point charge force fields; point charges are simple to deal with and result in fast calculations. Some force fields include their own built-in, or integral, charges, defined and optimised for each force field defined atom type, whereas other force fields require the user to supply point charges separately. Integral charges should result in a more robust force field, whereas the latter approach could theoretically lead to a better and more flexible description of electrostatic interactions. There are many techniques available for calculating point charges, and the choice of method should depend on the intended use of the charges. Two such methods that are often used are the method of gasteiger and marsili  and the charge equilibration method . The former depends only on the atoms and their connectivity, whereas the latter depends on molecular conformation as well.
Electrostatic potential charges (esp-charges)
Point charges to be used to describe interactions between adjacent molecules may be fitted to the electrostatic potential of the molecule, and these have repeatedly been recommended  for use in CSP. Two of the ways for fitting charges to the esp can be done either by choosing a set of charges at some relevant points around the molecule or maintaining the condition of electroneutrality. There are many ways to do this  and it is by no means as trivial as it might seem. However, these showed many problems with the robustness of esp-charges as fit for large molecules is a statistically underdetermined problem for which francl and chirlian  suggested a careful choice of the number of sampling points and their positions. In a study conducted by rasmuson , concluded that generic force fields and conventional point charges frequently result in a poor estimation of lattice energy, even for fairly simple organic molecules, and this may explain why ranking of polymorphs in CSP often appears to be incorrect. If esp charges are used for periodic simulations, they are derived by extracting a fragment of the periodic lattice and performing a molecular esp calculation [48, 49].
The prediction of crystal structures, besides a force field, requires some algorithms of generating candidate structures which involve a generation step and a minimization step where the candidate structures are optimised (relaxed) into a state of minimum potential energy with respect to structural coordinates. The generation step usually takes one or more molecular conformations as starting structure(s), and then, using either a random or systematic grid sampling approach, constructs the crystal. In one of the commercial implementation of CSP, software package materials studio from accelrys (biovia)–uses a sampling method somewhat dissimilar from purely random search methods, called montecarlo simulated annealing . The principle is that a starting point on the energy hypersurface is chosen at random, along with starting values of temperature and maximum allowed displacement. A metropolis random walk is then executed; if a trial step results in a lower value of the function to be optimized, the step is accepted, and if it results in a higher value, it is accepted with a probability of e-ΔC/kBT (the boltzmann factor), The temperature and the maximum displacement are gradually decreased, hence the term “simulated annealing”, until no more structures are generated.
The crystalline state is made up of molecules related by a combination of different symmetry elements, and there are 230 different mathematically possible such combinations, or space groups.
Table 2: CSD space group statistics–space group frequency ordering february 2015
|Rank||Space group no.||Space group||no. in csd||% of csd|
However an organic molecule, whose driving force for aggregation and packing is governed by a field of complex intermolecular interactions, only certain combinations of symmetry elements turn out to be feasible . Close-packing principle of kitaigorodsky , states that all molecular structures will tend to be as closely packed as possible. However, if the shape of a molecule does not permit a tight ordered packing, an amorphous solid result. Packing coefficient KP, describes the packing in crystals which is the ratio of the volume occupied by the molecules in the unit cell to the total cell volume. The intrinsic molecular volume may be computed by defining a cutoff value for the electron density. The ten most common space groups in the csd according to CSD space group statistics–space group frequency ordering February 2015 were given in table 2. A space group is given as the international table numbers together with the short hermann-mauguin symbol for the standard configuration.
Crystal structures comparison
Often, it is desirable to compare two crystal structures with respect to structural similarity between two crystal forms is performed in order to judge whether the two are virtually the same, as in the optional clustering step in many structure prediction algorithms or to compare more different crystal structures to find and quantify structural relationships. This is done by two ways in first one, the periodicity of structures by comparing some corresponding generated spectra, and the other involves comparing structural coordinates directly. The clustering method of the software materials studio is based on comparing radial distribution functions and works well for sorting structures into either similar or dissimilar. Other option that is available in the literature  involves comparison of simulated X-ray powder diffraction (xrpd) patterns which depends strongly upon the unit cell and is prone to change considerably with small structural alterations. However less marked similarity, a comparison can be difficult due to the crystal structures complexity. There is a definite lack of established methods for structural similarity comparison. For a simple, geometric comparison, an overlay of the structures, based on the least squares minimised difference in the atom coordinates, can be carried out  which is available as an in the software mercury. But as soon as there is an appreciable difference in packing the method fails to provide a good description of the difference. The crystal packing similarity tool in mercury was also used to compare the predicted crystal structures. It represents a crystal structure using a cluster of n molecules comprised of a central reference molecule and (n-1) nearest-neighbor molecules. The distances and a subset of the triangles that define the reference cluster are then used as 3-D substructure-search query within the comparison structure.
Putative polymorphs of ciprofloxacin
Ciprofloxacin (cp) is a synthetic broad-spectrum antibacterial compounds belonging to the group of fluoroquinolone (fq) antibiotics  and known for its efficacy against a variety of infections such as, respiratory tract, genitourinary, obstetric, gynaecological, skin and soft tissue infections. The accepted molecular mechanism of action of fqs would include their interaction with the enzyme-bound DNA complex that results in the inhibition of normal DNA gyrase or topoisomerase activity. To date, at least four structural models have been described for that interaction. The heteroassociation of cp with various salts such as anhydrous cp hydrochloride , cp hydrochloride monohydrate , cp hydrochloride 1.34-hydrate , cp lactate sesquihydrate , cp saccharinate , cp hemisodium salt pentahydrate, and cp sodium salt pentahydrate was described as well . Furthermore, co-crystals of cp with phenolic compounds, flavonoids, monoterpenes, amino acids, alkaloids, vitamins, and nutraceuticals have been prepared and characterized . The heteroassociation between ciprofloxacin and norfloxacin leading to cocrystal synthesis was also reported . A single crystal structure report of ciprofloxacin was also available in the literature . The hydrogel , nanostructured delivery  systems and microparticles  for cp have also been reported in recent past. Recently, reports on drug-drug salt forms of cp , cp hippurate salt  and cp maleate  polymorphs have also been communicated. However, no structural report on any polymorph of cp (fig. 3) is available till date. Moreover, the crystal structures landscape and the potential polymorphs of this clinically important antibacterial agent has not been even investigated and predicted. This instigated us to take up the challenge to predict the various potential polymorphs of cp.
Fig. 3: Molecular diagram of cp showing likely rotatable bonds
Molecular mechanics calculations were performed using materials studio® release 7.0 . For all solid-state simulations, the ewald method [72, 73] was used for the electrostatic and van der waals interaction terms. As the CSP method uses a rigid body approxi-mation in the initial search for crystal packing alternatives, it is necessary to perform an analysis to determine low energy geometry to be used as input for the packing calculations (for a detailed methodology see the supplementary data).
Full geometry optimisation was performed by using the smart algorithm in the forcite module of materials studio 7.0 (convergence tolerance energy of 2 x 10-5 kcal/mol, force of 0.001 kcal/mol/Å, and displacement of 1 x 10-5 Å with a maximum number of iterations of 500 for an independent optimization). The force field used was COMPASS. The optimized low energy conformation was re-optimized by a molecular dynamics simulation using the same module (simulation length of 5 ps with a 1 fs time step at a temperature of 298 K, taking conformations every 5000 steps). This optimized gas phase conformation was used as the starting point for crystal structure prediction using the materials studio polymorph predictor.
CSP of CP
The polymorph predictor was set to its default fine setting (this sets the simulated annealing algorithm to a temperature range of 300-100000.0 K with a heating factor of 0.025, requiring 12 consecutive steps to be accepted before cooling and a maximum of 2000 steps) with the force field COMPASS. The 10 most common space groups found in organic crystals registered in the csd were selected (P21/c, P1, P212121, P21, C2/c, Pbca, Pna21, Pbcn, Cc, and C2). Clustering of the predicted polymorphs was done using the polymorph clustering routine in materials studio.
The PP were repeated four times for each starting conformation to ensure adequate sampling of the crystal configurations. After a final clustering, the calculated crystal structures were ranked on the basis of their total energy. The unit cell parameters and packing energies for the 10 best-predicted structures are shown in.
CSP of flexible molecules involves consideration of the change in molecular geometry (bond lengths, bond angles and torsion angles) and intermolecular interactions simultaneously occurring during structure generation and minimization which is a computationally very intensive exercise and generally results in poor sampling of crystal structures.
To avoid these issues, rigid CSP runs (with input molecular geometry kept fixed during the structure generation and optimisation steps) were performed. The computational search (fig 4) found the experimentally observed polymorph, reproducing the structure well (fig. 5).
Fig. 4: Summary plot of the crystal energy landscape of cp. Each point represents a crystal structure
Fig. 5: a) Superposition of cp conformations observed in various predicted polymorphs and b) Overlay of (a) experimental form (colored by element) with CSP predicted same form (green). The RMSD15 of these overlays are 0.225 Å and 0.177 Å
Table 3: Unit cell parameters and packing energies, for the 10 best-predicted polymorphs of cp
|Z/Z'||Space group||α (Å)||β (Å)||γ (Å)||a ( °)||b ( °)||c ( °)||Density (g/cm3)||∆Epacking (kcal/mol)|
It is in the crystal energy landscape, the low energy region below about −95 kJ mol−1. Above this value, there are a plethora of crystal structures whose relative lattice energies are sufficiently high to suggest they are less likely to be viable experimental forms. Hence this crystal energy landscape is reasonably realistic. The crystal energy landscape shows that there are other structures (fig. 6) which are competitive in energy with the known forms, and this is generally consistent for the observation of polymorphism in future. The crystal structure of the least energy polymorph of cp reveals intra-molecular bonding (fig. 7c) between the acidic carbonyl and cyclic carbonyl. It had also showed an inter-molecular hydrogen bonding and the synthon involved was N—H----O (fig. 7b). The two fused benzene rings of cp molecule within the unit cell containing four asymmetric molecules are π-π stacked. Fig. 7a reveals the crystal packing diagram of the least energy polymorph structure. It obvious from the table 3, that the least energy forms on the landscape should vary from the actual experimental data but the data from CSP studies support crystal engineering by providing an understanding of the link between the motifs observed and molecular structure or crystal composition.
Fig. 6: Packing diagrams highlighting the supramolecular constructs of putative polymorphs of cp. Hydrogen atoms are removed for better visualization
Moreover, the ABC of crystal prediction for a person new in this field; starting from the word of CSP and the methodology involved was discussed. This helps understand for the one to know how computational methods can be used to complement experimental screening data. Recent success in the ccdc blind test 2016 for a highly flexible and drug-like molecule (XX) suggested it was timely to evaluate how CSP might fit into industrial solid form screening. The extension to cover crystals involving different types of molecules in the unit cell (e. g. co-crystals, hydrates and solvates) appears to be straight forward.
Fig. 7: (a) Packing arrangement of most stable predicted polymorph of cp. (b) Inter-molecular H-bond between piperazine–NH and carboxylic–OH. (c) Intra-molecular H-bond between adjacent carbonyl groups within the cp molecule
This article has been framed for a learner, to briefly explain basic methodologies and parameters of computational CSP along with a case study of the industrially relevant compound. The increasing computer power in the last two decades has made reasonable CSP searches possible for an increasing range of molecular systems, and there have been many successful predictions of organic crystal structures. CSP methods are now being used as part of the interdisciplinary range of studies to establish the range of solid forms of a molecule during drug development process. A careful CSP study can either reduce or expand the amount of experimental work needed to determine the full complexity of the solid form landscape of a molecule by either confirming that all practically important polymorphs are known, or suggesting that additional structures that should be targeted. The synergistic approach relating conformational and pp offers a simple and credible way to generate least energy conformer which is then further used to develop the crystal energy landscape. To conclude, an agreement between the CSP and experimental data gives a high degree of confidence in the risk assessment of the solid form selection. Otherwise, an additional polymorph screening to find the most stable form is likely to be required.
The authors gratefully acknowledge the financial support provided by University Grants Commission (UGC), New Delhi; vide letter number F.5-94/2007(BSR) to Pawanpreet Singh as UGC-RFSMS fellowship.
These authors contributed equally to this work
CONFLICT OF INTERESTS
The authors report no declarations of interest
How to cite this article