
1 Introduction

Protein crystallography is a method to visualize biological molecules on an atomic scale. It can be regarded as a form of microscopy with very large magnification. In optical microscopy, the achievable resolution is similar to the wavelength of the illuminating light source. Visible light falls in the 400–700 nm range, and thus the smallest objects viewable are also a few hundred nanometers in size.

The bond lengths between the most common atoms in biological molecules, C, O, N, and H, fall in the range of 100–150 pm. Thus to determine atomic positions the wavelength of the illuminating light source has to be comparable to the distances between atoms. Radiation with the appropriate wavelength is called X-ray radiation.

There are two main challenges related to the method of X-ray protein crystallography. First, protein crystals are usually small (10–1,000 μm) and have a high solvent content (around 50%), resulting in very weak diffracting power. Thus, the use of very intense X-rays becomes necessary. Second, no lenses, such as those used in a conventional visible-light microscope for magnification of the specimen, exist for X-rays of 100–150 pm wavelength. This fact alone necessitates the use of the method of crystallography, where diffraction from a periodic structure is used to form an “image” without a lens. The resulting difficulties of obtaining good protein crystals (1) and solving new structures (2) are the main challenges of the field.

The lack of X-ray lenses also complicates the delivery and focusing of the radiation onto the sample. Section 2 of this chapter describes the generation, manipulation and detection of X-rays as they apply to X-ray macromolecular crystallography. The two most common X-ray sources, namely, X-ray generators and synchrotrons, are explained in Section 2.1 followed by a brief overview of X-ray optics in Section 2.2. Components of the diffraction experiment are described in Section 2.3.

The goal of structure-based drug design (SBDD) is to determine the structures of medically relevant proteins in complex with compounds developed during the drug discovery process. These compounds go through numerous iterations as they progress toward the clinic. In an industrial environment, where multiple projects are being worked on simultaneously, with each of them likely having multiple compound series, the total number of complex structures needed can be rather high (hundreds per year). Taking into account an attrition factor of 5–25 between harvested crystals and crystals yielding a useful structure, thousands of samples need to be handled. This can be done efficiently only by using bioinformatics tools for sample tracking and automation for sample handling. The various practical aspects of high-throughput crystallography, including crystal screening, data collection, and data processing, are described in Section 3.

2 X-Rays

X-rays were discovered in 1895 by the German scientist Wilhelm Conrad Röntgen, who was awarded the first Nobel Prize in Physics in 1901 for his discovery. The term X-ray originates from Röntgen himself because at the time it was an unknown type of radiation. In many languages, including German, X-rays are called Röntgen radiation. X-rays are electromagnetic radiation, just like visible light and radio waves, and this was discovered by the method of crystallography. Max von Laue performed diffraction experiments on crystals using X-rays in 1912 and obtained photographs of diffraction patterns. These patterns are a result of interfering electromagnetic waves with a wavelength similar to the distances between atoms in a crystal. Laue received the 1914 Physics Nobel Prize for this discovery.

Electromagnetic radiation is characterized by its wavelength, λ. X-rays are positioned between vacuum UV radiation and gamma rays, with wavelengths in the range of 10 nm to 10 pm. For radiation with a wavelength shorter than that of visible light, the wavelength is often expressed in terms of the photon energy E, and X-rays fall in the 0.1–100 keV range. Conversion between wavelength and energy can be done using a simple equation: E (eV) · λ (nm) = 1,239.84. It is useful to remember that a wavelength of 1 Å (0.1 nm) corresponds to a photon energy of 12.4 keV, since it is a commonly used wavelength for protein crystallography at synchrotron light sources.
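The conversion between photon energy and wavelength can be sketched in a few lines of Python. This is a minimal illustration that uses only the constant quoted in the text; the function names are chosen here for illustration and do not come from any particular software package.

```python
# Conversion constant quoted in the text: E (eV) * lambda (nm) = 1,239.84
HC_EV_NM = 1239.84


def wavelength_nm_to_energy_ev(wavelength_nm: float) -> float:
    """Photon energy in eV for a wavelength given in nm."""
    return HC_EV_NM / wavelength_nm


def energy_ev_to_wavelength_nm(energy_ev: float) -> float:
    """Wavelength in nm for a photon energy given in eV."""
    return HC_EV_NM / energy_ev


# 1 Angstrom = 0.1 nm, the wavelength commonly used at synchrotrons
energy_kev = wavelength_nm_to_energy_ev(0.1) / 1000.0
print(f"1 A corresponds to {energy_kev:.1f} keV")

# Cu K-alpha photon energy (8,048 eV) converted back to Angstrom
wavelength_a = energy_ev_to_wavelength_nm(8048.0) * 10.0
print(f"8048 eV corresponds to {wavelength_a:.3f} A")
```

The two calls reproduce the rule of thumb above (1 Å is roughly 12.4 keV) and the copper Kα wavelength of about 1.54 Å.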

For protein crystallography, monochromatic radiation is needed. In practice the energy of an X-ray beam is not infinitely well defined, and the term energy bandwidth (ΔE) or energy resolution (ΔE/E) is used to express the energy spread. A monochromator with a resolution of 10⁻⁴ has a bandwidth of 1 eV at 10 keV photon energy. The bandwidth is usually the full-width half-maximum (FWHM) value of a Gaussian distribution.
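As a small worked example, the relation between energy resolution and bandwidth, and between the FWHM and the standard deviation of a Gaussian, can be written as follows. The function names are illustrative; the FWHM-to-sigma factor 2√(2 ln 2) is the standard property of a Gaussian distribution.

```python
import math


def bandwidth_ev(photon_energy_ev: float, resolution: float) -> float:
    """Energy bandwidth (Delta E) for a given resolution (Delta E / E)."""
    return photon_energy_ev * resolution


def fwhm_to_sigma(fwhm: float) -> float:
    """Standard deviation of a Gaussian with the given FWHM."""
    return fwhm / (2.0 * math.sqrt(2.0 * math.log(2.0)))


# Example from the text: resolution of 1e-4 at 10 keV
delta_e = bandwidth_ev(10_000.0, 1e-4)
print(f"Bandwidth (FWHM): {delta_e:.2f} eV")
print(f"Gaussian sigma:   {fwhm_to_sigma(delta_e):.3f} eV")
```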

In addition to bandwidth, beam intensity is another important property of X-rays. It is measured in photons per second (ph/s) and can be accurately determined by a photodiode or an ionization chamber. Since protein crystals are small, it is desirable to focus the X-rays into a spot of a size similar to that of the crystal (typically around 100 μm). However, strong focusing of an extended light source will result in a highly divergent beam behind the focal point and may impair the separation of diffraction spots on the detector surface. Especially for smaller crystals and/or large unit cells, a low beam divergence is advantageous. Based on these practical considerations, a useful measure of beam intensity is the so-called brightness, which defines the beam intensity falling into a unit of solid angle and onto a unit of area (expressed as ph/(s mrad² mm²) at a bandwidth of 0.1%).

In summary, the X-ray beam requirements for protein crystallography experiments are the following: i) high intensity, ii) monochromatic with a narrow bandwidth, iii) tunable energy for multiwavelength anomalous diffraction (MAD) phasing (2), iv) small focal point, v) low divergence, and vi) stability of the light source. Typical parameters of different X-ray sources are listed in Table 1.

Table 1 Typical parameters of synchrotron beamlines in comparison with a high-end beamline (APS ID-23D) (3) and a high-end X-ray generator (Rigaku FR-E+ SuperBright™) (4)

2.1 X-Ray Sources

2.1.1 Physics of X-Rays

There are two fundamental physical phenomena that can lead to the emission of X-rays. Acceleration of charged particles generates a continuous spectrum of electromagnetic radiation, such as Bremsstrahlung and synchrotron radiation. Atomic transitions on the other hand result in the emission of photons with specific wavelengths and narrow bandwidth, such as the characteristic X-rays.

In practice, characteristic X-rays are generated by bombardment of a metal target (e.g., Cu, Al, Mo) with energetic electrons (e.g., 50 keV). These high-energy electrons knock out electrons from the inner shells of the target atoms, thereby ionizing them. This core-hole ionization process leaves the resulting ion in an excited state, which then relaxes to a lower energy state by filling the core hole with an electron from a higher shell. The energy of the emitted photons is the difference of the electron binding energies of the shells between which the transition occurs. The description of the characteristic X-ray lines is based on using K, L, M, … for the principal quantum numbers n = 1, 2, 3, … of atomic shells and α, β, γ, … for the difference Δn = 1, 2, 3, … between shells. Thus, a Kα line denotes an L → K transition and a Kβ line an M → K transition (see Fig. 1, left). In the case of copper, for example, the dominant Kα line has an energy of 8,048 eV (1.541 Å), which is the difference between the binding energies of the K 1s and L 2p shells.

Fig. 1. Spectral distribution of X-rays from an X-ray generator (left, reproduced with permission) (5) and from bending magnet (BM) and wiggler (W) synchrotron radiation sources (right). The bending magnet and wiggler spectral distribution curves were calculated using (6) and the program SPECTRA (7), respectively.

When the acceleration of charged particles results in a change of the magnitude of their velocity, e.g., when energetic electrons hit a surface and slow down due to repulsive Coulomb interactions with the target atoms’ electrons, so-called Bremsstrahlung (“braking radiation”) is generated. The energy distribution of Bremsstrahlung is continuous, with the high-energy (short-wavelength) cutoff determined by the energy of the incoming particles (see Fig. 1, left). Radiation is also generated when the acceleration results in a change of the direction of the charged particles’ velocity, e.g., when charged particles are forced onto a circular trajectory by a strong magnetic field. Radiation created this way was first observed in a particle accelerator called a synchrotron in 1947 (8) and was subsequently named synchrotron radiation (SR). SR also has a continuous distribution (see Fig. 1, right), with the high-energy limit related to the energy of the accelerated particles (Equation 1).

2.1.2 X-Ray Generators

The basic layout of an X-ray tube is shown in Fig. 2. In an evacuated vessel usually made out of glass, electrons are emitted from a heated metal filament (e.g., tungsten), the cathode, and accelerated toward the anode by the high voltage (up to 150 kV) applied between the two electrodes. As the fast electrons hit the anode, Bremsstrahlung and characteristic X-rays are generated as described above and their energy will depend on the accelerating voltage and the anode material used. The X-rays exit through the glass wall and can be used for experiments. These devices are also called sealed tubes, since all the components are sealed into a glass tube (see Fig. 2, right).

Fig. 2. Schematic of a sealed tube X-ray generator (left) and photo of a Coolidge X-ray tube from the early 1900s (right) (9).

The production of X-rays is a very inefficient process and most of the electrons’ energy (∼99%) turns into heat at the anode. This heat load limits the maximum X-ray power possible in a simple sealed tube generator, even if the anode is water-cooled (see Fig. 2). To obtain a small X-ray source size the electrons are focused into a small spot on the anode (e.g., by shaping the cathode appropriately). This focusing increases the local heat load on the anode further.

The heat load problem is much reduced in rotating anode generators, where the anode is a metal disc rotating at high speeds (several thousand revolutions per minute). The target area being hit by the focused electron beam is constantly changing and helps to improve the dissipation of heat. This results in greatly increased performance but the higher technical complexity leads to higher acquisition and maintenance costs for rotating anode systems.

X-ray generators are often characterized by their power rating (watt or kilowatt). This is a measure of the electron beam power hitting the anode (accelerating voltage  ×  current between cathode and anode). Owing to the many factors influencing the usable X-ray power (focal spot size, anode shape, collected solid angle), a higher overall power rating does not necessarily correspond to a more intense X-ray beam at the sample.
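The power rating and the resulting anode heat load follow from simple arithmetic, sketched below. The operating point (50 kV, 100 mA) is a hypothetical example and not the specification of any particular generator; the heat fraction of 99% is the approximate value quoted above.

```python
def electron_beam_power_w(voltage_kv: float, current_ma: float) -> float:
    """Electron beam power at the anode; kV multiplied by mA gives watts."""
    return voltage_kv * current_ma


def anode_heat_load_w(beam_power_w: float, heat_fraction: float = 0.99) -> float:
    """Part of the beam power that ends up as heat in the anode."""
    return beam_power_w * heat_fraction


# Hypothetical operating point, chosen only for illustration
power = electron_beam_power_w(voltage_kv=50.0, current_ma=100.0)
heat = anode_heat_load_w(power)
print(f"Power rating: {power / 1000.0:.1f} kW")
print(f"Heat load:    {heat / 1000.0:.2f} kW")
print(f"X-ray power:  {power - heat:.0f} W (emitted in all directions)")
```

Only a fraction of the remaining X-ray power is collected by the optics, which is why the power rating alone says little about the intensity at the sample.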

X-ray generators fulfill the beam requirements for protein crystallography described above rather well:

1) The highest intensities are obtained from rotating anode sources, and those are indeed the most common devices for protein crystallography in home laboratories. They are well suited for crystals that diffract reasonably well.

2) A characteristic emission line has a specific wavelength with a narrow bandwidth; there are, however, multiple such lines emitted simultaneously, and they are all superimposed on the continuous Bremsstrahlung background (Fig. 1, left). To obtain monochromatic radiation for a crystallography experiment, one of those lines has to be selected by using a monochromator or filter.

3) The energy of an X-ray generator is not freely tunable, but it can be changed to the values of specific emission lines of different metals by changing the anode material. Copper is the most commonly used anode material (Cu Kα at 1.54 Å), while the chromium Kα line at 2.29 Å is emerging as an excellent source for in-house Se and S SAD phasing (10–12). An easy switch between different wavelengths can be achieved by employing stripes of multiple different elements on the same rotating anode (10–12).

4) and 5) The electron beam can be focused into a spot smaller than 100 μm on the anode, resulting in a small source size of the emitted radiation. However, the X-rays are emitted in all directions allowed by the anode geometry and have to be collected and focused onto the sample. The amount of focusing needed depends on how much solid angle of the radiation is collected and on the size of the beam in the focal point, where the sample is located. Capturing more solid angle (to increase flux) and/or focusing into a smaller spot (for small crystals) requires stronger focusing, which leads to a more divergent beam behind the sample. Increased divergence means that diffraction spots that lie close together will overlap and cannot be resolved on the detector surface. There is therefore a trade-off between flux, beam size at the sample, and beam divergence, and those parameters should be optimized for the task at hand. Many modern sealed tube and rotating anode generators are equipped with multilayer optics. These optics are capable of collecting a large solid angle of radiation and efficiently focusing it on a small spot. Additionally, they improve the spectral purity of the X-ray beam. Multilayers are discussed in more detail in Section 2.2.2.

6) Beam stability is not an issue with modern generators, as they can provide consistent output for extended periods of time (multiple days).

2.1.3 Synchrotrons

Although synchrotron radiation was discovered in the late 1940s, its actual utilization for scientific experiments did not start until the 1960s. Those first experiments were carried out at accelerators built for high-energy physics research and laid the foundation for the very successful expansion of the field. Owing to its broad bandwidth and high intensity (see Table 1), synchrotron radiation is utilized in many different fields of scientific research (physics, chemistry, materials science, biology).

Synchrotron radiation is generated in particle accelerators called synchrotrons (see Fig. 3). All modern light sources are built specifically for the generation of high-brightness synchrotron radiation and are not involved in high-energy physics. In a synchrotron, charged particles travel around a roughly circular path in a metal vacuum tube. The particles are kept on a closed trajectory by strong dipole bending magnets, which generate a magnetic field perpendicular to the plane of the accelerator and deflect the particles due to the Lorentz force (see Fig. 4). Quadrupole and hexapole magnets placed along the trajectory keep the particle beam focused because without focusing the repulsion between the particles with the same charge would “blow-up” the beam instantly. The diameter of the beam in the synchrotron is around 100 μm. The lower the emittance (beam size  ×  beam divergence) of the particle beam in the accelerator the higher the brightness the photon beam will have. The energy of the particles is boosted at every turn by a radio frequency (RF) accelerator cavity. To increase the energy of the particles, the magnetic field in the bending magnets and the frequency of the accelerating voltage in the RF cavity have to be raised in a synchronized manner and that is why this type of accelerator is called a synchrotron. The radius of the particles’ trajectory is fixed in a synchrotron, determined by the layout of the vacuum tubes and magnets. This is in contrast to the cyclotron, the circular particle accelerator invented by E. O. Lawrence (1939 Nobel Prize in physics), where the radius changes with increasing energy.

Fig. 3. Schematic representation of a synchrotron radiation facility showing the linac, the booster, and the storage ring. Image courtesy of the Advanced Light Source, Lawrence Berkeley National Lab.

Fig. 4. Dipole bending magnet of the Advanced Light Source Synchrotron (14). The coils of the magnet are made of water-cooled copper tubes.

The emitted power of the synchrotron radiation is proportional to m⁻⁴, where m is the mass of the particle. For high-energy physics experiments, heavy particles (e.g., protons) are used to minimize the energy loss due to synchrotron radiation. Synchrotron light sources, on the other hand, use the much lighter electrons (or sometimes positrons) to maximize the amount of emitted radiation.

The final, highest electron energy is achieved in multiple steps (see Fig. 3). The electrons originate from the electron gun, which contains a thermionic material that easily releases electrons when heated (e.g., barium aluminate). The first step of acceleration happens in a linear accelerator (LINAC), where microwaves are used to increase the electrons’ energy along a straight path to the order of 100 MeV. The electrons are then transferred into the booster synchrotron, which accelerates them to the final, or close to the final, energy, which is typically between 2 and 8 GeV. From the booster the electrons are injected into the much larger storage ring, where electrons are stored and circulated for extended periods of time to generate light for the experiments. The intensity of the radiation is proportional to the number of particles in the storage ring, which is usually quoted as the ring current and typically lies in the range of 100–500 mA (depending on the synchrotron). The storage ring is also a synchrotron (it can accelerate electrons within a limited range) and is specifically designed to produce intense photon beams, as explained in the next two sections. At energies of a few GeV, electrons travel at nearly the speed of light, and a relativistic treatment is required for their theoretical description.
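To connect the ring current quoted in mA to an actual number of stored electrons, the following sketch divides the charge passing a fixed point per revolution by the elementary charge. It assumes that the electrons travel at essentially the speed of light; the circumference of 200 m is a hypothetical round number used only for illustration.

```python
ELEMENTARY_CHARGE_C = 1.602176634e-19   # charge of one electron in coulomb
SPEED_OF_LIGHT_M_S = 299_792_458.0      # electrons are assumed to move at ~c


def stored_electrons(ring_current_a: float, circumference_m: float) -> float:
    """Number of electrons circulating in a ring at the given current."""
    revolution_time_s = circumference_m / SPEED_OF_LIGHT_M_S
    charge_in_ring_c = ring_current_a * revolution_time_s
    return charge_in_ring_c / ELEMENTARY_CHARGE_C


# Hypothetical ring: 200 m circumference, 500 mA ring current
n_electrons = stored_electrons(ring_current_a=0.5, circumference_m=200.0)
print(f"Stored electrons: {n_electrons:.2e}")
```

For these assumed values the result is on the order of 10¹² electrons, and it scales linearly with the ring current.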

Although the vacuum system of the accelerator is kept at a very low pressure (∼10⁻¹⁰ mbar), there are still enough collisions between the electrons and residual gas molecules that the beam decays slowly. Usually, half of the electrons are lost within hours to days (depending on the synchrotron) and have to be replenished by injecting new particles into the ring. To avoid the unwelcome effects of beam decay, such as the constantly changing light intensity, most synchrotron radiation facilities have changed over to “top-off mode” operation in recent years. In top-off mode, a small number of electrons is injected into the storage ring about every minute, keeping the ring current essentially constant. This results in a constant photon flux at the sample and a constant thermal load on the beamline optics.

2.1.3.1 Bending Magnets

As described in Section 2.1.1, acceleration of charged particles results in the emission of electromagnetic radiation. When the electrons circulating in the storage ring pass through a bending magnet they are deflected due to the Lorentz force and emit synchrotron radiation (see Fig. 5, left). The underlying physics is the same as for electrons oscillating in an antenna, thereby creating radio waves for broadcasting or communications.

Fig. 5. Synchrotron radiation emitted from a bending magnet (left, modified from http://hasylab.desy.de/science/studentsteaching/primers/synchrotron_radiation/index_eng.html) and an undulator (right). The opening angle of the cone depends on the ratio of the speed of the electrons v and the speed of light c for bending magnets. For undulators it is additionally influenced by the number of undulator periods N (λ_u is the period length of the undulator). The undulator image is courtesy of the Advanced Light Source, Lawrence Berkeley National Lab.

One of the major advantages of synchrotron radiation is its continuous energy spectrum, which covers a wide spectral range from infrared light to hard X-rays (see Fig. 1, right). Using a monochromator, the best photon energy can be selected for a specific experiment (e.g., matching excitation energies of atoms and molecules), providing unmatched flexibility for scientific research. The spectral distribution is described by the critical photon energy ε_c, defined such that an equal amount of power is emitted above and below ε_c. The critical energy depends on the bending radius r of the electron trajectory in the magnet (or the strength B of the magnetic field) and the particle energy E as follows:

$$ {\epsilon }_{\text{c}}(\text{keV})=2.218\,{E}^{3}({\text{GeV}}^{3})/r(\text{m})=0.6651\,B(\text{T})\,{E}^{2}({\text{GeV}}^{2}).$$
(1)

Raising the critical energy to obtain the hard X-rays needed for diffraction experiments can be achieved either by increasing the electron energy in the storage ring, which results in larger and more expensive machines (see Table 2), or by using stronger magnets. The latter, a significantly simpler and cheaper approach, was realized at the ALS, where three of the normal bending magnets with a magnetic field of 1.3 T were upgraded to 5 T superconducting magnets (13), raising ε_c by a factor of almost 4 (see Table 2; Fig. 1, right).
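The effect of a stronger magnetic field on the critical energy can be checked numerically. The sketch below uses the field form ε_c (keV) = 0.6651 B (T) E² (GeV²), which follows from the radius form of Equation 1 together with the standard relation r (m) = 3.3356 E (GeV) / B (T) for the bending radius of an electron. The electron energy of 1.9 GeV for the ALS is an assumption introduced here; the field values are those quoted above.

```python
def bending_radius_m(energy_gev: float, field_t: float) -> float:
    """Bending radius of an electron in a dipole field (magnetic rigidity)."""
    return 3.3356 * energy_gev / field_t


def critical_energy_kev_from_radius(energy_gev: float, radius_m: float) -> float:
    """Critical photon energy from electron energy and bending radius."""
    return 2.218 * energy_gev**3 / radius_m


def critical_energy_kev_from_field(energy_gev: float, field_t: float) -> float:
    """Critical photon energy from electron energy and magnetic field."""
    return 0.6651 * field_t * energy_gev**2


ALS_ENERGY_GEV = 1.9  # assumed ring energy, not stated in the text above

for field_t in (1.3, 5.0):
    from_field = critical_energy_kev_from_field(ALS_ENERGY_GEV, field_t)
    radius = bending_radius_m(ALS_ENERGY_GEV, field_t)
    from_radius = critical_energy_kev_from_radius(ALS_ENERGY_GEV, radius)
    print(f"B = {field_t:.1f} T: eps_c = {from_field:.2f} keV "
          f"(radius form: {from_radius:.2f} keV, r = {radius:.2f} m)")
```

Under these assumptions the critical energy rises from about 3 keV to about 12 keV, which moves the useful part of the spectrum into the hard X-ray range.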

Table 2 Basic parameters of three representative third generation synchrotron radiation facilities

Another important feature of synchrotron radiation is that it is emitted in a narrow cone in the forward direction. This is due to the relativistic nature of the electrons and provides an inherent primary focusing of the photon beam. The angular width of the cone is approximately 1/γ, where γ = 1/√(1 − v²/c²) is the Lorentz factor of the electrons, and it decreases with increasing electron energy (see Fig. 5).
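To give a feeling for how narrow this cone is, the following sketch evaluates 1/γ for two electron energies within the range typical of storage rings. It uses γ = E/(m_e c²) with the electron rest energy of 0.511 MeV; the two example energies are illustrative choices.

```python
ELECTRON_REST_ENERGY_MEV = 0.510999  # m_e * c^2


def lorentz_factor(energy_gev: float) -> float:
    """Lorentz factor of an electron with the given total energy."""
    return energy_gev * 1000.0 / ELECTRON_REST_ENERGY_MEV


def opening_angle_mrad(energy_gev: float) -> float:
    """Approximate angular width 1/gamma of the radiation cone in mrad."""
    return 1000.0 / lorentz_factor(energy_gev)


for energy_gev in (1.9, 7.0):  # illustrative ring energies
    print(f"E = {energy_gev:.1f} GeV: gamma = {lorentz_factor(energy_gev):.0f}, "
          f"opening angle = {opening_angle_mrad(energy_gev):.3f} mrad")
```

At a distance of 10 m from the source, an angle of 0.27 mrad corresponds to a beam height of only about 2.7 mm.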

Radiation from a bending magnet is linearly polarized in the plane of the accelerator. Above and below the plane, the radiation is circularly polarized with opposing helicities respectively. The variability of the light polarization is yet another very useful aspect of synchrotron radiation, which is exploited in many experiments.

2.1.3.2 Insertion Devices (Wigglers and Undulators)

The amount of light emitted from a bending magnet is limited by the electron current in the storage ring and the acceptance angle of the beamline optics (i.e., how much of the radiation cone can be utilized). Since both of these quantities are limited by practical constraints, and because scientists typically need more flux, new devices had to be developed. These devices, called wigglers and undulators, consist of a series of magnetic dipoles with alternating polarity, which cause the electron beam to “wiggle” or “undulate” back and forth around a straight line and emit synchrotron radiation at every deflection (see Fig. 5, right). Wigglers and undulators can be several meters long and are installed in straight sections of the storage ring (see Fig. 3). Their magnetic field is usually perpendicular to the accelerator plane, and thus, the electrons’ sinusoidal motion is in the horizontal plane.

A wiggler is a periodic structure of multiple bending magnets. Its magnetic field is relatively strong, resulting in large deflections of the electron beam (as compared to the opening angle of the radiation cone). The light emitted from successive poles adds up incoherently, and the intensity increases in proportion to the number of periods N. The photon energy distribution is continuous (see Fig. 1, right) and the total emitted power can be significant (several kilowatts). This requires special precautions to avoid damage to the optical components of a beamline.

The magnetic field in an undulator is relatively weak and the deflections of the electron beam are within the opening angle of the radiation cone. The radiation adds up coherently and the resulting interference produces a spectrum where certain wavelengths are amplified while others are suppressed (see Fig. 6, left). The peaks are called harmonics and their intensity scales with N². This results in a much higher achievable flux than with bending magnets or wigglers. The central cone of undulator radiation is narrower (see Fig. 5, right) than that of bending magnets and wigglers, which also contributes to the higher brightness of undulators. In fact, undulators are currently the brightest sources of X-rays available (see Fig. 6, right). The photon energy of the undulator harmonics depends on the electron energy, the period length of the magnetic structure λ_u (see Fig. 5, right), and the strength of the magnetic field. Because most undulators are made of permanent magnets, the magnetic field strength, and with it the energy of the harmonics, is adjusted by mechanically changing the gap between the top and the bottom arrays of magnets. The photon bandwidth of the first harmonic is given by Δλ/λ = 1/N. This bandwidth is not narrow enough for most experiments, and thus, undulator radiation has to be monochromatized further. For large energy changes, the monochromator settings and the undulator gap are adjusted together to retain maximum flux.
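The different scaling of wiggler and undulator output with the number of periods, and the 1/N bandwidth of the first undulator harmonic, can be illustrated with a short sketch. The number of periods (70) is a hypothetical value, and the intensities are relative numbers that only express the N versus N² scaling described above, not absolute fluxes.

```python
def relative_intensity(n_periods: int, coherent: bool) -> int:
    """Relative on-axis intensity: N for a wiggler, N**2 for an undulator."""
    return n_periods**2 if coherent else n_periods


def first_harmonic_bandwidth(n_periods: int) -> float:
    """Relative bandwidth (delta lambda / lambda) of the first harmonic."""
    return 1.0 / n_periods


N_PERIODS = 70  # hypothetical device, chosen only for illustration

wiggler = relative_intensity(N_PERIODS, coherent=False)
undulator = relative_intensity(N_PERIODS, coherent=True)
bandwidth = first_harmonic_bandwidth(N_PERIODS)

print(f"Incoherent sum (wiggler-like): {wiggler}")
print(f"Coherent sum (undulator peak): {undulator}")
print(f"First-harmonic bandwidth:      {bandwidth:.2%}")
```

A bandwidth on the order of one percent is far wider than the 10⁻⁴ resolution of a typical monochromator, which is why the undulator beam still needs to be monochromatized.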

Fig. 6. Left: Spectral distribution of synchrotron radiation from an undulator source (calculated using the SPECTRA program (7) for an ALS U5 undulator). Right: Brightness of bending magnet, wiggler and undulator sources of the ALS and APS storage rings as compared to X-ray generators. Image reproduced from the “X-ray data booklet” (15), with permission.

2.1.4 Future Sources

Naturally, both X-ray generators and synchrotron radiation sources are steadily evolving and improving in performance. Better cooling, higher reliability of operations, lower maintenance, tighter focusing of the electron beam onto the anode, and better focusing of the X-rays are the main areas where advances have been significant for X-ray generators. At synchrotron radiation sources the introduction of the top-off mode resulted in higher brightness photon beams with constant intensity and continuous operations. Stable, high intensity beams a couple of microns in diameter are now possible. In fact, at many modern beamlines the major limiting factor is radiation damage of the samples and not the intensity or other properties of the X-ray beam.

Among the more revolutionary developments in recent years are two new X-ray sources, one shrinking the synchrotron into a room-sized device and the other pushing brightness into new territories. Both of these light sources will no doubt have a major impact on X-ray science, albeit in very different ways.

The Compact Light Source (“desktop synchrotron”) is being developed by Lyncean Technologies, Inc. in Palo Alto, CA. The basic idea is to replace the conventional undulator with a laser beam (see Fig. 7). The electrons and the laser beam move in opposing directions and the electromagnetic field of the laser acts like an undulator with a very short period length and a large number of periods (20,000). If the laser beam wavelength is 1 μm, 1 Å X-rays can be produced with only 25 MeV electron energy. This dramatic drop in electron energy enables the reduction in size of the accelerator to fit into a home lab. The properties of the emitted radiation are similar to those originating from a regular undulator. The peak photon energy of the undulator harmonics can be changed by tuning the electron energy in the accelerator (since there are no magnets involved there is no gap to change). The X-ray beam can be further focused and monochromatized with standard beamline components used at large synchrotrons.
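The numbers quoted for the Compact Light Source can be checked with a rough estimate. The sketch below assumes the standard on-axis relation for head-on scattering of laser light by relativistic electrons, λ_X ≈ λ_laser/(4γ²), which is not given in the text and neglects higher-order corrections; the function name is illustrative.

```python
import math

ELECTRON_REST_ENERGY_MEV = 0.510999  # m_e * c^2


def electron_energy_mev(laser_wavelength_m: float, xray_wavelength_m: float) -> float:
    """Electron energy needed to reach a given X-ray wavelength.

    Assumes lambda_x = lambda_laser / (4 * gamma**2) for head-on
    scattering, observed on axis.
    """
    gamma = math.sqrt(laser_wavelength_m / (4.0 * xray_wavelength_m))
    return gamma * ELECTRON_REST_ENERGY_MEV


# 1 micrometer laser light converted into 1 Angstrom X-rays
energy = electron_energy_mev(laser_wavelength_m=1e-6, xray_wavelength_m=1e-10)
print(f"Required electron energy: {energy:.1f} MeV")
```

Under this assumption γ is 50, which corresponds to an electron energy of about 25 MeV, in agreement with the value stated above and roughly two orders of magnitude below the energy of a conventional storage ring.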

Fig. 7. Schematic drawing of the Compact Light Source (CLS). Image reproduced with permission from Lyncean Technologies (16).

The other new development is the X-ray laser. Lasers had a major impact on many facets of technology during the past decades. They produce intense, highly collimated, monochromatic, and coherent radiation from the infrared to the ultraviolet range of the electromagnetic spectrum. Achieving these properties at much shorter wavelengths would provide the ultimate research tool for scientists. In a traditional laser, a resonator cavity is made up of an amplifying medium (gas or solid) enclosed between two mirrors. Light travels back and forth between the mirrors many times and in each pass gets amplified by stimulated emission from the medium. Owing to the absorption of X-rays by matter such a laser setup cannot be realized for photons in the kiloelectronvolt range. It is possible, however, to achieve amplification without excitation of atoms or molecules, by utilizing interactions between free electrons and light. Such a device is called a free electron laser (FEL). Furthermore, the amplification has to happen in one pass, because of the lack of suitable mirrors for X-rays. This can be realized by ensuring a very long interaction region between the electrons and the radiation inside a very long undulator. The technical challenges in building an X-ray FEL are enormous due to the extremely tight tolerances and high precision required of all components (e.g., mechanical alignment of components, electron beam properties, magnetic fields).

The first X-ray FEL to become operational is the Linac Coherent Light Source (LCLS) at the Stanford Linear Accelerator Center (SLAC). A LINAC of 1 km length accelerates electrons to ∼14 GeV energy, which then pass through 33 undulators over a 120 m distance (see Table 3) to produce X-rays of ∼8 keV energy. These X-rays are coherent, a billion times brighter than any other source and have a short pulse length of ∼100 fs. The high brightness might allow the determination of the structure of single molecules without the need for crystals (17). The short pulse duration should help overcome radiation damage by generating an image before damage can occur. If such experiments can be realized in the future, they could shed light on the structure of macromolecules that are difficult or impossible to crystallize.

Table 3 Basic parameters of the ALS wiggler, an APS undulator, and a single undulator of the Linac Coherent Light Source (LCLS) (33 of these are combined for the free electron laser (FEL))

2.2 X-Ray Optics

Methods of generating X-rays are described in the previous sections. To perform an actual diffraction experiment on a small macromolecular sample, the photons have to be delivered and tightly focused onto the crystal while minimizing the loss of intensity. The photon beam also has to be monochromatized because even undulator radiation has too large a bandwidth to be used directly. The devices used to focus and monochromatize X-rays are presented in the following sections.

2.2.1 Mirrors

Lenses for visible light are based on refraction: when light passes between materials with different indices of refraction (e.g., between air and glass), its path is deflected according to Snell’s law. The index of refraction of all materials is ∼1 at X-ray wavelengths, and thus a refractive lens thin enough to transmit most X-rays cannot be realized for practical purposes. The index of refraction of vacuum is exactly 1 for all wavelengths, whereas for X-rays it is slightly below 1 for all materials. Thus, if X-rays hit a surface at a small grazing angle (i.e., almost parallel to the surface), they experience a transition to a lower index of refraction and will be reflected. This phenomenon is called “total external reflection” in analogy to total internal reflection for visible light (e.g., when light traveling in water reflects off the water–air boundary). Very high reflectivities can be achieved (see Fig. 8, left) if the grazing angle is sufficiently small and the right material is chosen.

Fig. 8. Left: Reflectivity of mirrors at an incidence angle of 0.1° and a surface roughness of 1 nm coated with different elements. The reflectivity curves were calculated using (18). Depending on the photon energy the appropriate coating material is chosen. Right: Two X-ray mirrors (each 900 mm long × 51 mm wide) of the ALS sector 5 beamlines 5.0.1 and 5.0.3 (19). The mirrors are made of Si and are coated with Rh/Pt. They are shown before being bent to a cylindrical shape.

As a consequence of the low grazing angle the mirrors have to be very long. At a 0.1° angle an incoming parallel beam of 1 mm diameter illuminates a 1 × 573 mm² area on a flat surface. To capture the generally divergent incoming beam, X-ray mirrors are usually a few cm wide and up to 1 m long. Depending on the purpose of the mirror, its shape can be planar to simply deflect the beam, cylindrical to focus in one dimension or toroidal to focus in two dimensions (see Fig. 8, right). There are a number of technical challenges in the manufacturing, metrology and operation of X-ray mirrors. They have to be made of materials that can be machined to very high precision and that have a low thermal expansion (e.g., silicon carbide, Si, fused silica). Cooling channels inside the mirror are often necessary to compensate for the large thermal load from the source, which can distort the mirror surface and defocus the beam. The surface roughness has to be on the order of a few Ångstrom r.m.s. to maintain high reflectivity.
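The footprint quoted above follows from simple geometry: a parallel beam of height h striking a flat surface at grazing angle α spreads over a length of h / sin α. The short Python sketch below reproduces this estimate; the function name is illustrative and not taken from any beamline software.

```python
import math

def mirror_footprint_mm(beam_height_mm: float, grazing_angle_deg: float) -> float:
    """Length of the strip illuminated on a flat mirror by a parallel beam."""
    return beam_height_mm / math.sin(math.radians(grazing_angle_deg))

# A 1 mm high beam at the 0.1 degree grazing angle used in the text.
footprint = mirror_footprint_mm(1.0, 0.1)
print(f"{footprint:.0f} mm")  # prints "573 mm"
```

Doubling the grazing angle roughly halves the required mirror length, which is why the achievable reflectivity at larger angles matters so much for mirror design.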

A cylindrical or other shape of the mirror along the path of the X-rays is achieved by bending the whole mirror with a mechanical system. The bending radius can be as large as several kilometers and the measurement of such a small deviation from planarity requires special metrology based on laser interferometry. When in operation as part of a beamline, mirrors are mounted inside vacuum vessels and have to be remotely adjustable in position, angle and often bending radius with micrometer precision to align and focus the beam onto the sample.

2.2.2 Monochromators

To avoid marked broadening of the diffraction spots, a monochromatic X-ray beam with a resolution of ∼0.1% or better is needed for protein crystallography experiments. However, the X-rays generated by the sources described in Section 2.1 have a broad energy bandwidth for Bremsstrahlung, bending magnet and wiggler radiation, and an energy resolution of a few percent for undulator beams. Some characteristic X-ray lines can be very close to each other, e.g., the Cu K α1 and Cu K α2 lines are 20 eV apart and have widths of 2.3 and 3.3 eV, respectively (20). As a result, the radiation from all these sources has to be monochromatized, either to increase the resolution or to separate nearby emission lines (and also to separate them from the Bremsstrahlung continuum).

2.2.2.1 Crystal Monochromators

For hard X-rays, monochromators are made of crystals with an appropriate lattice constant. Silicon is the most commonly used material for several reasons: very high quality crystals can be obtained in the necessary sizes; it has a low thermal expansion coefficient, so deformation under the heat load of the radiation is limited; and Si(111) and Si(220) have precisely known lattice constants that suit the required X-ray wavelengths (Si(111): 2d = 6.27 Å; Si(220): 2d = 3.84 Å).

The interference between scattered waves of radiation is the basis for diffraction experiments. Waves that are in phase (with a phase shift of n2π, where n is an integer) will constructively interfere. This phenomenon is described by Bragg’s law:

$$ 2d\mathrm{sin}\Theta =n\lambda,$$
(2)

where λ is the X-ray wavelength, Θ is the angle of incidence measured from the atomic layers, d is the spacing between atomic layers, and n is the integer order of the reflection (see Fig. 9). Based on Bragg’s law, the wavelength of the monochromatized radiation can be changed by changing the angle of incidence (in practice by rotating the crystal). With a single crystal monochromator, rotation also moves the diffracted beam and the experimental setup has to be moved accordingly. For some applications this might be acceptable, but for synchrotron beamlines such a setup is very cumbersome (although it exists) and thus single crystal monochromators are used primarily for fixed-energy beamlines (see Fig. 10). To avoid this motion of the monochromatized beam, a double crystal monochromator can be used (see Fig. 11). In this configuration, the beam is bounced between two parallel crystal surfaces, which are rotated together. The incoming and outgoing beams stay parallel during rotation and the outgoing beam stays in the same position. This enables the delivery of a monochromatic beam with tunable wavelength to the same fixed point in space. Parallelism between the two crystals can be ensured either by manufacturing them with high precision from the same block of material (called a channel-cut crystal) or by making the second crystal slightly adjustable. In the so-called sagittally focusing double crystal monochromator, the second crystal is bent perpendicularly to the X-ray beam, thereby also focusing the beam. The energy resolution of beamlines based on Si(111) double crystal monochromators is typically 3–8 × 10⁻⁴.
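To make the relationship between crystal rotation and selected wavelength concrete, the following Python sketch solves Equation 2 for the angle, using the 2d values quoted above for Si(111) and Si(220). The function names are illustrative, and the conversion constant hc ≈ 12.398 keV Å is a standard value that does not appear in the text.

```python
import math

HC_KEV_ANGSTROM = 12.398  # h*c in keV*Angstrom (standard value, not from the text)

# 2d values as quoted in the text, in Angstrom
TWO_D_ANGSTROM = {"Si(111)": 6.27, "Si(220)": 3.84}

def energy_to_wavelength(energy_kev):
    """Photon energy in keV to wavelength in Angstrom."""
    return HC_KEV_ANGSTROM / energy_kev

def bragg_angle_deg(wavelength_angstrom, two_d_angstrom, order=1):
    """Angle between beam and lattice planes that satisfies 2d sin(theta) = n lambda."""
    s = order * wavelength_angstrom / two_d_angstrom
    if s > 1.0:
        raise ValueError("wavelength too long for this reflection")
    return math.degrees(math.asin(s))

# 1.0 Angstrom and the ~8.05 keV Cu K-alpha energy mentioned in the chapter
wavelengths = (1.0, energy_to_wavelength(8.05))
for crystal, two_d in TWO_D_ANGSTROM.items():
    for wavelength in wavelengths:
        theta = bragg_angle_deg(wavelength, two_d)
        print(f"{crystal}: lambda = {wavelength:.3f} A -> theta = {theta:.2f} deg")
```

The `ValueError` branch reflects the physical limit that a reflection cannot select wavelengths longer than 2d/n, which is one reason the usable energy range depends on the crystal chosen.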

Fig. 9.

Reflection of X-rays on a multilayer structure based on Bragg’s law (22). Image reproduced from the “X-ray data booklet” (15), with permission.

Fig. 10.

Schematic layout of the optical components of the three sector 5 beamlines at the ALS. Reprinted with permission from ref. (24). Copyright 1995, American Institute of Physics. The assembly containing mirrors M1 and M2 of the monochromatic side stations 5.0.1 and 5.0.3 is shown in the right of Fig. 8.

Fig. 11.

Channel-cut Si(111) double-crystal monochromator with different incidence angles resulting in monochromatized radiation of different wavelengths (21). Image reproduced with permission.

2.2.2.2 Multilayers

Both X-ray mirrors and crystal monochromators have some inherent limitations, as described above in Sections 2.2.1 and 2.2.2.1. X-ray mirrors can achieve a high reflectivity only at small grazing angles (∼0.1°), resulting in a large reflecting surface. For crystal monochromators, the usable energy range, the incidence angle and also the energy resolution depend on the lattice constant d (Equation 2), which is an intrinsic property of the material being used. It would be very beneficial for scientific research if the monochromator/mirror properties could be “tuned” to match the requirements of the experiment at hand. Optical components based on multilayers offer this kind of flexibility for a variety of applications (22).

Multilayers consist of alternating layers (tens to hundreds) of high and low-Z materials with thicknesses of a few nm (see Fig. 9). The thickness of a double-layer corresponds to the lattice spacing d in a crystal and the reflection of light from a multilayer is also described by Bragg’s law (Equation 2). Depending on the materials used, the period length, the ratio of the two materials within a double-layer and the number of periods, multilayers can be manufactured for a wide energy range (13 eV to 21 keV (23)). Multilayers act as mirrors and monochromators at the same time, with a bandwidth in the 0.1–10% range and reaching reflectivities close to 100%. In the X-ray region, the main focus has been the development of high-reflectivity multilayers for Cu K α radiation (∼8,050 eV) with tungsten often used as the high-Z material and boron carbide (B4C) or silicon as the low-Z material (23). For example, a W/C multilayer with 200 layers of 3 nm thickness has a reflectivity of ∼80% and a bandwidth of ∼3% at 1.5° grazing angle (23).
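The W/C example can be checked against Equation 2. The Python sketch below computes the double-layer period that reflects Cu K α radiation in first order at a 1.5° grazing angle. It assumes that the quoted 3 nm refers to the double-layer period and it neglects refraction inside the multilayer; the constant hc ≈ 1239.84 eV nm is a standard value not given in the text, and the function name is illustrative.

```python
import math

HC_EV_NM = 1239.84  # h*c in eV*nm (standard value, not from the text)

def multilayer_period_nm(energy_ev, grazing_angle_deg, order=1):
    """Period d satisfying 2 d sin(theta) = n lambda (refraction neglected)."""
    wavelength_nm = HC_EV_NM / energy_ev
    return order * wavelength_nm / (2.0 * math.sin(math.radians(grazing_angle_deg)))

# Cu K-alpha (~8,050 eV) at the 1.5 degree grazing angle quoted in the text
period = multilayer_period_nm(8050.0, 1.5)
print(f"required period: {period:.2f} nm")
```

Under these assumptions the period comes out just under 3 nm, consistent with the layer thickness quoted for the W/C multilayer. It also illustrates why multilayers can work at grazing angles more than ten times larger than those of a total-reflection mirror.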

New X-ray generators for crystallography are equipped with multilayer optics to efficiently collect and focus a large solid angle of radiation from the anode and to suppress neighboring emission lines and the continuous background at the same time. In this application, multilayers are often further optimized by depositing them on a curved surface for beam focusing and varying the layer thickness either laterally or in depth. At synchrotron beamlines multilayers are used instead of crystal monochromators when high flux is needed and low bandwidth is sufficient (e.g., for small-angle X-ray scattering experiments).

2.3 Beamline

The purpose of a beamline is to deliver a focused and monochromatized X-ray beam to the sample from the source inside the storage ring. The beam properties at the sample are therefore a combination of the source parameters and the beamline specifications. A typical beamline consists of one or more X-ray mirrors, a monochromator, beam diagnostics (such as intensity and position monitors), and beam-defining slits. All components are installed in vacuum vessels and are connected by vacuum tubing to minimize absorption of the beam and damage to the optical components. The beamline is usually separated from the main storage ring vacuum system by a beryllium window (Be is practically transparent to hard X-rays).

Some technical feats in the construction and operation of a synchrotron beamline are transparent to the end user. All optical components, and the vacuum tanks they are mounted in, have to be installed in the right location with micrometer precision requiring sophisticated metrology. To optimize the beam characteristics once the beamline is operational, the optical components have to be adjusted remotely with high precision, reliability and reproducibility. The deformation of mirrors and monochromators due to the heat load has to be minimized by efficient cooling systems, which sometimes use liquid nitrogen as coolant. Lastly, to provide a stable and optimal beam for an extended period of time to perform the actual diffraction experiment, active feedback systems have to keep the different components in alignment.

Two typical beamline designs are illustrated in Fig. 10. A wiggler source at sector 5 of the ALS provides a wide fan of radiation feeding three separate beamlines (25). One of them (BL5.0.2 in the middle) is an adjustable wavelength MAD beamline consisting of a cylindrically bent prefocusing mirror, a cryogenically cooled Si(111) double crystal monochromator and a toroidal refocusing mirror. This mirror–monochromator–mirror design is common for MAD beamlines, although there are many variations in the details. It gives good performance and adjustability without losing too much beam intensity. The other two beamlines (BL5.0.1 and BL5.0.3) are monochromatic, each consisting of a single cylindrical mirror and a single cylindrically bent Si(220) monochromator crystal. Such a simple design is possible by using both optical elements also for focusing covering both the horizontal and vertical planes. Details of the current performance of the ALS sector 5 beamlines can be found in (26).

All the X-ray beam requirements for protein crystallography listed above are very well fulfilled by modern synchrotron beamlines, as can be seen from the typical parameters of beamlines listed in Table 1. An up-to-date list, including detailed specifications, of structural biology synchrotron beamlines worldwide can be found on the Biosync Web site (27). More details about macromolecular crystallography beamlines can be found elsewhere (28).

2.3.1 Experimental Station (Endstation)

2.3.1.1 Overview

The section of the experimental setup where the diffraction experiment takes place is called the endstation. A strict separation between the beamline and the endstation is not possible due to the proximity of the components. The main optical elements (mirrors and monochromator) and their supporting systems can be considered part of the beamline, while the components near the sample comprise the endstation.

Owing to the health hazards of hard X-rays and their weak absorption by air, the endstation is located inside a radiation-shielded hutch. The hutch is separated from the rest of the beamline by a beam shutter, which can be opened through a series of interlocks only when nobody is inside. To avoid frequent entries into the hutch, which can be rather time consuming, most of the endstation components can be remotely controlled from the outside. An overview of the ALS beamline 5.0.3 endstation is shown in Fig. 12. A description of the main parts is given in the following sections.

Fig. 12.

Overview of the BL5.0.3 endstation at the ALS. The monochromator crystal is mounted in the back of the hutch and the X-rays are moving toward the charge-coupled device (CCD) detector. A detailed description of this endstation can be found in ref. (29).

2.3.1.2 Fast Shutter, Collimator, Slits, Beamstop, Intensity Monitor

The final steps of beam manipulation happen in the endstation. A fast shutter is used to accurately control the time during which the crystal is exposed to X-rays while it is being rotated through the required oscillation angle. It is located upstream of the collimator. Precise coordination of the goniometer rotation and shutter actuation is required to obtain high quality data.

To decrease the beam size beyond the focusing capabilities of the beamline optics, collimators with fixed apertures (e.g., 100 μm diameter) or adjustable slits are used. Fixed diameter apertures are small metal discs with high-precision, laser-drilled holes. A scatter guard is mounted downstream from the collimator to block the X-rays that scatter off the edges of the collimator aperture. The scatter guard is also a metal disc or a metal tube with an opening larger than the collimating aperture (30). The collimator–scatter guard assembly is located close to the sample (∼1 cm), leaving just enough room for mounting/dismounting of crystals while not disturbing the flow of the cold stream.

While the collimator decreases the beam size at the sample, the beam size downstream of the sample (i.e., the size of the diffraction spots) can be reduced by decreasing the beam divergence. This might be necessary when the spots overlap on the detector surface due to a large unit cell or high mosaicity. Beam divergence is adjusted with slits upstream of the collimator and typical values are in the 1–3 mrad range (see Table 1).
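A rough feel for the effect of divergence can be obtained from a small-angle estimate in which the beam width grows linearly with distance from the sample. The Python sketch below is only such an estimate: it assumes the beam diverges from the sample position with the full divergence angle, ignores crystal mosaicity and the detector point spread, and uses an illustrative 100 μm beam and 250 mm sample-to-detector distance. The function name is hypothetical.

```python
def spot_size_mm(beam_size_mm, divergence_mrad, distance_mm):
    """Small-angle estimate of the beam width after travelling distance_mm."""
    return beam_size_mm + divergence_mrad * 1e-3 * distance_mm

# 100 um beam at the sample, detector 250 mm downstream (illustrative values)
for divergence in (1.0, 2.0, 3.0):
    size = spot_size_mm(0.1, divergence, 250.0)
    print(f"{divergence:.0f} mrad -> {size:.2f} mm")
```

Even at the low end of the quoted range, the divergence term dominates over the collimated beam size at typical detector distances, which is why closing the slits can separate spots that a smaller collimator alone cannot.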

A beamstop is installed downstream of the sample to block the direct beam from striking the detector. Since the direct beam is orders of magnitude more intense than the diffraction spots, not blocking it (or not entirely blocking it) would generate background radiation that makes diffraction measurements impossible and could possibly damage the detector. The beamstop is usually a metal disc with the proper dimensions and absorption characteristics for a given beamline. Some beamstops let a small portion of the direct beam bleed through, which can be helpful to determine the accurate beam position on the detector surface.

To optimize the beam intensity for the diffraction experiment, it has to be measured close to the sample position and downstream of the collimator. A retractable photodiode downstream of the beamstop is often used. Since this requires the removal of the beamstop from the beam and insertion of the diode into the beam, it cannot monitor the intensity during data collection. To overcome this problem, Ellis et al. (31) designed a beamstop with an integrated photodiode, which is shown in Fig. 13.

Fig. 13.

Sample area on beamline 11-1 of the SSRL. Reprinted with permission from ref. (31). Copyright 2003, International Union of Crystallography.

2.3.1.3 Goniometer and Crystal Positioning

The goniometer is the endstation component that holds and rotates the crystal in the X-ray beam during data collection. Single-axis goniometers are most common, where the rotation axis is perpendicular to the X-ray beam and usually lies in the horizontal plane. Rotation is provided by an electric motor coupled with a high precision bearing. Modern beamlines use so-called air bearings, where a thin layer of pressurized air keeps the fixed and moving parts separated. As a result, a highly variable rotational speed, high angular resolution, and a low circle of confusion can be achieved (0.01–360°/s, 0.00005°, and <1 μm, respectively, on BL5.0.3 at the Advanced Light Source (29)). The circle of confusion defines how precisely the sample can be kept in one place during rotation and it becomes increasingly important as crystal and beam sizes decrease.

To remotely center the sample in the X-ray beam, a small motorized positioning stage (x,y,z or kinematic) is mounted on the goniometer (see Figs. 12 and 13). This allows movement of the crystal in all three dimensions by a few millimeters. It is controlled remotely through a point-and-click graphical user interface, which, in conjunction with the fast goniometer rotation, allows the crystal to be centered within a few seconds. A magnet mounted on the goniometer shaft holds the sample in place during data collection.

2.3.1.4 Imaging System

Once the sample is mounted on the goniometer it has to be visualized for centering into the X-ray beam and for general inspection (e.g., if there is a crystal in the loop or if the sample is covered with ice). Depending on the beamline, one or more high-magnification video cameras are pointed at the crystal (see Figs. 12 and 13) and their images are displayed on monitors outside the hutch. On newer beamlines in-line viewing is common, so that the path of the incoming X-rays coincides with the video camera viewing direction. Since a camera cannot be mounted in the path of the X-rays, this is accomplished by the use of a small prism or a mirror with a hole for the transmission of the X-rays. To obtain a clear and well-illuminated image, there are usually lights installed both behind and in front of the sample and their intensities can be varied as required.

2.3.1.5 Sample Cooling

Exposure of macromolecular crystals to X-rays can destroy the crystalline order very rapidly at room temperature. The process of radiation damage can be dramatically reduced by cooling the sample to cryogenic temperatures. For this reason, virtually all protein crystallography experiments today are performed on cryocooled crystals using liquid nitrogen (LN). Owing to its suitable boiling temperature (77.4 K), low cost, and wide availability, LN is well suited for this purpose.

From the time of freezing (usually in the home lab) until exposure to X-rays, crystals must stay cold at all times. This includes sample shipping and handling at the beamline (see Fig. 14). Once the crystal is mounted on the goniometer, it is cooled by a cold nitrogen gas stream with a temperature of 90–100 K. The cold stream is generated by a device called a cryojet or cryostream (see Figs. 12 and 13), which consists of the head containing the nozzle and a separate LN tank. The head is mounted near the sample area (see Fig. 12) with the nozzle pointing at the sample (see Fig. 13). The nitrogen gas is first heated and dried and then cooled down to the required temperature, which can be accurately controlled to better than 1 K and sustained at the sample position (which is 10–15 mm from the tip of the nozzle). To assure an undisturbed laminar flow of the cold nitrogen stream and to avoid sample icing, the cold stream is shrouded in an outer gas stream of dry nitrogen or air. Since this outer stream is not at cryogenic temperature, it is very important that the sample stays well inside the cold stream (ensured by proper alignment).

Fig. 14.

Sample holder cassettes. The ALS “pucks” (left) can hold 16 samples each while the SSRL cassette (right) has room for 96 crystals. The Berkeley robot can accommodate six pucks while the Stanford robot can be loaded with three cassettes. Details about these systems can be found in refs. (29, 33).

2.3.1.6 Sample Mounter

High-throughput SBDD requires the handling of a large number of protein crystals at the beamline and this can be accomplished efficiently only by using automation. The fact that the experiments must be carried out inside a radiation-shielded hutch further necessitates the use of automated sample handling. Multiple systems have been developed over the past decade (29, 32, 33) and by now many X-ray generators and most synchrotron beamlines are equipped with one. An up-to-date list of current systems is available on the Robosync Web site (34).

The main challenge in the development of sample mounting systems was the requirement that the crystal must stay at (or close to) liquid nitrogen temperatures at all times. This means that the sample has to be enclosed in a cold environment while in transit between the sample-holding Dewar and the goniometer, and the mounting and dismounting processes have to happen within a few seconds to avoid warming up. The Berkeley automounter system (BAM), shown in Fig. 12 as part of the BL5.0.3 endstation setup, can mount or dismount in approximately 5 s by using a sample gripper mounted on pneumatic stages (29). Crystals are stored in a Dewar that can hold 96 samples in liquid nitrogen.

Along with the sample mounters, containers for the easy storage, transport, and handling of a large number of frozen crystals have been developed. Two such systems are shown in Fig. 14. These cassettes fit into common dry shipping containers widely used to send crystals to the synchrotron.

2.3.1.7 X-Ray Area Detector (CCD)

The purpose of a crystallographic experiment is the accurate determination of the intensities and positions of the diffracted X-rays. Because of the limited availability of synchrotron beamtime and the decay of crystals due to radiation damage the measurements need to be fast, accurate, and efficient. A detector with a large surface area can measure a large number of spots simultaneously and a short readout time can reduce the duration of the experiment. A high enough spatial resolution is required to separate nearby diffraction peaks and to determine the positions accurately. The detector has to have low noise, has to be sensitive enough to measure weak spots, and has to have a large dynamic range to capture weak and strong peaks at the same time.

Charge-coupled device (CCD) based instruments fulfill the above mentioned requirements and currently they are the most common X-ray detectors at structural biology beamlines. The CCD was invented in 1969 by W. Boyle and G. E. Smith (2009 physics Nobel Prize). CCDs are semiconductor based devices containing wells (capacitors) to store charge and have the ability to shift charge between neighboring wells. Originally intended as a memory device (“shift register”), it was soon realized that charge can be generated by light through the photoelectric effect. This discovery laid the foundation for the electronic capture of images, where each well corresponds to a pixel. CCDs can capture light from the near infrared to the ultraviolet and they are widely used in various applications. Once the CCD has been exposed to light, the image is read out by shifting the charge from well to well by rows toward the edge, where each pixel is amplified and digitized. To speed up this sequential readout process, different parts of the CCD can be read out at the same time. Signal-to-noise can be increased by lowering the noise of the sensor, which can be achieved by cooling the CCD. This is especially important in scientific applications; in astronomy CCDs are sometimes cooled with LN, while the detectors for crystallography use thermoelectric cooling (Peltier elements) in conjunction with water cooling to operate at ∼−50°C.

For protein crystallography X-rays have to be detected over a large surface area, but CCDs are not sensitive to X-rays and are limited in size to approximately 30–40 mm2 (tiling 100 chips together would be prohibitively expensive). X-ray detection is achieved by using a phosphor-coated screen, which converts X-ray photons into visible light. The generated image is reduced in size by a factor of 3–4 through a fiberoptic taper, to which the CCD is attached (see Fig. 15). Tiling 4, 9, or 16 such units together in a 2  ×  2, 3  ×  3, or 4  ×  4 array can provide a large sensitive surface with a limited number of CCDs. The specifications of some CCD detectors in use today are listed in Table 4.
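The benefit of the fiberoptic taper can be illustrated by counting the chips needed to cover a given detector face. The Python sketch below is purely illustrative: it assumes a hypothetical square chip 30 mm on a side and a hypothetical 300 mm square detector face, neither of which is taken from Table 4, and the function name is not from any vendor software.

```python
import math

def chips_needed(detector_side_mm, chip_side_mm, taper_ratio=1.0):
    """Number of square CCD chips needed to tile a square detector face.

    taper_ratio is the demagnification of the fiberoptic taper; 1.0 means
    the chips would be tiled directly with no taper.
    """
    per_side = math.ceil(detector_side_mm / (chip_side_mm * taper_ratio))
    return per_side ** 2

# Hypothetical 300 mm detector face and 30 mm chips
for taper_ratio in (1.0, 3.0, 4.0):
    n = chips_needed(300.0, 30.0, taper_ratio)
    print(f"taper {taper_ratio:.0f}:1 -> {n} chips")
```

With these assumed numbers, direct tiling requires on the order of 100 chips, whereas a taper in the quoted 3–4 range reduces this to a 3 × 3 or 4 × 4 array.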

Fig. 15.

Schematic overview of a CCD area detector for the measurement of X-rays (35). Image reproduced with permission.

Table 4 Parameters of X-ray area detectors for macromolecular crystallography (35–37)

Pixel array detectors started to appear recently on protein crystallography beamlines. These devices can detect X-rays directly without the need of conversion to visible light and the active area can be made rather large. They have a very short readout time and large dynamic range; however, their pixel size is rather large and the technology is still new. Specifications of the commercially available Pilatus 6M pixel array detector are also included in Table 4.

Usage of CCD detectors for data collection is described in Section 3.4, along with the problem of overloaded pixels, dark images, and detector binning.

3 High-Throughput Data Collection

Many steps of the structure determination process have become increasingly automated in recent years. These developments have been driven by the government funded structural genomics projects (38, 39) and the pharmaceutical/biotechnology industry’s desire to implement rational drug design strategies (40). Sample handling for X-ray data collection is no exception and at present many X-ray generators and synchrotron beamlines are equipped with crystal mounting robots and sample tracking systems. The application of these systems in a high-throughput environment is described in the following sections with a focus on obtaining protein–ligand complex structures for SBDD projects at synchrotron beamlines. For the sake of simplicity, it is assumed that the beamlines used are equipped with a single-axis goniometer, a CCD X-ray area detector, and a sample mounting robot.

Takeda San Diego, Inc. (formerly Syrrx, Inc.; “TSD”) has been a leader in the field of high-throughput SBDD since its inception in 2000. Its state-of-the-art gene-to-structure platform (41–43) has been employed to determine a number of de novo protein structures that have been implicated in various diseases (44–49) and allowed production of many protein cocrystal structures in complex with drug candidates to guide the SBDD process. TSD’s efforts have resulted in multiple clinical candidates (50) and a drug on the market (51).

As part of TSD’s protein crystallography platform, all the relevant information on the gene-to-structure process, including the details of crystal harvesting, X-ray screening, data collection, processing, and structure solution, is systematically captured in a centralized database. The author was able to analyze data that has been stored in TSD’s database. Throughout Section 3, results of this analysis are used to highlight certain aspects of the topics being described.

The need for sample handling automation and sample tracking can be clearly demonstrated by analyzing TSD’s data regarding the attrition rate from crystals harvested into loops to structures solved. At TSD, of a representative selection of approximately 20,000 harvested crystals (100%) ∼18,400 were screened (92%) resulting in ∼3,000 datasets (13%) and ∼800 (co)-crystal structures (4%). This corresponds to a total attrition rate of ∼25, underscoring the need to be able to handle a large quantity of crystals efficiently to obtain the desired number of high-quality structures.
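The attrition rate translates directly into the number of crystals that must be handled to reach a target number of structures. The Python sketch below uses the approximate TSD counts quoted above; because those counts are rounded in the text, percentages recomputed from them may differ slightly from the quoted ones. The target of 300 structures is a hypothetical figure chosen only for illustration, and the function names are not from the TSD database.

```python
def attrition_rate(harvested, structures):
    """Harvested crystals needed, on average, per solved structure."""
    return harvested / structures

def crystals_to_harvest(target_structures, rate):
    """Crystals to harvest for a desired number of structures at a given attrition."""
    return round(target_structures * rate)

rate = attrition_rate(20_000, 800)
print(rate)  # prints 25.0

# Hypothetical yearly target of 300 complex structures
print(crystals_to_harvest(300, rate))  # prints 7500
```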

3.1 Preparations

The specifications of endstations and beamlines for protein crystallography can vary considerably (see Table 1), thus the requirements of a specific project have to be examined before deciding which facility to use. MAD experiments require tunable wavelength whereas most cocomplex work can be done at monochromatic sources. For small crystals (<20 μm) a small and intense beam is more advantageous (30). Crystals with large unit cells benefit from a beam with low divergence, a large detector with 2-Θ capability and a large sample-to-detector distance to separate nearby spots. On the other hand, crystals that diffract well require a small sample-to-detector distance to capture the high-resolution data.

For high-throughput X-ray data collection a sample mounting robot is essential. The different robots found at synchrotron beamlines handle different kinds of storage cassettes (see Fig. 14) and cryoloops and one should bring samples in a compatible format. It is often possible to borrow the cassettes and tools needed for a specific beamline in case the user does not own them.

It is highly recommended to double-check beamline specifications with staff before deciding on a specific facility, as listed/published features can often change. It is also a good idea to confirm the availability of needed features at the beginning of a data collection run to avoid unpleasant surprises “in the middle of the night.” Scientists from commercial entities should bear in mind that there are special rules and extra costs for proprietary research at government-run synchrotron sources and those should be checked beforehand.

3.2 Sample Tracking

Automated handling of samples only works efficiently in parallel with some form of crystal tracking system, which can be as simple as a spreadsheet. Most beamlines equipped with an automounter will let the user upload spreadsheets with crystal information into the beamline control system (templates are usually available from the facility’s Web site). The minimum information needed for work at the beamline is a list of all samples containing unique crystal identifiers and the locations of the crystals, i.e., information on which cassette and which position in the cassette they are stored.
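A spreadsheet with this minimum information is easy to check before it is uploaded. The Python sketch below reads a comma-separated sample list and reports duplicate crystal identifiers and doubly assigned cassette positions. The column names and identifier format are hypothetical; actual templates differ between facilities and should be taken from the beamline's own documentation.

```python
import csv
import io

REQUIRED_COLUMNS = ("crystal_id", "cassette", "position")  # hypothetical column names

def check_sample_sheet(csv_text):
    """Return a list of problems found in a sample spreadsheet (empty if none)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    missing = [c for c in REQUIRED_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        return [f"missing column(s): {', '.join(missing)}"]
    problems, seen_ids, seen_slots = [], set(), set()
    for line_no, row in enumerate(reader, start=2):  # line 1 is the header
        crystal_id = (row["crystal_id"] or "").strip()
        slot = ((row["cassette"] or "").strip(), (row["position"] or "").strip())
        if not crystal_id:
            problems.append(f"line {line_no}: empty crystal identifier")
        elif crystal_id in seen_ids:
            problems.append(f"line {line_no}: duplicate crystal identifier {crystal_id}")
        if slot in seen_slots:
            problems.append(f"line {line_no}: slot {slot} already occupied")
        seen_ids.add(crystal_id)
        seen_slots.add(slot)
    return problems

# Made-up example sheet; the last row repeats both an identifier and a slot.
sheet = """crystal_id,cassette,position
XTAL-0001,PUCK-A,1
XTAL-0002,PUCK-A,2
XTAL-0002,PUCK-A,2
"""
for problem in check_sample_sheet(sheet):
    print(problem)
```

Catching such errors before the cassettes are loaded avoids mounting the wrong crystal or attributing screening results to the wrong sample.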

The bulk of TSD’s X-ray data collection work has been performed at ALS beamline 5.0.3. In order to streamline operations and enable fast and accurate information exchange, the TSD database is securely connected over the internet to the MySQL database at BL5.0.3. This has allowed basic crystal information to be sent from TSD to the beamline, and crystal screening and data collection details to be transferred back to TSD. By contrast, when a beamline with no direct connectivity is utilized, spreadsheets or direct entry need to be used.

Once the cassettes containing the samples have been loaded into the automounter Dewar and the crystal information has been uploaded into the beamline control system, the actual work on crystal screening and data collection can begin.

3.3 Crystal Screening

The purpose of crystal screening is to find the best crystal in terms of diffraction quality from a group of similar crystals, such as crystals of proteins complexed with the same compound.

3.3.1 Crystal Screening Steps

The typical crystal screening workflow comprises the following steps for each sample:

  1. Mount crystal from LN dewar onto goniometer.
  2. Center crystal in the X-ray beam.
  3. Obtain diffraction images, typically two shots 90° apart.
  4. Evaluate diffraction.
  5. Pause for data collection or further screening.
  6. Dismount crystal from goniometer and return it to the dewar.
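The steps above can be sketched as a simple control loop. The Python example below is a minimal simulation, not an interface to any real beamline control system: the `SimulatedEndstation` class and all of its method names are hypothetical stand-ins that merely record the actions requested.

```python
from dataclasses import dataclass, field

@dataclass
class SimulatedEndstation:
    """Stand-in for a beamline control system; every method name is hypothetical."""
    log: list = field(default_factory=list)

    def mount(self, sample):
        self.log.append(("mount", sample))

    def center(self, sample):
        self.log.append(("center", sample))

    def snapshot(self, sample, phi_deg):
        self.log.append(("snapshot", sample, phi_deg))
        return {"sample": sample, "phi_deg": phi_deg}

    def dismount(self, sample):
        self.log.append(("dismount", sample))

def screen_samples(endstation, samples, evaluate, keep_mounted=lambda result: False):
    """Run the six screening steps for each sample and collect the evaluations."""
    results = {}
    for sample in samples:
        endstation.mount(sample)                     # step 1
        endstation.center(sample)                    # step 2
        images = [endstation.snapshot(sample, phi)   # step 3: two shots 90 deg apart
                  for phi in (0.0, 90.0)]
        results[sample] = evaluate(images)           # step 4
        if keep_mounted(results[sample]):            # step 5: pause, sample stays mounted
            break
        endstation.dismount(sample)                  # step 6
    return results

station = SimulatedEndstation()
# evaluate=len is a placeholder that simply counts the images taken
results = screen_samples(station, ["XTAL-0001", "XTAL-0002"], evaluate=len)
for action in station.log:
    print(action)
```

In a real protocol the `evaluate` callback would score the diffraction images, and `keep_mounted` would implement the decision to pause for data collection instead of moving on to the next crystal.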

Mounting and dismounting of the crystals is done by the sample handling robot installed at the endstation. The crystal is usually centered by the user through a “point-and-click” interface but several beamlines also offer automated crystal centering routines. There are different ways to accomplish automated centering, such as centering the cryoloop based on its outline or centering the crystal based on X-ray diffraction or fluorescence (52–54). Manual centering is in most cases faster and more reliable, whereas automated centering may be most efficient when incorporated into a fully automated crystal screening protocol.

The settings for the screening of diffraction images should be similar to typical data collection settings on that beamline. These can be learned from beamline staff. As a rule of thumb, a wavelength around 1 Å, spindle oscillation angle of 1°, exposure time of a few seconds and a detector-to-sample distance of 250 mm (for a detector size in the 300–325 mm range) are, in general, appropriate parameters. Keep these settings the same for a group of similar crystals for accurate comparison. To speed up the screening process, dismounting of a crystal and mounting of the next one should proceed while the snapshots are being evaluated.
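The rule-of-thumb settings can be related to the resolution recorded at the detector edge by combining the detector geometry with Bragg's law (Equation 2). The Python sketch below assumes a flat, square detector centered on the direct beam with no 2-Θ offset, and uses a 315 mm detector side as an example within the quoted 300–325 mm range; the function name is illustrative.

```python
import math

def edge_resolution_angstrom(wavelength_angstrom, distance_mm, detector_side_mm):
    """Bragg spacing recorded at the edge of a centered detector with no 2-theta offset."""
    two_theta = math.atan((detector_side_mm / 2.0) / distance_mm)
    return wavelength_angstrom / (2.0 * math.sin(two_theta / 2.0))

# 1 Angstrom wavelength, 315 mm detector, three sample-to-detector distances
for distance in (150.0, 250.0, 400.0):
    d_min = edge_resolution_angstrom(1.0, distance, 315.0)
    print(f"{distance:.0f} mm -> {d_min:.2f} A")
```

With these assumptions, the 250 mm setting records data to roughly 1.8 Å at the detector edge; moving the detector closer extends the resolution limit, while moving it back spreads the spots further apart.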

Sometimes during screening the diffraction quality of the mounted sample is good enough to collect a dataset without screening more crystals and most screening protocols will pause before dismounting the sample. This also offers the opportunity to collect additional screening images if needed. Large samples should be tested at different areas of the crystal as some regions might diffract better than others. Multiple crystals in the same loop or hardly visible small crystals can require testing multiple areas of the cryoloop. For this purpose, some beamlines are equipped with automated “raster” screening protocols, where a loop is divided into a grid and every element of the grid is tested for diffraction.
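The raster idea can be illustrated with a minimal sketch that divides a rectangular area around the loop into cells and lists the cell centers to be tested. The function name, the rectangular area and the coordinate convention are assumptions made for illustration only; actual beamline software will differ.

```python
import math

def raster_grid(width_um, height_um, cell_um):
    """Return the (x, y) centers of grid cells covering a rectangular area.

    Coordinates are in micrometers from the lower-left corner of the area;
    the rectangle and the coordinate convention are illustrative assumptions.
    """
    n_cols = math.ceil(width_um / cell_um)
    n_rows = math.ceil(height_um / cell_um)
    return [((col + 0.5) * cell_um, (row + 0.5) * cell_um)
            for row in range(n_rows)
            for col in range(n_cols)]

# A 200 x 100 um area scanned with 50 um cells gives 4 x 2 = 8 positions.
cells = raster_grid(200, 100, 50)
print(len(cells), cells[0])
```

Choosing the cell size close to the beam size means every part of the loop is exposed once.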

3.3.2 Evaluation of Screening Images

When evaluating the screening images, the type of diffraction should be determined first: no diffraction, salt or protein. No diffraction means that there is either no crystal in the loop or that there is one but it is not positioned properly in the X-ray beam. If in any doubt, it is advisable to recenter the crystal and collect more images. Owing to their small unit cells, salt crystals produce few, but very strong diffraction spots (often overloading the detector) and thus are easy to identify in most cases (see Fig. 16, left).

Fig. 16. Diffraction image from a salt (left) and a protein crystal (right). All diffraction images were recorded by an ADSC Q315r CCD area detector on beamline 5.0.3 at the Advanced Light Source and displayed by the ADXV image viewer program (55).

Diffraction from a protein crystal (see Fig. 16, right) should always be analyzed carefully. Even if the diffraction quality is not adequate for data collection, accurate screening results will aid further crystallization trials. Figure 17 shows a few spots arranged in rows close to the beamstop, clearly indicating low-resolution diffraction from a protein crystal, and this finding should be noted in the screening results. Typical parameters to be determined from the screening snapshots are resolution limit, mosaicity, diffraction strength, spot shape, anisotropy, and the presence of ice rings.

Fig. 17. Left: Low-resolution diffraction spots (30–35 Å) from a protein crystal. Right: Diffraction from a split crystal showing multiple overlapping lattices.

Diffraction experiments are based on order inside crystalline samples. The higher the order, i.e., the more accurately the positions of the atoms are defined and the more uniform they are from unit cell to unit cell in the crystal, the higher the hkl values of the diffraction spots that appear. Higher hkl values correspond to diffraction spots with a larger deflection angle from the primary beam, which are therefore located further away from the beam center on the detector surface. The location of the outermost visible spots is measured in Ångstrom and is called the resolution (or resolution limit) of a particular diffraction pattern. It is best to determine the resolution in the corners of the diffraction image, because spot intensities are enhanced along the spindle rotation axis (usually the horizontal) and would yield an overly optimistic estimate of the resolution.
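The link between the position of a spot on the detector and its resolution follows from Bragg's law, d = λ / (2 sin θ), where 2θ is the deflection angle. The sketch below assumes a flat detector perpendicular to the beam (no 2-Θ offset); the function name and the example numbers are illustrative, not taken from any beamline software.

```python
import math

def resolution_at_radius(radius_mm, distance_mm, wavelength_A):
    """Resolution (in Angstrom) of a spot at a given radius from the beam center.

    Assumes a flat detector perpendicular to the primary beam.
    """
    two_theta = math.atan2(radius_mm, distance_mm)   # deflection angle in radians
    return wavelength_A / (2.0 * math.sin(two_theta / 2.0))

# Example: 1 A wavelength, 250 mm distance, 315 mm wide square detector.
edge = resolution_at_radius(157.5, 250.0, 1.0)
corner = resolution_at_radius(157.5 * math.sqrt(2.0), 250.0, 1.0)
print(f"edge: {edge:.2f} A, corner: {corner:.2f} A")
```

Spots further from the beam center correspond to smaller d values, i.e., higher resolution, which is why the corners of a square detector reach further than its edges.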

Protein crystals are not perfect. They can be described as a collection of small perfect crystals (also called mosaic blocks) with each having a slightly different orientation angle. The width of the distribution of these angles defines the mosaic spread, which is measured in degrees. Signs of high mosaicity are a large number of diffraction spots (often suppressing a clear view of the lunes) and tangentially streaky spots (see Fig. 18). At TSD, mosaicity is classified into low, medium, and high categories.

Fig. 18. Examples of diffraction from highly mosaic crystals. Data processing yielded 1.7° mosaic spread at 2.4 Å resolution (left) and 1.1° mosaic spread at 1.9 Å resolution (right).

Diffraction strength describes the overall intensity of diffraction spots in the image. Every image consists of a number of pixels corresponding to the resolution of the CCD detector (see Table 4). Every diffraction spot is made up of multiple pixels and the sum of those pixel intensities yields the spot intensity. Image viewer programs (e.g., ADXV (55)) show the average intensity of all pixels, the maximum intensity and the number of overloads. By magnifying a section of the image, individual pixels can also be examined. Based on all this information and the overall visual appearance of the diffraction pattern, TSD categorizes diffraction strength into weak, medium, and strong.

Spot shape can be evaluated visually; it helps to magnify a region of the diffraction image for this purpose. At TSD, spot shapes are classified into clean, streaked, and split categories. Streaky spots can be an indicator of high mosaicity (see Fig. 18), whereas split spots signal the presence of multiple lattices (see Fig. 17).

Anisotropic diffraction can manifest itself in a single diffraction image, where the spots are roughly within an ellipse and not a circle. It can also mean that the diffraction quality is very different between the images taken 90° apart. All the quality indicators should be noted conservatively in these cases (e.g., a few high-resolution spots in one orientation are not going to result in a complete dataset at that resolution).

Diffraction from ice crystals can be recognized by the strong diffraction rings at 3.9, 3.7, 3.5, and 2.3 Å resolutions (and other, weaker rings) as shown in Fig. 19. Ice rings can originate both from the inside and the surface of the protein crystal. Internal ice is a result of inadequate cryoprotection and can sometimes be recognized by a cryoloop that is not optically transparent. The corresponding ice rings are often broad without any sharp features. Ice buildup on the surface of the cryoloop occurs during crystal handling and is usually clearly visible. The corresponding ice rings are narrow and sharp.

Fig. 19. Examples of ice crystal diffraction rings. The pattern on the left is typical of ice located on the surface of the cryoloop.

An increasing number of beamlines are equipped with automated crystal analysis software, which can help determine the parameters described above. A popular package is Web-Ice, which was developed at the Stanford Synchrotron Radiation Laboratory (SSRL) and includes the LABELIT (56) and DISTL (57) software packages.

3.3.3 Crystal Screening at Takeda San Diego

As described above, all crystallography-related data are stored in a database at TSD. This has enabled a closer inspection of the screening results for a large number of crystals as well as trends within a set of crystals. Of a selection of approximately 17,000 screened crystals, 12% did not diffract and only 2% were salt crystals. These low numbers can be explained by the high percentage of protein/small-molecule cocomplex structures for which the crystallization conditions have been previously optimized. Only 10% of the screened crystals showed diffraction from ice, indicating that crystal freezing and handling are well under control.

To find out how well the manual screening evaluation really describes diffraction quality, the screening results and the processing statistics were compared for those crystals for which a dataset was collected. The average mosaicity (as measured by the averaged angular width of the diffraction spots from a crystal) of all crystals processed (approximately 2,500 of the 17,000) is 0.59°. For those which were defined as low, medium or high mosaicity during screening, data processing yielded measured average mosaicities of 0.49°, 0.81° and 1.05°, respectively. Similar, well correlated behavior was found for the diffraction strength measured as the overall spot intensity-to-background (I/σ) ratio (I/σ being 15, 18 and 22 for weak, medium and strong diffraction respectively).

Comparing the true data resolution with the values from screening reveals a clearly linear dependence (see Fig. 20). It also underscores the notion that during data collection a somewhat better resolution can be achieved than that based on the visible spots on a snapshot (see the equation in Fig. 20). This is mainly due to averaging of symmetry related diffraction spots, which improves the signal-to-noise ratio. The simple evaluation procedure at TSD clearly gives a good assessment of the diffraction quality of the crystals under investigation.

Fig. 20. Comparison between screening and processing resolution values.

3.3.4 Crystal Selection for Data Collection

Before data collection can begin, the best crystals have to be selected from the screening process. High resolution, low mosaicity, strong diffraction, clean spot shape and the lack of ice rings are the prized crystal properties. These conditions are not always all met and some tradeoffs have to be made, e.g., a somewhat higher mosaicity can be accepted (up to approximately 1°) in return for higher resolution. Streaky or split diffraction images should be indexed (if the crystal is otherwise promising), and if a solution is easily obtained, the crystal will most likely yield a useful dataset. Ice contamination should not prevent data collection, unless it is very severe or the resolution limit is close to the ice ring at around 3.5 Å (sometimes ice can be “washed off”). Looking at the impact of ice rings on a number of datasets at TSD, it was found that the overall completeness was reduced only by 1% and the impact on overall Rsym was negligible.
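One way to make such a selection reproducible is to record the screening categories for each crystal and sort on them. The sketch below is purely illustrative: the record layout, the field names and the ordering of the criteria (resolution first, then mosaicity, strength, spot shape and ice) are assumptions, not the procedure used at TSD.

```python
from dataclasses import dataclass

# Lower rank is better; the categories follow the screening classes in the text.
MOSAICITY_RANK = {"low": 0, "medium": 1, "high": 2}
STRENGTH_RANK = {"strong": 0, "medium": 1, "weak": 2}
SHAPE_RANK = {"clean": 0, "streaked": 1, "split": 2}

@dataclass
class ScreeningResult:
    crystal_id: str
    resolution_A: float   # resolution limit from the snapshots (smaller is better)
    mosaicity: str        # "low", "medium" or "high"
    strength: str         # "weak", "medium" or "strong"
    spot_shape: str       # "clean", "streaked" or "split"
    ice_rings: bool

def sort_key(result):
    """Hypothetical priority: resolution first, then the other indicators."""
    return (result.resolution_A,
            MOSAICITY_RANK[result.mosaicity],
            STRENGTH_RANK[result.strength],
            SHAPE_RANK[result.spot_shape],
            result.ice_rings)

results = [
    ScreeningResult("A", 2.4, "low", "strong", "clean", False),
    ScreeningResult("B", 2.1, "medium", "medium", "clean", True),
    ScreeningResult("C", 2.4, "high", "weak", "split", False),
]
ranked = sorted(results, key=sort_key)
print([r.crystal_id for r in ranked])
```

Putting resolution first mirrors the tradeoff described above, where a somewhat higher mosaicity is accepted in return for higher resolution; a real workflow would still rely on visual inspection of the images.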

If time permits, and the desired structure is important enough, data should be collected on as many crystals as possible, even on crystals that are not of the highest quality. At TSD, useful structures for SBDD were obtained from resolutions as poor as 3.5 Å, mosaicity as high as 2.4° and overall completeness as low as 80%.

As a last resort, annealing of a crystal can be tried to improve diffraction quality. During this process, the crystal is warmed up to room temperature (by blocking the cryostream for a few seconds) and cooled down again. In some cases, the lattice will rearrange to a system with superior order and diffraction resolution improves. Occasionally this method works but usually it completely destroys the crystal!

3.4 Data Collection Strategies

There are a number of parameters that have to be set up for each data collection run on the chosen samples; how to optimize those parameters is the subject of this section. The screening results and possible prior knowledge of the project should provide most of the necessary information for this purpose. The goal of data collection is to accurately measure a complete set of diffraction intensities (58). Only monochromatic data collection is discussed here for the purposes of solving structures by the molecular replacement method and to obtain cocomplex structures for SBDD.

With the worldwide proliferation of synchrotron light sources in recent years, availability of beamtime is no longer a limiting factor. For this reason, even in a high-throughput setting, the focus can be shifted from quantity to quality. The best possible datasets should be collected for all samples. This trend is accentuated by the significant performance increase of beamlines, leading to very high intensities and small beam sizes. Interestingly, these developments put another kind of time limit on the data collection process, namely the “survival time” of the crystals before radiation damage occurs (59). Radiation decay, or damage, is the process whereby the energy absorbed by the sample from the impinging X-rays gradually destroys the order in the crystal and thus reduces its diffracting power.

3.4.1 Crystal Symmetry and Unit Cell

Crystal symmetry and unit cell dimensions are crucial pieces of information when determining the correct data collection parameters. For crystals of an ongoing SBDD project, symmetry and unit cell will already be known, and if indexing of the screening images confirms this, no more steps are necessary. The situation is similar in the case of a new project for which a structure is already published and indexing confirms the published parameters. The crystal information is not known for de novo structures or if the crystal properties change for complexes with different compounds. In these cases finding the proper symmetry is a multi-step process that is described in detail in Section 3.5.5.

3.4.2 Data Collection Parameters

Typical parameters that can be selected by the user for data collection are as follows: X-ray energy, detector-to-sample distance, detector 2-Θ angle, exposure time, oscillation angle, starting rotation angle, rotation angle range, beam size, beam divergence, and detector binning. How to find the best possible settings to obtain an accurate and complete dataset for a given sample is described in detail in the following sections. Optimization is done by collecting a few diffraction images (snapshots) before starting the data collection run and adjusting the parameters.

3.4.2.1 X-Ray Energy

On monochromatic beamlines the photon energy cannot be changed, but its value should be noted for later reference. On tunable beamlines multiple considerations can be taken into account. The beam intensity usually varies with photon energy, and a value corresponding to the highest flux can be selected. Most CCD area detectors are optimized for maximum sensitivity at 1 Å wavelength. At lower energies there is an increase of X-ray absorption by the sample (resulting in increased radiation damage) and also by the air surrounding the crystal (leading to a higher scattered background and attenuation of the diffracted X-rays). At higher energies radiation damage and air scatter will be lower. In general an energy close to 12.4 keV (1 Å) is a good choice for most practical purposes. When deviating significantly from 1 Å, all the above listed considerations should be taken into account to find the best compromise. One should also make sure that the beamline is optimally tuned for the energy to be used.
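The correspondence between photon energy and wavelength quoted above (12.4 keV for 1 Å) follows from E = hc/λ, with hc ≈ 12.398 keV·Å. A small helper, with illustrative function names, makes the conversion explicit:

```python
HC_KEV_ANGSTROM = 12.39842  # Planck constant times speed of light, in keV * Angstrom

def wavelength_to_energy(wavelength_A):
    """Photon energy in keV for a wavelength given in Angstrom."""
    return HC_KEV_ANGSTROM / wavelength_A

def energy_to_wavelength(energy_keV):
    """Wavelength in Angstrom for a photon energy given in keV."""
    return HC_KEV_ANGSTROM / energy_keV

print(f"{wavelength_to_energy(1.0):.2f} keV")   # energy at 1 A
print(f"{energy_to_wavelength(12.4):.3f} A")    # wavelength at 12.4 keV
```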

3.4.2.2 Detector-to-Sample Distance, 2-Θ Angle, and Beam Divergence

The two main factors influencing these settings are the expected resolution limit and the spacing of the diffraction spots. The former is known from screening, while the latter is influenced by the size of the unit cell, crystal mosaicity, and the presence of multiple lattices (caused by twinning, split or multiple crystals).

Most beamline control software contains a resolution calculator, which, taking into account the wavelength and the detector size, shows the resolution limits on the edges and the corners of the detector surface for variable detector-to-sample distances (D) and detector 2-Θ angles. The following steps describe how to find the optimal settings.

  1. In the resolution calculator, set 2-Θ = 0 and find D for the expected resolution (a few tenths of an Ångstrom better than the screening resolution) at the edge of the detector surface. Take two snapshots 90° apart (using the anticipated values for all the other settings), and evaluate the images. If the detector surface is filled evenly with diffraction spots to the edge, and the spots do not overlap, data collection can begin (see Fig. 16, right).

  2. If the spots do not overlap and the detector surface is filled beyond the edge, possibly into the corners or beyond, D has to be decreased to capture the high-resolution portion of the data. On the other hand, if the detector surface is not filled with spots to the edge, D needs to be increased (see Fig. 18). However, the distance should not be increased too much to fully illuminate the detector surface if the spots are already well separated. A larger than necessary D results in additional absorption of the X-rays by air (air absorbs 6% of 1 Å radiation at 200 mm and 14% at 500 mm) and an increase of the spot size due to beam divergence.

  3. If the detector surface is filled to the edge, but the spots overlap, either the beam divergence has to be decreased or 2-Θ and D have to be increased, or both. Reducing the divergence also decreases the beam intensity, which puts a limit on the minimum usable divergence. To further separate overlapping spots the detector distance has to be increased. To avoid losing high-resolution data, the 2-Θ angle also has to be adjusted at the same time. Additionally, the oscillation angle used during data collection can be decreased to reduce spot overlap (see Section “Starting Spindle Axis Rotation Angle, Rotation Angle Range, and Oscillation Angle” below).
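The calculation behind such a resolution calculator can be sketched as follows. The sketch assumes a flat square detector, takes the detector half-width as input, and models a 2-Θ offset by simply adding it to the largest in-plane scattering angle at the far edge; these are simplifying assumptions, and the function names are illustrative rather than those of any beamline control program.

```python
import math

def edge_resolution(distance_mm, wavelength_A, half_width_mm, two_theta_deg=0.0):
    """Resolution (Angstrom) at the far detector edge in the 2-theta swing plane."""
    max_angle = math.radians(two_theta_deg) + math.atan2(half_width_mm, distance_mm)
    return wavelength_A / (2.0 * math.sin(max_angle / 2.0))

def distance_for_resolution(target_A, wavelength_A, half_width_mm, two_theta_deg=0.0):
    """Detector-to-sample distance (mm) that places target_A at the far edge."""
    max_angle = 2.0 * math.asin(wavelength_A / (2.0 * target_A))
    in_plane = max_angle - math.radians(two_theta_deg)
    if in_plane <= 0.0:
        raise ValueError("2-theta offset alone already exceeds the required angle")
    return half_width_mm / math.tan(in_plane)

# Step 1: 2-theta = 0, 1 A wavelength, 315 mm detector, 2.0 A expected at the edge.
D = distance_for_resolution(2.0, 1.0, 157.5)
print(f"D = {D:.0f} mm")

# Step 3: a longer distance combined with a 5 degree offset.
no_offset = edge_resolution(410.0, 0.98, 157.5)
with_offset = edge_resolution(410.0, 0.98, 157.5, two_theta_deg=5.0)
print(f"{no_offset:.2f} A without offset, {with_offset:.2f} A with offset")
```

The second example shows why step 3 adjusts D and 2-Θ together: the longer distance separates the spots, while the offset recovers part of the high-resolution range on one side of the detector.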

In practice a combination of all those settings will provide the right balance. At the ALS beamline 5.0.3, which is equipped with a Q315r detector and uses λ = 0.98 Å, the following settings are used for a crystal with a longest unit cell axis of ∼400 Å: depending on the resolution limit, a detector-to-sample distance in the range of 390–430 mm, a beam divergence of 1.5–1.8 mrad (instead of the standard 2.5 mrad, losing 30–40% flux), and a 2-Θ angle of 0–5°.
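Two of the competing effects of the detector distance can be estimated with simple formulas. At small angles, neighboring spots along a cell axis of length a are separated on the detector by roughly D·λ/a. Air absorption can be modeled as an exponential decay whose coefficient is calibrated here from the 6% loss at 200 mm quoted above. Both are rough approximations and the function names are illustrative.

```python
import math

def spot_separation_mm(distance_mm, wavelength_A, cell_axis_A):
    """Approximate spacing of neighboring low-angle spots on the detector."""
    return distance_mm * wavelength_A / cell_axis_A

# Attenuation coefficient of air for 1 A X-rays, derived from 6% loss at 200 mm.
MU_AIR_PER_MM = -math.log(1.0 - 0.06) / 200.0

def air_absorbed_fraction(distance_mm):
    """Fraction of 1 A radiation absorbed by air over the given path length."""
    return 1.0 - math.exp(-MU_AIR_PER_MM * distance_mm)

print(f"{spot_separation_mm(410.0, 0.98, 400.0):.2f} mm between spots")
print(f"{100 * air_absorbed_fraction(500.0):.0f}% absorbed at 500 mm")
```

The exponential model reproduces the second quoted value (about 14% at 500 mm), and the separation of about 1 mm for a 400 Å axis illustrates why long detector distances are needed for large unit cells.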

3.4.2.3 Exposure Time

Exposure time is the length of time (measured in seconds) for which the shutter is open during the collection of a single diffraction image (during which time the goniometer spindle axis rotates by the chosen oscillation angle). The total data collection time for a dataset is the number of images multiplied by the sum of the exposure time, the detector readout time, and the beamline overhead per image. Detector readout time is around 1 s (see Table 4) while beamline overhead is typically in the range of 1–3 s. This means that there is a lower limit on the total data collection time given by the fixed readout and overhead durations. The opening and closing of the shutter is precisely synchronized with the rotation of the spindle axis during the collection of each image. If the exposure time is very short (<∼0.5 s), inaccuracies in the synchronization might negatively influence the data quality. For these reasons, even on very intense beamlines the minimum exposure time should be kept in the 0.5–1 s range and the beam should be attenuated if necessary.
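The total data collection time described above can be estimated with a few lines of code. The example values (180° range, 1° oscillation, 1 s exposure, 1 s readout, 2 s overhead) are illustrative choices within the ranges mentioned in the text.

```python
import math

def total_collection_time_s(rotation_range_deg, oscillation_deg,
                            exposure_s, readout_s, overhead_s):
    """Total time = number of images x (exposure + readout + overhead)."""
    n_images = math.ceil(rotation_range_deg / oscillation_deg)
    return n_images * (exposure_s + readout_s + overhead_s)

total = total_collection_time_s(180.0, 1.0, exposure_s=1.0,
                                readout_s=1.0, overhead_s=2.0)
print(f"{total:.0f} s ({total / 60:.0f} min)")
```

In this example only a quarter of the total time is spent exposing the crystal, which illustrates the lower limit set by the readout and overhead durations.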

Although longer exposure times can be desirable to measure weak reflections accurately, there is an upper limit to exposure time dictated by detector overloads and radiation damage. The dynamic range of the X-ray detector is limited such that there is a maximum intensity that can be stored in every pixel (see Table 4). Once this maximum is reached and passed, the overloaded pixel(s) will not contain useful information about the corresponding diffraction spot(s) and those spots are lost for the structure solution process. A few overloaded pixels (which will most likely be in one or two diffraction spots) are permissible because they will influence low-resolution completeness only slightly and it helps ensure that the full dynamic range of the detector is being used. Overloads are mostly an issue with strongly diffracting crystals.

A rough estimate for the length of data collection before radiation damage occurs on different beamlines can be found in (60). One should keep in mind, however, that radiation damage depends on many factors, such as crystal size, heavy atom content, and X-ray energy (59). In an SBDD environment, data are collected on many crystals of the same project (same protein with different compounds), and therefore the properties of these samples will be well known (space group, sensitivity to radiation, typical mosaicity, and resolution). If data are collected under similar conditions on the same or similar beamline (and the proper data collection strategy has been determined), the exposure time can be selected to be near the radiation damage limit. If the sample is less well understood, e.g., the crystal symmetry is not certain yet and the collection of a full 180° dataset is needed, the exposure time should be selected accordingly with an appropriate safety margin. Signs of radiation damage are discussed in the next section under data processing (Section 3.5.4).

On some beamlines data can be collected in “dose mode” where the exposure time of each frame is adjusted to compensate for the decay of the electron current in the storage ring. This is a useful feature for rapidly decaying ring currents, but it is unnecessary for operations in top-off mode.

3.4.2.4 Starting Spindle Axis Rotation Angle, Rotation Angle Range, and Oscillation Angle

To obtain a complete dataset, the starting rotation angle and the rotation range have to be set up correctly. They are determined by the crystal symmetry (the point group) and the crystal’s orientation with respect to the X-ray beam. While the crystal symmetry might be already known (Section 3.4.1), the crystal’s exact orientation can be determined by indexing a snapshot (Section 3.5.2). Based on this information, an accurate data collection strategy can be calculated using either dedicated software, such as BEST (61) or Strategy (62) or the built-in predictions in data processing packages (Section 3.5). Data collection should start a few degrees before the recommended value and should always be set up for at least the full 180° rotation range. When the crystal symmetry is ambiguous, collection of the full 180° ensures completeness in any point group. With known symmetry, the proper starting angle will make sure that completeness is reached as soon as possible and if the crystal has not suffered radiation damage (and time permits), more data can be collected to increase the precision of the intensity measurements through the subsequent increased redundancy. There are also other reasons why more data might be needed to reach completeness; some diffraction spots might be rejected during processing due to overlaps with other spots or ice rings; some low-resolution spots might be overloaded in some orientations but not in another; and if a 2-Θ offset is used, high-resolution spots are measured only on one side of the detector surface at a time, and thus, more rotation range is needed.

Special attention needs to be paid to crystals with an anisotropic diffraction pattern, i.e., when diffraction is better in one orientation and worse in another. This can be the case with plate-shaped crystals, for example. In the higher symmetry point groups, there are multiple equivalent choices for starting angles (e.g., every 90°) and one should be chosen where diffraction is best. In this way, the higher quality data are collected first before radiation damage sets in.

As with several other parameters, the choice of the oscillation angle depends on a number of often competing factors. A larger value will result in more diffraction spots on a single image since more lattice points will fulfill the Bragg condition. This might sound enticing, since fewer frames would be needed and thus the data collection time could be shortened (by shortening the total detector readout time). However, higher mosaicity, resolution limit, and beam divergence and a larger unit cell dimension (the one along the beam direction at any given time) will also result in more diffraction spots and thus will limit the maximum allowed oscillation angle for which the spots do not overlap (58). Besides avoiding overlaps, a smaller oscillation angle will have the benefit of an improved signal-to-noise ratio (fine-slicing is not discussed here). Some strategy programs will also recommend what oscillation angle to use and some even allow the simulation of a dataset with different settings to see if overlaps will be a problem. In practice, an oscillation angle in the 0.5–1.0° range works well for most projects.
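A commonly quoted rule of thumb for the largest oscillation angle that avoids overlaps is Δφ ≈ (180/π)·(d/a) − η, where d is the resolution limit, a is the unit cell dimension along the beam and η is the mosaicity in degrees. The sketch below implements this approximation; it neglects beam divergence and is meant only as a starting estimate, not as a replacement for a strategy program.

```python
import math

def max_oscillation_deg(resolution_A, cell_along_beam_A, mosaicity_deg):
    """Rule-of-thumb maximum oscillation angle without spot overlap (degrees)."""
    return math.degrees(resolution_A / cell_along_beam_A) - mosaicity_deg

# 2.0 A data, 100 A cell axis along the beam, 0.3 degree mosaicity.
print(f"{max_oscillation_deg(2.0, 100.0, 0.3):.2f} degrees")
```

The estimate shrinks with higher resolution (smaller d), larger unit cells and higher mosaicity, in line with the competing factors listed above.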

3.4.2.5 Beam Size and Detector Binning

In general, the beam size should be set to match the size of the crystal (not the loop!). This will help reduce the scattering background from air and the parts of the loop around the crystal. One has to make sure, however, that the beam covers the crystal in all orientations of the spindle axis. If the goal is to collect multiple datasets on the same crystal (due to radiation damage), a beam size smaller than the crystal should be chosen so that different parts of the same sample can be exposed one after another (see Fig. 21) (30). A properly adjusted beam size can also be used to select a single crystal for data collection when multiple crystals are present in the same loop.

Fig. 21. Three separate data sets were collected on this needle-shaped crystal using a 20 μm X-ray beam at the APS ID-23B beamline. Discolorations of the mother liquor surrounding the crystal show where the X-rays hit the loop (marked by arrows).

Most modern beamlines are equipped with large CCD area detectors, such as the ones listed in Table 4. Owing to the large number and small size of the pixels, standard data collection can be carried out in binned mode, where intensity from four neighboring pixels (in a 2 × 2 matrix) is combined. Binned mode can speed up detector readout and results in smaller file sizes (e.g., 18 vs. 72 MB for each image of the Q315r).
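The effect of 2 × 2 binning can be illustrated on a small array. The sketch assumes that the four pixel values are summed (the detector may combine them differently) and that the image has even dimensions.

```python
def bin_2x2(image):
    """Combine each 2 x 2 block of pixels into one value by summing.

    `image` is a list of equally long rows with even dimensions.
    Summation is an assumption made for illustration.
    """
    rows, cols = len(image), len(image[0])
    return [[image[r][c] + image[r][c + 1] + image[r + 1][c] + image[r + 1][c + 1]
             for c in range(0, cols, 2)]
            for r in range(0, rows, 2)]

unbinned = [[1, 2, 3, 4],
            [5, 6, 7, 8],
            [9, 10, 11, 12],
            [13, 14, 15, 16]]
binned = bin_2x2(unbinned)
print(binned)
```

The binned image has a quarter of the pixels, consistent with the fourfold reduction in file size quoted above.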

To reduce the effects of light-independent noise generated in the detector, so-called dark images have to be collected regularly. A dark image is taken first, with the exact same settings (exposure time) as the diffraction images, but with the shutter closed. The dark image is then subtracted from the subsequent diffraction images by the detector control software. Dark images are handled automatically by the beamline software for the most part (e.g., a new dark is collected before every data collection run), but they can also be requested by the user if needed, and it is useful to be aware of this background process.

3.5 Data Processing

During data processing the intensities of all the measured diffraction spots in a data set are accurately determined, combined according to the crystal symmetry and sorted into a final list of hkl indices. This process is also called data reduction and involves multiple steps: finding spots, indexing, refinement, integration, and scaling. Some common software packages for the processing of macromolecular diffraction data are HKL2000 (63), MOSFLM (64) and d*Trek (65). Comments throughout this section are HKL2000 specific, since this is the package that has been used at TSD. There is an extensive amount of information available on data processing (manuals, Web sites), thus only a brief overview is given here with special attention to some specific problems.

Data processing should start immediately after data collection has been initiated. This is very important, because problems with the data collection can be caught early and corrected before it is too late. It also enables continuous monitoring of completeness and other data quality indicators that can point to radiation damage or signal if data collection can be stopped.

3.5.1 Finding Spots

To start data processing, the dataset has to be loaded into the software. Every diffraction image contains information related to data collection (e.g., wavelength, oscillation angle, detector distance) that is automatically read into the program. It is a good idea to briefly double-check that those values are correct and that all the relevant information has been loaded. If not, corrections or additions should be made before proceeding.

Once the first frame is displayed, performing a peak search will find the most intense diffraction spots to be used for autoindexing. On a clean and strong diffraction pattern no more steps are necessary and indexing can begin. Problematic diffraction patterns are discussed below in Section 3.5.3.

3.5.2 Indexing and Refinement

Determination of the possible crystal lattices that correspond to the pattern of diffraction spots found during peak search is called indexing. After indexing is complete, the program provides a list of possible Bravais lattices and corresponding unit cell parameters that match the data (marked green in HKL2000). If the crystal symmetry and unit cell are already known and the right Bravais lattice is shown as an indexing solution, it should be selected and one can move on to refinement.

During refinement a number of crystal and detector parameters are optimized using a least-squares fit to achieve the best possible match between the measured positions of the diffraction spots and their predicted positions based on the crystal symmetry and unit cell parameters. Reliability and accuracy are greatly increased if the known input parameters are provided accurately (e.g., sample-to-detector distance). Select a high-resolution limit beyond the visible spots, refine all the parameters (check “Fit All” in HKL2000), and also optimize mosaicity using multiple images simultaneously (enter three to five in the 3D window in HKL2000).

To confirm that the right lattice has been picked and everything is in order, the refinement results should be inspected. The χ² values (indicating the quality of the fit) should be close to 1 and preferably below 3–4. Unit cell dimensions, sample-to-detector distance and beam position should all be close to their expected values. Visual inspection of the diffraction image should confirm that the predicted spots lie accurately on the measured peaks. If everything is satisfactory, integration can begin. The settings used for refinement (resolution limit, 3D window, etc.) will also be used during integration.

3.5.3 Indexing Problems

Two examples of indexing failures are discussed here. In both situations, it is assumed that Bravais lattice and unit cell parameters are known from previous work.

In the first scenario, the diffraction pattern is clean, there are no obvious signs of a second lattice and the mosaicity is not exceptionally high (below ∼1°). Indexing fails, however, with one of the following symptoms: indexing finds a different symmetry and/or unit cell parameters than expected; only the triclinic Bravais lattice matches the data; the right symmetry and unit cell parameters are found, but during refinement the χ² values are high and/or some other parameters are far off expected values. In almost all cases the problem is a wrong initial setting for the X and Y positions of the primary X-ray beam. This is easily corrected. Often a visual inspection of a diffraction image can provide good values for the beam position. Occasionally some other setting is off by a large margin: a good candidate to check is the detector 2-Θ offset. Sometimes this is not read into the software and one has to set it manually. A lattice belonging to a large unit cell crystal, i.e., when the spots are close together, is most sensitive to beam position settings. For such samples sometimes the problem does not appear until a later stage of processing: indexing, refinement and integration can all work well but scaling fails (scaling is discussed in the next section). This is due to the fact that the calculated beam center coincides with a low-resolution axial diffraction spot, causing a small indexing error along one axis.

In case of a “messy” diffraction pattern indexing can be difficult and it is an iterative process between finding the right spots, indexing and refinement. If the protein diffraction is contaminated by ice rings, cutting back the high-resolution limit to ∼4 Å will eliminate their influence. If the crystal is split and multiple lattices are clearly present (and autoindexing cannot pick one out) manual addition and deletion of spots after the peak search can help. For many difficult cases changing the following settings can help: high-resolution and/or low-resolution limit can be varied; diffraction images from different spindle rotation angles can be tried (ideally a crystallographic zone is visible in some orientation, but in highly mosaic patterns those are hard to recognize); a different number of images can be used at the same time; the size and/or number of peaks during spot finding can be varied (more/less peaks, peak size up/down in HKL2000).

If severe problems with indexing (or integration/scaling) persist, there is either something wrong with the experimental setup or the crystal is not usable. It is a good idea to start a synchrotron run with good crystals, to confirm that there are no problems with the beamline. Processing of these crystals can also reveal how well the beam center and other parameters are set up and those values can be noted and used later.

3.5.4 Integration and Scaling

During integration the diffraction spot intensities are determined for every frame of the dataset. Before starting the integration the anticipated number of frames should be entered if the processing is done during data collection (in HKL2000 on the “Data” tab select “Edit Set(s)”). If indexing was not straightforward, settings for the integration should take that into account, e.g., the integration spot size should be smaller than usual if nearby spots are to be distinguished (e.g., for split crystals).

To take into account small changes in the experiment and the crystal during data collection, crystal and detector parameters are refined for each image. In HKL2000, the refined parameters are shown in a graph and one should keep an eye on their evolution. Small gradual changes are fine, but a sudden change in any parameter could indicate a problem. For example, if the crystal orientation changes suddenly (Rot X, Rot Y and Rot Z in HKL2000), it could mean that the crystal slipped on the goniometer. A gradual but steady increase of mosaicity and/or significant changes in the unit cell parameters can indicate the onset of radiation damage.
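The distinction between a sudden jump and a gradual drift can be sketched in a few lines. The example below is a hypothetical helper, not part of HKL2000; it assumes the per-frame refined values have been exported to a plain list, and the tolerance `max_step` is a user-chosen assumption.

```python
def sudden_changes(values, max_step):
    """Return 0-based frame indices where a parameter jumps by more than
    max_step relative to the previous frame."""
    flagged = []
    for frame in range(1, len(values)):
        if abs(values[frame] - values[frame - 1]) > max_step:
            flagged.append(frame)
    return flagged


def net_drift(values):
    """Total change of a parameter between the first and the last frame."""
    return values[-1] - values[0]


if __name__ == "__main__":
    # Made-up values: a crystal orientation angle that jumps at frame 4,
    # and a mosaicity that creeps up steadily without any single large step.
    rot_x = [0.10, 0.11, 0.11, 0.12, 0.45, 0.46]
    mosaicity = [0.30, 0.32, 0.34, 0.36, 0.38, 0.40]
    print("Rot X jumps at frames:", sudden_changes(rot_x, max_step=0.1))
    print("Mosaicity jumps at frames:", sudden_changes(mosaicity, max_step=0.1))
    print("Mosaicity net drift:", round(net_drift(mosaicity), 2))
```

A jump flagged in the orientation angles would be consistent with a slipped crystal, whereas a large net drift without flagged frames matches the gradual behavior described for radiation damage.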

In the final steps, the intensities of the images are placed on the same scale (scaling) and the symmetry-related intensities are merged together (merging) to generate the final output file of hkl indices and the corresponding intensities and their uncertainties (the .sca file in HKL2000). After scaling and merging the overall data quality can be evaluated and the resolution limit determined. This is an iterative process. Resolution limit and other settings (e.g., error model, scaling and B restraints, exclusion of frames) are changed until the final results are satisfactory. The space group is also set during scaling.

The most important data quality indicators are Rmerge, completeness, redundancy and the signal-to-noise ratio I/σ(I). These parameters and other relevant information can be found in the scaling log file. Overall completeness > 95%, redundancy higher than fourfold, Rmerge < 0.5 and I/σ(I) > 2 are good initial guidelines for stopping data collection or determining the resolution limit. However, a good balance of all these values should be obtained in each specific case. Crystals with low symmetry will have a redundancy lower than four. An I/σ(I) lower than two can be allowed if the other parameters are still good (since there is actual signal as long as I/σ(I) > 1). Similarly, a high-resolution shell with lower completeness will still contain some real information (in practice one can go as low as 50–60% if I/σ(I) is good).
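How these four indicators arise from the raw observations can be illustrated with a minimal sketch. It assumes that every observation has already been mapped to the asymmetric unit (so symmetry mates share one hkl key), that all sigmas are positive, and that the number of unique reflections possible in the resolution range is known. It uses unweighted means for simplicity; actual scaling programs apply scale factors, error models and outlier rejection, so their reported values will differ.

```python
from collections import defaultdict
from math import sqrt


def merging_statistics(observations, n_unique_possible):
    """Compute simple merging statistics.

    observations: iterable of (hkl, intensity, sigma) tuples.
    n_unique_possible: unique reflections expected in the resolution range.
    """
    groups = defaultdict(list)
    for hkl, intensity, sigma in observations:
        groups[hkl].append((intensity, sigma))

    deviation_sum = 0.0
    intensity_sum = 0.0
    signal_to_noise = []
    n_observations = 0
    for measurements in groups.values():
        n = len(measurements)
        n_observations += n
        mean_intensity = sum(i for i, _ in measurements) / n
        # Error of an unweighted mean of n independent measurements.
        merged_sigma = sqrt(sum(s * s for _, s in measurements)) / n
        signal_to_noise.append(mean_intensity / merged_sigma)
        if n > 1:  # single observations say nothing about agreement
            deviation_sum += sum(abs(i - mean_intensity) for i, _ in measurements)
            intensity_sum += sum(i for i, _ in measurements)

    return {
        "r_merge": deviation_sum / intensity_sum if intensity_sum else float("nan"),
        "completeness": len(groups) / n_unique_possible,
        "redundancy": n_observations / len(groups),
        "mean_i_over_sigma": sum(signal_to_noise) / len(signal_to_noise),
    }


if __name__ == "__main__":
    # Made-up observations: two reflections measured twice, one measured once.
    data = [
        ((1, 0, 0), 100.0, 10.0), ((1, 0, 0), 110.0, 10.0),
        ((0, 1, 0), 50.0, 5.0), ((0, 1, 0), 50.0, 5.0),
        ((0, 0, 1), 20.0, 4.0),
    ]
    for name, value in merging_statistics(data, n_unique_possible=4).items():
        print(f"{name}: {value:.3f}")
```

In practice the same quantities are evaluated per resolution shell, which is what allows the resolution limit to be chosen.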

A detailed discussion of Rmerge is beyond the scope of this review. Owing to the definition of this parameter, its value increases with increasing data redundancy and thus suggests worsening data quality although the precision of the measurement is actually improving. For this reason improved R-factor schemes have been proposed (66, 67) but are still not in widespread use.
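For orientation, the conventional definition of Rmerge and one redundancy-independent variant from the literature, commonly called Rmeas, can be written as follows. Here n is the number of observations of the unique reflection hkl, I_i(hkl) is the i-th observation and ⟨I(hkl)⟩ is their mean; the notation is chosen for this illustration and is not taken from the cited references.

```latex
R_{\mathrm{merge}} =
  \frac{\sum_{hkl}\sum_{i=1}^{n} \left| I_i(hkl) - \langle I(hkl) \rangle \right|}
       {\sum_{hkl}\sum_{i=1}^{n} I_i(hkl)},
\qquad
R_{\mathrm{meas}} =
  \frac{\sum_{hkl}\sqrt{\frac{n}{n-1}}\,\sum_{i=1}^{n} \left| I_i(hkl) - \langle I(hkl) \rangle \right|}
       {\sum_{hkl}\sum_{i=1}^{n} I_i(hkl)}
```

The factor √(n/(n−1)) compensates for the fact that deviations from a mean of only a few measurements underestimate the true scatter, which is why the uncorrected value rises as more observations are added.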

3.5.5 Finding the Correct Space Group

The most important rule to follow if the crystal symmetry is unknown or in doubt is to collect 180° of data. Processing should start with the highest possible symmetry based on indexing. If processing fails during integration (the refinement parameters diverge strongly) or scaling (yielding a high Rmerge and many rejections), the chosen Bravais lattice is incorrect and one should try the next highest one. Once integration and scaling have succeeded, the correct point group belonging to the chosen Bravais lattice has to be determined (4 or 422 for tetragonal; 3, 321, 312, 6 or 622 for hexagonal; 23 or 432 for cubic). This is done by simple trial and error: if scaling fails, the point group is incorrect.

Screw axes in the lattice result in the absence of reflections belonging to certain hkl values. These systematic absences can be used to narrow down the list of potentially correct space groups further. If reflections that should be absent have intensities significantly larger than zero, the chosen screw axis is incorrect. Some screw axes are indistinguishable based on the diffraction pattern alone (e.g., P4₁ and P4₃) and can be determined with certainty only during the structure solution process. Even for well-established systems the systematic absences should always be checked to detect a possible change of the space group (e.g., from P2₁2₁2₁ to P2₁2₁2). At the beginning of a data collection run the relevant axial reflections might not have been recorded yet, and one should check again at a later stage.
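A check of axial systematic absences can be sketched as follows. The helper is hypothetical and assumes merged data in a dictionary mapping (h, k, l) to (intensity, sigma). The `period` argument encodes the reflection condition along the axis: 2 for a 2₁ axis (only even indices present), 4 for 4₁ or 4₃, 3 for 3₁ or 3₂, and so on. The I/σ(I) cutoff of 3 is an arbitrary assumption.

```python
def axial_absence_violations(reflections, axis, period, min_i_over_sigma=3.0):
    """List axial reflections that should be absent but are clearly observed.

    reflections: dict mapping (h, k, l) to (intensity, sigma).
    axis: 0, 1 or 2 for the a*, b* or c* axis.
    period: axial reflections are allowed only if their index is a multiple of it.
    """
    violations = []
    for hkl, (intensity, sigma) in reflections.items():
        off_axis = [index for position, index in enumerate(hkl) if position != axis]
        if any(off_axis) or hkl[axis] == 0:
            continue  # not an axial reflection for this axis
        if hkl[axis] % period == 0:
            continue  # allowed by the screw axis
        if sigma > 0 and intensity / sigma > min_i_over_sigma:
            violations.append(hkl)
    return sorted(violations)


if __name__ == "__main__":
    # Made-up merged data: odd h00 reflections are weak, 001 is strong.
    merged = {
        (1, 0, 0): (3.0, 5.0), (2, 0, 0): (800.0, 25.0), (3, 0, 0): (-2.0, 5.0),
        (0, 0, 1): (400.0, 20.0), (0, 0, 2): (900.0, 30.0),
        (1, 1, 0): (500.0, 20.0),
    }
    print("violations along a:", axial_absence_violations(merged, axis=0, period=2))
    print("violations along c:", axial_absence_violations(merged, axis=2, period=2))
```

In this made-up dataset a twofold screw axis along a is consistent with the data, whereas the strong 001 reflection argues against one along c.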

If finding the correct space group turns out to be a challenge, one can try the following. A solvent content calculation for the crystal can rule out certain space group/unit cell indexing solutions and can even point to the crystallization of the wrong protein. If the diffraction pattern is not clean or there are other problems (e.g., incorrect detector distance), and only the triclinic Bravais lattice appears as an indexing solution, it should be chosen and refinement should be performed (with mosaicity fixed). Sometimes refinement will improve the parameters enough to allow the correct higher symmetry Bravais lattice to appear as a solution.
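The solvent content calculation mentioned above is commonly done via the Matthews coefficient, the unit cell volume per unit of protein mass. The sketch below uses the conventional approximation solvent fraction ≈ 1 − 1.23/V_M, which assumes a typical protein density; the function names and the example cell are illustrative assumptions.

```python
from math import cos, radians, sqrt


def unit_cell_volume(a, b, c, alpha, beta, gamma):
    """Volume in cubic angstroms of a general (triclinic) cell; angles in degrees."""
    ca, cb, cg = (cos(radians(angle)) for angle in (alpha, beta, gamma))
    return a * b * c * sqrt(1.0 - ca * ca - cb * cb - cg * cg + 2.0 * ca * cb * cg)


def solvent_fraction(volume, n_symmetry_operators, copies_per_asu, molecular_weight_da):
    """Return the Matthews coefficient (A^3/Da) and the estimated solvent fraction."""
    matthews = volume / (n_symmetry_operators * copies_per_asu * molecular_weight_da)
    # 1.23 is the conventional constant for a typical protein density.
    return matthews, 1.0 - 1.23 / matthews


if __name__ == "__main__":
    # Hypothetical case: 50 x 60 x 70 A orthorhombic cell, four symmetry
    # operators (as in P212121), 25 kDa protein.
    volume = unit_cell_volume(50.0, 60.0, 70.0, 90.0, 90.0, 90.0)
    for copies in (1, 2):
        matthews, solvent = solvent_fraction(volume, 4, copies, 25000.0)
        print(f"{copies} per ASU: V_M = {matthews:.2f} A^3/Da, solvent = {solvent:.0%}")
```

In this example one molecule per asymmetric unit gives a plausible solvent content of roughly 40%, while two molecules give a negative value, so that combination can be ruled out. If no number of copies of the expected protein yields a reasonable value, the indexing solution or the identity of the crystallized protein should be questioned.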

If time permits, it can be a good idea to process a dataset in multiple space groups to avoid the need to reprocess at a later stage, when the experimenter has left the beamline or the data at the beamline are no longer available.

3.6 Good Practices

3.6.1 Bookkeeping

As with all scientific research, meticulous note taking is essential for protein crystallography experiments. This is especially true at synchrotron beamlines, since the measurements are carried out on equipment owned and operated by someone else and used by many different scientists in close succession. Once an experiment is done, it can be very difficult (or impossible) to find out what certain settings were weeks or months earlier. Even if settings are automatically logged in the background, the scientist’s own accurate notes are invaluable. Those notes should contain among others all the relevant settings for every data collection run, details of data processing, and the main results.

3.6.2 Data Backup

Data collection and processing are generally performed on the beamline computers of the synchrotron facility. Since beamlines are utilized by many different user groups and storage is limited, data are usually deleted within 1–2 weeks after an experiment is finished. For this reason it is essential to perform a backup of all the raw and processed data as soon as possible after the experiment is done. The most common forms of backup are DVDs, external hard drives and secure file transfer over the internet. Compressing the raw data after processing can save storage space and reduce backup times. If possible, compression and transfer of finished datasets should occur concurrently with data collection.

3.6.3 Beamline “Etiquette”

Synchrotron beamlines are used by many scientists and a changeover between user groups can happen within an hour. It is important to be mindful of all the other users around you and not disturb their work. If you have to stay on to finish data processing or backups after your run has finished, do it in a way that you will not interfere with the next user.

Beamlines are very complex instruments and should be used carefully and thoughtfully according to the instructions of beamline staff. Mechanical and software controls should prevent damage to equipment, but this does not mean that the built-in protections are flawless. If in any doubt, ask somebody knowledgeable. Safety procedures should be followed, especially regarding the use of liquid nitrogen.

If you are reading this book, you are most likely conducting proprietary research. Keep in mind that beamlines are a rather open environment, where scientists from different institutions and companies work in close proximity. Guard your proprietary information well and do not leave confidential material “lying around” for others to see.

4 Outlook

Over the past decade the field of macromolecular crystallography has enjoyed major advances. Improved synchrotrons, beamlines and X-ray detectors combined with sample handling automation eliminate many of the technical bottlenecks associated with the acquisition of diffraction datasets. Software developments have enabled the quick and robust analysis of data. Overall the structure solution process for routine cases has become significantly more efficient and is being done on an industrial scale in many centers. A natural limit has been reached in certain aspects, which hinders further efficiency improvements (radiation damage, protein crystallizability, human interaction times). In this sense, some of the current developments are evolutionary in nature, improving certain aspects of current systems and upgrading most experimental stations and synchrotrons to similar standards.

However, as new and more challenging projects are being tackled in many laboratories, there will be a strong emphasis on hands-on scientific research that cannot be automated. On the other hand, the technological barriers will be pushed further to overcome other current limitations, such as performing diffraction experiments on crystals as small as a few microns in size.