IstA purification
G. stearothermophilus full-length IstA was purified as previously described21. In brief, the full-length transposase sequence was expressed and purified using a pET-derived vector with a TEV-protease-cleavable MBP-His6 tag at the C terminus (2CcT vector) in the C41(DE3) Escherichia coli strain, induced after reaching an optical density (OD) at 600 nm of 0.7 with 0.5 mM isopropyl β-d-1-thiogalactopyranoside (IPTG) during 4 h of incubation at 37 °C in 2×YT (yeast extract tryptone) medium supplemented with 1% glucose. Cells were resuspended in lysis buffer (50 mM HEPES pH 7.5, 750 mM NaCl, 5 mM MgCl2, 5% glycerol and 1 mM β-mercaptoethanol) supplemented with protease inhibitors (complete, mini protease inhibitor tablets; Roche), lysed by sonication and centrifuged (20,000g, 30 min, 4 °C). The soluble fraction was filtered through a 0.45 µm pore-size filter and bound to a Ni-Sepharose column (5 ml His-Trap HP Chelating; Cytiva), washed with lysis buffer containing 30 mM imidazole and eluted (with 50 mM HEPES pH 7.5, 150 mM NaCl, 5 mM MgCl2, 5% glycerol, 300 mM imidazole and 1 mM β-mercaptoethanol) directly onto a cation-exchange column (5 ml Hi-Trap SP HP column; Cytiva). The ion-exchange column was further washed with 50 mM HEPES pH 7.5, 300 mM NaCl, 5 mM MgCl2, 5% glycerol and 1 mM β-mercaptoethanol, and the sample was eluted with a salt gradient up to 1 M NaCl. After cleaving the MBP tag by incubating at 4 °C overnight with TEV protease, the sample was passed through a second histidine-affinity column. The protein was finally concentrated and injected into a preparative HiPrep Sephacryl S-200 16/60 HR column (Cytiva) equilibrated in 50 mM HEPES pH 7.5, 750 mM NaCl, 5 mM MgCl2, 5% glycerol and 1 mM β-mercaptoethanol, at room temperature. Purified transposase was concentrated, aliquoted, flash-frozen in liquid nitrogen and stored at −80 °C until later use.
IstB purification
G. stearothermophilus full-length IstB was purified as previously described27. In brief, His6-MBP-IstB was expressed using a pET-derived vector in E. coli BL21-CodonPlus (DE3)-RIL cells (Stratagene). Cells were grown at 37 °C in 2×YT medium, induced at an OD at 600 nm of 0.8 with 0.5 mM isopropyl β-d-1-thiogalactopyranoside (IPTG) at 37 °C for 3.5 h, collected by centrifugation, resuspended in resuspension buffer (20 mM HEPES pH 7.5, 500 mM NaCl, 5 mM MgCl2, 5% glycerol, 30 mM imidazole, 1 mM β-mercaptoethanol and 0.1 mM ADP) supplemented with protease inhibitors (complete, mini protease inhibitor tablets; Roche), and lysed by sonication. The supernatant obtained after centrifugation (20,000g, 30 min, 4 °C) was filtered through a 0.45 μm pore-size filter (Sartorius), run over a Ni-Sepharose column (His-Trap HP, Cytiva) and then washed with resuspension buffer. The protein was eluted with a step-wise imidazole gradient. Peak fractions were pooled, loaded onto an amylose-affinity resin equilibrated with resuspension buffer and cleaved on-column overnight at 4 °C with PreScission protease. After elution of the cleaved protein with resuspension buffer, the sample was concentrated with ultrafiltration devices and injected into a HiPrep Sephacryl S-200 16/60 HR gel-filtration column (Cytiva) pre-equilibrated in resuspension buffer at room temperature. Fractions corresponding to IstB were pooled, concentrated, aliquoted, flash-frozen in liquid nitrogen and stored at −80 °C.
In vitro integration assay
Integration reactions were performed in 50 mM HEPES pH 7.5, 150 mM NaCl, 5 mM MgCl2, 5% glycerol, 0.05 mg ml–1 BSA and 1 mM ATP. IstA was pre-incubated with the right donor DNA (R-TIR with a 5-nt-long 5′ overhang), both present at 0.25 µM in the final reaction. IstB (2 µM final concentration) was independently pre-incubated with supercoiled plasmid pSG483 (10 nM final concentration) serving as the target DNA. Both proteins were then mixed in 30 µl and incubated for 15, 30 or 60 min at 37 °C. To stop the reactions, the mixtures were incubated for 20 min at 37 °C with proteinase K (0.25 mg ml–1 final concentration), SDS (1% final concentration) and EDTA (28 mM final concentration). Samples were run for 18 h on 1.5% (w/v) Tris–acetate–EDTA (TAE) agarose gels (40 mM sodium acetate, 50 mM Tris–HCl pH 7.9 and 1 mM EDTA) at 2–2.5 V cm–1. To visualize the DNA, gels were stained with 0.5 μg ml–1 ethidium bromide in TAE buffer for 20 min, destained in TAE buffer for 10 min and exposed to UV transillumination. DNA bands were quantified using Image Lab software (v.5.2.1, Bio-Rad).
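As a worked illustration of how the final concentrations above translate into pipetting volumes, the sketch below applies C1·V1 = C2·V2 to the 30 µl reaction. Only the final concentrations and the reaction volume come from the protocol; the stock concentrations are hypothetical placeholders, because the text does not report them.

```python
def stock_volume_ul(final_conc, stock_conc, final_volume_ul=30.0):
    """Volume of stock (in µl) that gives final_conc in final_volume_ul.

    final_conc and stock_conc must be expressed in the same unit.
    """
    if stock_conc <= 0:
        raise ValueError("stock concentration must be positive")
    if final_conc < 0 or final_conc > stock_conc:
        raise ValueError("final concentration must lie between 0 and the stock")
    return final_conc * final_volume_ul / stock_conc


# (final concentration from the protocol, HYPOTHETICAL stock concentration)
components = {
    "IstA (µM)": (0.25, 5.0),
    "R-TIR donor DNA (µM)": (0.25, 5.0),
    "IstB (µM)": (2.0, 40.0),
    "pSG483 target plasmid (nM)": (10.0, 200.0),
}

for name, (final, stock) in components.items():
    print(f"{name}: {stock_volume_ul(final, stock):.2f} µl of stock")
```

The remaining volume would be made up with reaction buffer; the helper rejects any final concentration that exceeds its stock, which cannot be reached by dilution.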
Cell-free transposition assay
Cell extract was obtained as previously described23. In brief, E. coli BL21 (DE3) cells were grown until reaching an OD at 600 nm of 0.6, collected by centrifugation and resuspended in 25 mM HEPES pH 7.5, 2 mM DTT and 100 mM KCl. Cells were treated with 250 μg ml–1 lysozyme for 20 min and 10 mM MgCl2 for 30 min. Thereafter, cells were frozen in liquid N2 and lysed by thawing on ice. Cell debris was removed by centrifugation at 14,000 r.p.m. for 30 min. Transposition reactions were carried out in 20 μl final volume containing 16 μl of reaction buffer (25 mM HEPES pH 7.5, 50 mM KCl, 10 mM MgCl2, 1 mM DTT, 50 μg ml–1 BSA, 150 μM dNTPs and 1 mM ATP), 1 μM IstA, 4 μM IstB, 50 ng of donor plasmid (1B-LIC vector containing lacZ flanked by IS21 TIRs) and 1 μl of cell extract. Reactions were incubated at 37 °C for 60 min and the frequency of insertions relative to the control without proteins was determined by quantitative PCR with a set of primers corresponding to a specific region of the donor plasmid (forward primer: 5′-TGTAATTCAGCTCCGCCATC-3′, reverse primer: 5′-GGTGTCTCTTATCAGACCGTTTC-3′) and a set of primers corresponding to the transposition product (forward primer: 5′-CGATTACTGCATCATTCCATCATTT-3′, reverse primer: 5′-AGGACCTTTCATTGATCCTTCTG-3′). Each mixture (10 μl) contained 1 μl of transposition reaction, 500 nM forward primer, 500 nM reverse primer and 1× SYBR. Fluorescence was measured using a LightCycler 96 Instrument (Roche). Data were analysed using the 2^(−ΔΔCt) method, normalizing to the donor plasmid Ct.
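The relative quantification described above can be sketched as follows. This is a minimal illustration of the standard 2^(−ΔΔCt) calculation, assuming 100% amplification efficiency for both primer pairs; the function name and the Ct values in the usage example are hypothetical and are not data from this study.

```python
def insertion_fold_change(ct_product_sample, ct_donor_sample,
                          ct_product_control, ct_donor_control):
    """Relative insertion frequency by the 2^(-ddCt) method.

    dCt  = Ct(transposition product) - Ct(donor plasmid), per reaction
    ddCt = dCt(sample with proteins) - dCt(control without proteins)
    """
    d_ct_sample = ct_product_sample - ct_donor_sample
    d_ct_control = ct_product_control - ct_donor_control
    dd_ct = d_ct_sample - d_ct_control
    return 2.0 ** (-dd_ct)


# Hypothetical Ct values, for illustration only.
fold = insertion_fold_change(
    ct_product_sample=24.0, ct_donor_sample=15.0,    # reaction with IstA + IstB
    ct_product_control=30.0, ct_donor_control=15.0,  # control without proteins
)
print(f"Insertions relative to the no-protein control: {fold:.1f}-fold")
```

Normalizing each reaction to its own donor-plasmid Ct corrects for differences in template input before the sample is compared with the control.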
ATPase assays
Reactions (50 µl) were set up containing 50 mM HEPES pH 7.5, 150 mM NaCl, 10 mM MgCl2 and 1 mM ATP. IstB (5 µM) was then added either alone or in combination with 0.5 µM IstA. After 1 h of incubation at 37 °C, ATPase activity was measured using a PiColorLock Phosphate Detection system (Abcam). Plates were scanned in a Varioskan LUX microplate reader (Thermo Scientific).
STC DNA reconstitution
To reconstitute the STC DNA, we designed a construct that mimics the transposition product36,37,38,55. The construct was obtained by annealing three single-stranded DNA molecules (Fig. 2b and Extended Data Fig. 4). The longest strand (TIR-transferred strand; 130 nt) contains the complete sequence of the right transposon end (including the two R1 and R2 repeats), the target DNA and the insertion site. The other two DNA molecules contain the sequences complementary to the donor (non-transferred strand; 60 nt) and target (target-reverse complement; 70 nt) DNAs. The individual oligonucleotides (Integrated DNA Technologies) were resuspended in 20 mM HEPES pH 7.5 and 50 mM NaCl, mixed in equimolar concentrations, heated up to 95 °C for 5 min and gradually cooled down to 10 °C over 10 h (Extended Data Fig. 4).
IstB–target DNA complex formation and vitrification
To obtain the complex of IstB bound to a target DNA, wild-type IstB was mixed with a random 60-mer duplex (5′-TGCTTGCGATGATCCGACGTGTTAGCCACGCTGACTAGTTATGCCATGCCTCCCTTCAGG-3′) with a 6.5:1 molar ratio and dialysed overnight at 4 °C against 20 mM HEPES pH 7.5, 150 mM NaCl, 5 mM MgCl2, 5% glycerol, 1 mM ATP and 1 mM DTT. The sample was then loaded onto a Superdex 200 Increase 5/150 GL equilibrated in dialysis buffer, at room temperature. The fractions corresponding to the peak containing the IstB–DNA complex were aliquoted, flash-frozen with liquid nitrogen and stored at −80 °C.
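For reference, the complementary strand of the 60-mer can be derived computationally. The sketch below assumes that the duplex is blunt-ended and fully complementary (the text states only that it is a random 60-mer duplex); the function name is illustrative.

```python
# Top strand of the target duplex, 5'->3', as given in the text.
TOP_STRAND = "TGCTTGCGATGATCCGACGTGTTAGCCACGCTGACTAGTTATGCCATGCCTCCCTTCAGG"

# Translation table mapping each base to its Watson-Crick partner.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence, written 5'->3'."""
    seq = seq.upper()
    unknown = set(seq) - set("ACGT")
    if unknown:
        raise ValueError(f"unexpected characters in sequence: {sorted(unknown)}")
    return seq.translate(_COMPLEMENT)[::-1]


bottom_strand = reverse_complement(TOP_STRAND)
gc_fraction = sum(base in "GC" for base in TOP_STRAND) / len(TOP_STRAND)

print(f"length: {len(TOP_STRAND)} nt")
print(f"bottom strand (5'->3'): {bottom_strand}")
print(f"GC content: {gc_fraction:.2f}")
```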
For the cryo-EM experiments, the sample was diluted in 20 mM HEPES pH 7.5, 150 mM NaCl, 5 mM MgCl2, 1 mM ATP, 1 mM DTT and 0.015% NP-40 (to favour the formation of a thin, uniform ice layer). Next, 3 μl of the IstB–DNA complex (about 2 μM) was applied to glow-discharged Quantifoil Gold 2:1, 300 mesh grids with a home-made continuous carbon coating, blotted for 4 s (blot force of +25) and frozen in liquid ethane using a Vitrobot Mark IV plunging system (Thermo Fisher Scientific).
IstA–IstB–STC holo-transpososome complex formation
To obtain the holo-transpososome complex, numerous buffer conditions and protein-to-DNA stoichiometries were initially tested using gel filtration chromatography and negative-staining EM. Finally, IstA was mixed with the STC DNA in a 3.3:1 (protein to DNA) molar ratio and diluted in STC buffer (20 mM HEPES pH 7.5, 150 mM NaCl, 5 mM MgCl2, 5% glycerol, 1 mM ATP and 1 mM DTT) until IstA was present at 2.4 μM. The sample was then incubated for 45 min at 37 °C, mixed with 12 μM IstB(E167Q) in STC buffer (IstB to IstA final 5:1 molar ratio), and incubated for 30 min at 37 °C.
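The stoichiometry of the assembly can be checked with a few lines of arithmetic. The sketch below assumes that the quoted 2.4 μM IstA and 12 μM IstB both refer to the same final mixture, which the text implies through the stated 5:1 ratio but does not spell out; variable names are illustrative.

```python
ISTA_UM = 2.4              # IstA concentration after dilution in STC buffer
ISTA_TO_DNA_RATIO = 3.3    # IstA : STC DNA molar ratio
ISTB_UM = 12.0             # IstB concentration in the mixture


def molar_ratio(numerator_um, denominator_um):
    """Molar ratio between two components given in the same unit."""
    if denominator_um <= 0:
        raise ValueError("denominator concentration must be positive")
    return numerator_um / denominator_um


# STC DNA concentration implied by the 3.3:1 protein-to-DNA ratio.
stc_dna_um = ISTA_UM / ISTA_TO_DNA_RATIO

print(f"STC DNA: {stc_dna_um:.2f} µM")
print(f"IstB : IstA = {molar_ratio(ISTB_UM, ISTA_UM):.1f} : 1")
print(f"IstB : STC DNA = {molar_ratio(ISTB_UM, stc_dna_um):.1f} : 1")
```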
For negative-stain EM, the holo-transpososome was further diluted in STC buffer. Next, 4 μl of the sample (1 μM IstB, expressed as monomer concentration) was applied to a glow-discharged continuous carbon-coated grid (Electron Microscopy Sciences), incubated for 1 min and washed twice in 50 μl drops of MilliQ water before being incubated for 1 min in 2 × 10 μl droplets of 2% uranyl acetate stain. Images were collected using a JEOL 1230 microscope equipped with a TemCam-F416 camera (TVIPS), at a magnification of ×40,000, corresponding to a pixel size of 2.8 Å per pixel. They were imported to RELION-4.0 (refs. 56,57) and the contrast transfer function (CTF) was estimated using CTFFIND (v.4.1)58. Particles were picked using Topaz59 and subjected to 2D classification using RELION-4.0 (refs. 56,57).
For cryo-EM grid preparation, the sample was diluted in 20 mM HEPES pH 7.5, 150 mM NaCl, 5 mM MgCl2, 1 mM ATP, 1 mM DTT and 0.015% NP-40. Next, 3 μl of the transpososome complex (1.5 μM IstB, expressed as monomer concentration) was applied to glow-discharged Quantifoil Gold 2:1, 300 mesh grids coated with a second layer of home-made thin continuous carbon. After 1 min of incubation, grids were blotted for 4 s (blot force of +25) and frozen in liquid ethane using a Vitrobot Mark IV (Thermo Fisher Scientific).
Cryo-EM data collection, image processing and atomic model building of IstB–target–DNA complex
Cryo-EM grids were pre-screened in a JEOL 1230 microscope and in an FEI Talos Arctica microscope, equipped with a TemCam-F416 (TVIPS) and a Falcon III (Thermo Fisher Scientific) camera, respectively. High-resolution data of the IstB complex with target DNA were collected on a Titan Krios electron microscope operated at 300 kV (Diamond Light Source). Imaging was performed with EPU at a nominal magnification of ×81,000 (calibrated physical pixel size of 1.06 Å per pixel; super-resolution of 0.53 Å per pixel) using a Gatan K3 BioQuantum direct electron detector operating in super-resolution counting mode. The nominal defocus range for the dataset extended from −1.2 μm to −2.7 μm in 0.3 μm increments. Each movie was recorded for 5 s and fractionated into 50 frames. The dose rate was 1.2 e– per Å2 per frame, resulting in an accumulated exposure of 60 e– Å–2 (Extended Data Table 1).
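The acquisition parameters above are related by simple arithmetic, sketched below as a consistency check. The function names are illustrative; the only inputs are the values quoted in the text, and the factor of two between physical and super-resolution pixels reflects the usual behaviour of super-resolution counting mode.

```python
def accumulated_exposure(dose_per_frame, n_frames):
    """Total exposure (e-/A^2) from the per-frame dose and the frame count."""
    return dose_per_frame * n_frames


def frame_time(exposure_time_s, n_frames):
    """Duration of a single movie frame in seconds."""
    return exposure_time_s / n_frames


def super_resolution_pixel(physical_pixel_a):
    """Super-resolution pixel size: half the physical pixel size."""
    return physical_pixel_a / 2.0


total = accumulated_exposure(dose_per_frame=1.2, n_frames=50)
per_frame_s = frame_time(exposure_time_s=5.0, n_frames=50)
sr_pixel = super_resolution_pixel(1.06)

print(f"accumulated exposure: {total:.1f} e-/A^2")
print(f"time per frame: {per_frame_s:.2f} s")
print(f"super-resolution pixel size: {sr_pixel:.2f} A")
```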
A total of 6,214 movies were imported into RELION-3.0 (refs. 57,60), motion-corrected and electron-dose-weighted with MOTIONCOR2 (ref. 61) (Extended Data Fig. 2). The CTF was estimated using GCTF62. Micrographs were manually curated to remove those with patchy or crystalline ice, yielding a subset of 3,812 micrographs. A subset of the micrographs was picked with CRYOLO63, binned by two and subjected to 2D classification. The resulting 2D averages were then used as templates to pick the entire dataset with RELION-3.0 (ref. 57). A total of 2,439,865 particles were extracted and downsampled to 2.12 Å per pixel. After 2D classification, 1,451,514 particles were selected and subjected to 3D classification using C1 symmetry, which separated full pentamers of dimers from incomplete, tetrameric complexes. The 306,745 particles from the best class were extracted with the original pixel size of 1.06 Å per pixel and used as input for a subsequent 3D refinement, run using C2 symmetry and a soft-edged mask that followed the contour of the particle, which resulted in a 3.35 Å resolution map. The particles were subjected to CTF refinement and Bayesian polishing, generating the final 3.2 Å resolution map.
To improve the density corresponding to IstB N-terminal domains, the dataset was subjected to focused 3D classification (Extended Data Fig. 2g,h). First, symmetry expansion was applied to the final 306,745 particles. A mask was then created around the N-terminal domains of the second dimer using Chimera (v.1.14)64. This particular dimer was selected because of its central location and, therefore, probably more rigid configuration. The density of the remaining decamer was subtracted from the particles and the selected N-terminal domains were subjected to a round of 3D classification without re-aligning the particles (C1 symmetry, T = 1,000). One of the resulting maps, which showed continuous and clear density for the polypeptide chain, was used for modelling this region.
A previously determined crystal structure of the IstB ATPase domain (PDB identifier 5BQ5)27 was rigid-body docked into one of the monomers of the cryo-EM density map obtained from the 3D refinement. The N-terminal domain was manually de novo built using both the map of the full decamer and that obtained from the focused classification as references. The complete monomer was then subjected to a round of model building and real space refinement with COOT and PHENIX-1.19 using Ramachandran, rotamer, geometry and secondary structure restraints65,66. The refined monomer was then docked into the four remaining positions to generate the asymmetric unit and subjected to an additional round of model building and real space refinement. The full decamer was obtained by applying C2 symmetry operators to the refined unit. IstB has no clear sequence specificity and the density for the nucleic acid appeared to be less defined than that of the protein, probably as a consequence of DNA sliding. The density, however, clearly showed the position of the characteristic major and minor grooves. iMODFIT (v.1.2) was used to generate a curved DNA duplex that followed the density map67. The complete model was obtained after some final interactive rounds of real space refinement and validation with PHENIX and MOLPROBITY using Ramachandran, rotamer, geometry, secondary structure, base planarity and base stacking restraints and NCS constraints.
Cryo-EM data collection, image processing and atomic modelling of the IstA–IstB–STC holo-transpososome complex
Cryo-EM grids were screened in a JEOL 1230 microscope and in an FEI Talos Arctica microscope, equipped with a TemCam-F416 (TVIPS) and a Falcon III (Thermo Fisher Scientific) camera, respectively. High-resolution data of the holo-transpososome complex were collected from two grids (obtained under identical biochemical and vitrification conditions) on a Titan Krios electron microscope operated at 300 kV (BREM Biofisika, Bilbao). Imaging was performed using EPU at a nominal magnification of ×105,000 (calibrated physical pixel size of 0.82 Å per pixel; super-resolution of 0.41 Å per pixel) with a Gatan K3 BioQuantum direct electron detector operating in super-resolution counting mode. Each movie was recorded for 2.1 s in 50 frames with a nominal defocus range of −1 μm to −2.6 μm (increments of 0.3 μm). The dose rate was 1 e− Å–2 per frame, resulting in an accumulated exposure of 51.3 e− Å–2 (Extended Data Table 1). Overall, 15,206 and 21,070 movies were acquired from the two grids.
The movies from each grid were initially pre-processed independently. They were imported into RELION-4.0 (refs. 56,57), motion-corrected and electron-dose-weighted with MOTIONCOR2 (ref. 61), and the CTF was estimated using CTFFIND (v.4.1)58. Micrographs were picked using Topaz59. From the first grid, a total of 1,395,290 particles were extracted and downsampled to 1.98 Å per pixel (Extended Data Fig. 5). After 2D classification, 167,791 particles were selected, re-extracted with a pixel size of 0.98 Å per pixel and used as input for a subsequent 3D classification using a soft-edged mask. The initial 3D classification, using C1 symmetry, identified various oligomeric states of IstB in some of the models that seemed to result from the detachment of IstA-distant flexible dimers (in line with findings from the isolated ATPase complexed with target DNA). However, the three most IstA-proximal IstB dimers from each oligomer consistently exhibited greater rigidity and homogeneity. Consequently, using C2 symmetry for 3D classification and subsequent refinement steps produced similar results, with enhanced resolution and map quality in the central region of the complex. The 140,441 particles from the best classes were then subjected to 3D refinement, run using a soft-edged mask that followed the contour of the particle, and subjected to CTF refinement and Bayesian polishing that allowed us to generate a 4.3 Å resolution map. From the second grid, a total of 2,707,350 particles were extracted and downsampled to 1.98 Å per pixel. After 2D classification, 212,669 particles were selected and subjected to 3D refinement with a soft-edged mask. They were then re-extracted with a pixel size of 0.98 Å per pixel and subjected to CTF refinement and Bayesian polishing, resulting in a 3.65 Å resolution map. 
The CTF-refined and Bayesian-polished particles from both grids were then pooled together (353,110 total particles) and used as input for a subsequent 3D classification, using C2 symmetry and a tight mask. Overall, 272,218 particles from the best classes were used for a last round of 3D refinement, using C2 symmetry and a soft-edged mask, to obtain a 3.62 Å resolution map.
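The particle counts reported for the two grids can be cross-checked with a short bookkeeping script. This is only an arithmetic sketch using the numbers quoted in the text; the dictionary keys and variable names are illustrative.

```python
# Particles entering the pooled set, as stated in the text.
particles_per_grid = {
    "grid 1 (after 3D classification)": 140_441,
    "grid 2 (after 2D classification)": 212_669,
}

pooled = sum(particles_per_grid.values())
final_particles = 272_218  # particles kept after the pooled 3D classification

retained_fraction = final_particles / pooled

for label, count in particles_per_grid.items():
    print(f"{label}: {count:,} particles ({count / pooled:.1%} of pooled set)")
print(f"pooled: {pooled:,} particles")
print(f"retained for final refinement: {retained_fraction:.1%}")
```

The two per-grid counts sum to the 353,110 pooled particles stated above, of which roughly three quarters contribute to the final map.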
To improve the density map around relevant elements, a focused 3D refinement of the core (removing the flexible regions) was performed using RELION-4.0 (refs. 56,57). First, a soft-edged mask created following the contour of the core was used to do a particle subtraction on the 272,218 particles. Subsequently, a focused 3D-refinement (C2 symmetry) produced an improved EM density at 3.26 Å resolution (Extended Data Fig. 8).
Monomers of IstA (PDB identifier 8B4H)21 and IstB (obtained previously in the complex of IstB–target DNA) were initially fitted as rigid bodies into the unsharpened maps and later manually modelled in COOT65 using the sharpened maps for the fitting of the side chains. The asymmetric unit of the complex, which contained two monomers of IstA, ten molecules of IstB and three strands of DNA (chains A and C, chains E to N and chains a, b and c, respectively) was improved by alternating rounds of model building and real space refinement with COOT and PHENIX (v.1.20)65,66, applying rotamer, Ramachandran, secondary structure and geometry restraints for the protein, and stacking, hydrogen bonds and base-pair parallel planes restraints for the DNA (generated using LIBG and ProSMART tools from CCP4 package)68. The complete model was generated by applying C2 symmetry operators to the asymmetric unit, resulting in a macromolecular assembly composed of four chains of IstA (A–D), 20 monomers of IstB (E–X) and 6 strands of DNA (a–f). The STC was generated as duplex DNA using GRAPHITE-LIFE EXPLORER69. The symmetrized model was subjected to interactive rounds of real space refinement, applying NCS constraints, and validation performed with PHENIX-1.20, COOT, MolProbity and the PDB validation tool (OneDep: https://validate-rcsb-1.wwpdb.org)65,66. Figures were generated using Chimera (v.1.15) and ChimeraX (v.1.5)64,70.
Reporting summary
Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.