Two-factor authentication underpins the precision of the piRNA pathway

Mouse strains and experimentation

The Spocd1^HA and Miwi2^tdTomato (Miwi2^tdTom) mouse alleles have been described previously^4,31. Miwi2^tdTom is a Miwi2 null allele and is used as such³¹. Both lines were kept on a mixed B6CBAF1/Crl;C57BL/6N;Hsd:ICR (CD1) genetic background. The Spocd1^ΔSPIN1 allele was generated by CRISPR–Cas9 gene editing as previously described^32,33. A single guide RNA (sgRNA) (GGGTCAGGAATCAGGCTTGT) together with Cas9 mRNA and a single-stranded DNA oligonucleotide containing the eight-alanine mutation flanked by 85 base pairs (bp) of homology arm (AGATGGTAAACAGTTGAAGCCAAGGCAGGGAGGATTTCAGGCAGAGCCTTGCCATACTCTCTCTCAGCAGGTCTACACTGGGTCAGCTGCCGCAGCGGCCGCTGCCGCCGCTGCAAGTCAGCCAGGACAAATTGAACCTCTGGAGGAGTTGGACACCAACTCAGCCAGAAGGAAGAGAAGGCCCACAACTGCTCACCCTA) was injected into the cytoplasm of fertilized single-cell zygotes (B6CBAF1/Crl). F₀ offspring were screened by PCR and the Spocd1^ΔSPIN1 allele was confirmed by Sanger sequencing. The allele was established from one founder animal and back-crossed several times to a C57BL/6N genetic background. The Spocd1^ΔSPIN1 mice were thus on a mixed B6CBAF1/Crl;C57BL/6N genetic background. Animals were genotyped using a PCR with four primers (F, GACCCTGTATTTATTGAAGTCACTG; R, CCTCAGTGACATCAGGCGGA; WT-F, CACTGGGTCAGGAATCAGGC; and ∆Spin-R, GTCCTGGCTGACTTGCAGC). Mice carrying the Oct4^eGFP reporter allele³⁴ were originally obtained from Jackson Laboratories (B6;129S4-Pou5f1^tm2Jae/J (Oct4-eGFP), stock number 008214).
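The allele specificity of the two internal genotyping primers can be checked against the donor oligonucleotide by simple string matching. The following Python sketch is illustrative only: it assumes that a primer anneals only where it matches one strand exactly, it uses an excerpt of the donor sequence rather than the full oligonucleotide, and the function and variable names are hypothetical.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    pairs = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(pairs[base] for base in reversed(seq))


def anneals(primer: str, template: str) -> bool:
    """True if the primer matches either strand of the template exactly.

    Assumption: exact matching stands in for annealing; real PCR
    tolerates some mismatches, which this sketch ignores.
    """
    return primer in template or reverse_complement(primer) in template


# Excerpt of the donor oligonucleotide around the alanine-encoding
# insert, copied from the sequence given above (most of the homology
# arms are omitted for brevity).
DONOR_EXCERPT = "CACTGGGTCAGCTGCCGCAGCGGCCGCTGCCGCCGCTGCAAGTCAGCCAGGACAAATTG"

WT_F = "CACTGGGTCAGGAATCAGGC"         # wild-type-specific forward primer
DELTA_SPIN_R = "GTCCTGGCTGACTTGCAGC"  # mutant-specific reverse primer

print("WT-F matches donor:", anneals(WT_F, DONOR_EXCERPT))
print("dSpin-R matches donor:", anneals(DELTA_SPIN_R, DONOR_EXCERPT))
```

Under these assumptions, only ∆Spin-R finds a match in the edited sequence, whereas WT-F spans the region replaced by the insert.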

Male fertility was assessed by mating studs to Hsd:ICR (CD1) wild-type females and counting the number of pups born for each plugged female. For each experiment, animal tissue samples were collected from one or more litters and allocated to groups according to genotype. No further randomization or blinding was applied during data acquisition and analysis.

Animals were maintained at the University of Edinburgh, UK, in accordance with the regulations of the UK Home Office, or at the Institute for Molecular Biology in Mainz, Germany, in accordance with local and European animal-welfare laws. Ethical approval for the UK mouse experimentation was given by the University of Edinburgh’s Animal Welfare and Ethical Review Body, and the work was done under licence from the UK Home Office. Animal experiments done in Germany were approved by the ethical committees on animal care and use of the federal state of Rheinland-Pfalz, Germany, and were covered by LUA licence G 23-5-049.

Immunofluorescence

Immunofluorescence experiments were done as previously described³⁵. The following primary antibodies were used in this study: anti-HA (Cell Signaling Technologies) 1:200; anti-LINE1-ORF1p (ref. ³⁶) 1:500; anti-IAP-GAG (a gift from B. Cullen, Duke University) 1:500; anti-γH2AX (Bethyl Laboratories) 1:500; anti-MIWI2 (a gift from R. Pillai, Université de Genève) 1:500; anti-SPOCD1 rabbit serum rb175 1:500 (O’Carroll laboratory antibody); anti-SPIN1 (Cell Signaling Technologies) 1:500 (of a custom preparation of 1.1 μg μl⁻¹ in PBS). Images were taken on a Zeiss Observer or Zeiss LSM880 with an Airyscan module. Images acquired using the Airyscan module were deconvoluted with the Zeiss Zen software ‘Airyscan processing’ with settings 3D and a strength of 6. ImageJ and Zeiss Zen software were used to process and analyse the images.

Cell culture, transfection, immunoprecipitation and western blotting

HEK293T cells (O’Carroll laboratory stock, not further authenticated, tested for mycoplasma contamination) were cultured and transfected as previously described⁴ with a minor modification: 3 μl jetPRIME reagent was used. On day 2 after transfection, cells were washed twice with PBS and resuspended in 1 ml lysis buffer (IP buffer: 150 mM KCl, 2.5 mM MgCl₂, 0.5% Triton X-100, 50 mM Tris-HCl, pH 8, supplemented with 1× protease inhibitors (cOmplete ULTRA EDTA-free, Roche) and 37 units per ml benzonase (Millipore)) and lysed for 30 min, rotating at 4 °C. The lysate was cleared by centrifugation for 10 min at 21,000g. Cleared lysate (800 μl) was incubated for 1 h at 4 °C on a rotating wheel with 20 μl of anti-HA beads (Pierce) that had been equilibrated in lysis buffer. The beads were washed four times with lysis buffer. Immunoprecipitates were eluted at 50 °C for 10 min in 20 μl 0.1% sodium dodecyl sulphate (SDS), 50 mM Tris-HCl, pH 8. Lysates and eluates were run on a 4–12% bis–tris acrylamide gel (Invitrogen) and blotted onto a nitrocellulose membrane (Amersham Protran 0.45 NC) according to standard laboratory procedures. The membrane was blocked with blocking buffer (4% (w/v) skimmed milk powder (Sigma-Aldrich) in PBS-T (phosphate buffered saline, 0.1% Tween-20)) and subsequently incubated for 1 h with primary antibodies (anti-HA (C29F4s, Cell Signaling Technologies), 1:1,000; anti-FLAG (M2, Sigma-Aldrich) 1:1,000; anti-SPOCD1 rabbit serum rb175 (O’Carroll laboratory antibody) 1:500; or anti-α-tubulin (T9026, Sigma-Aldrich) 1:1,000) in blocking buffer. The anti-α-tubulin staining was used as loading control on the same blot as the experimental staining. After three PBS-T washes for 10 min, the membrane was incubated with secondary antibodies (IRDye 680RD donkey anti-rabbit or IRDye 800CW donkey anti-mouse, LI-COR, 1:10,000) in blocking buffer for 1 h. It was washed three times for 10 min in PBS-T and imaged on a LI-COR Odyssey CLx system. Exposure of the entire images was optimized in Image Studio Lite (LI-COR), and areas of interest were cropped for presentation.

Protein alignments and structure prediction

The mouse SPOCD1 AlphaFold2 protein structure prediction model^22,23 was downloaded from the AlphaFold Protein Structure Database (https://www.alphafold.ebi.ac.uk/). Models for the SPOCD1–SPIN1 interaction, as well as the single SPOCD1 proteins from Anolis, Xenopus and Latimeria, were generated with AlphaFold2 (refs. ^22,23) on ColabFold³⁷. The model was visualized using PyMol³⁸. Multiple sequence alignments of SPOCD1 and SPIN1 were generated with ClustalW³⁹ and edited in Jalview⁴⁰. For SPOCD1, alignments were edited based on secondary-structure elements of the AlphaFold2 model (B1ASB6) using Jalview⁴⁰.

Protein purification

GST-tagged mouse SPOCD1 fragments (amino acids 203–409), Anolis SPOCD1 fragments (XP_008116112.1, amino acids 457–748), Xenopus SPOCD1 fragments (XP_031752218.1, amino acids 1–229), Latimeria SPOCD1 fragments (XP_014348336.1, amino acids 510–1009) and His-tagged SPIN1 (amino acids 49–262) were cloned in a pET-based backbone. Proteins were expressed in Escherichia coli BL21 (DE3). Bacteria were grown in 2×TY medium at 37 °C until an optical density of 0.8 was reached. Then, the temperature was reduced to 18 °C, the bacteria were induced with 1 mM IPTG and grown for another 14–16 h. Cells were collected and pellets were stored at −80 °C until purification. The pellets were resuspended in 50 ml lysis buffer (20 mM Tris-HCl, pH 7.5, 200 mM NaCl, 2.5 mM imidazole, 0.5 mM β-mercaptoethanol, Roche cOmplete EDTA-free Protease Inhibitor Cocktail, 0.01 mg ml⁻¹ DNaseI (Sigma) and 2 mM AEBSF (Pefabloc) for SPIN1, or 20 mM Tris-HCl, pH 7.5, 200 mM NaCl, 1 mM DTT, Roche cOmplete EDTA-free Protease Inhibitor Cocktail, 0.01 mg ml⁻¹ DNaseI (Sigma) and 2 mM AEBSF (Pefabloc) for SPOCD1) and cells were lysed with the Constant systems 1.1 kW TS cell disruptor at 25 kPSI. The cleared lysate was loaded onto a cOmplete His-Tag Purification Column (Roche) for SPIN1 or incubated with 7 ml glutathione sepharose high-performance beads (Cytiva) for SPOCD1; the column and beads had been equilibrated in the respective buffer. Proteins were eluted from the column or beads with an increasing (2.5–500 mM) imidazole gradient for SPIN1 or with GST elution buffer containing 20 mM reduced glutathione for SPOCD1. The fractions of interest were pooled and dialysed overnight in 20 mM Tris-HCl, pH 7.5, 100–150 mM NaCl, 1 mM DTT. The SPIN1 construct was cleaved with GST–3C protease (made in our lab) overnight. The SPOCD1 constructs were concentrated and stored at −80 °C until used. SPIN1 was further purified by ion exchange with a gradient of 100–1,000 mM NaCl (Resource Q, Cytiva) and size-exclusion chromatography (HiLoad 16/600 Superdex 200 pg, Cytiva). Finally, the protein was concentrated and stored at −80 °C until used.

Nucleosome pull-downs with recombinant SPIN1-SPOCD1 proteins

Histone H3 site-specifically modified with H3K4me3 and/or H3K9me3 was generated by native chemical ligation (NCL) and assembled into nucleosomes as described previously^41,42. In brief, Xenopus H3 and H4 and human H2A and H2B were expressed in E. coli and purified from inclusion bodies. For NCL, a tail-less histone H3 lacking residues 1–31 and containing a threonine-to-cysteine substitution at position 32 and a cysteine-to-alanine substitution at position 110 of Xenopus H3 (H3Δ1–31T32C C110A) was expressed in E. coli and purified in the same way. NCL reactions were carried out with synthetic carboxy-terminal benzyl thioester peptides spanning residues 1–31 of histone H3.1 and carrying the desired modifications at K4 and K9 (Peptide Protein Research) in 6 M guanidine HCl, 250 mM sodium phosphate buffer, pH 7.2, 150 mM 4-mercaptophenylacetic acid (MPAA, Sigma) and 50 mM TCEP for 72 h at room temperature. Ligated full-length modified histone H3 was purified through cation-exchange chromatography on a HiTrap SP column (Cytiva). Histone octamers were reconstituted by dialysis and purified by gel filtration on an S200 size-exclusion column (Cytiva). For the generation of trans-histone octamers carrying H3K4me3 and H3K9me3 on separate copies of histone H3, the H3X–H3Y system was used⁴³, starting from H3Δ1–31T32C C110A constructs that also contained the required H3X and H3Y mutations. H3X was used for H3K4me3 and H3Y for H3K9me3. A biotinylated 209-bp DNA fragment containing the 601 nucleosome positioning sequence was generated by PCR and purified by ion-exchange chromatography on a HiTrap Q column followed by ethanol precipitation. Mononucleosomes were then assembled from histone octamers and 601 DNA by gradient dialysis. Nucleosome assembly was verified by native gel electrophoresis on 6% acrylamide gels in 0.5× TGE buffer (12.5 mM Tris, pH 8.0, 95 mM glycine and 0.5 mM EDTA).

Nucleosome pull-down assays were done essentially as described previously⁴⁴. All incubations and washes were performed at 4 °C with end-over-end rotation, and all centrifugation steps were done at 1,500g for 2 min at 4 °C. Then, 23 pmol (3 µg) of recombinant, site-specifically modified nucleosomes were bound to streptavidin sepharose high-performance beads (Cytiva) by overnight incubation in pull-down buffer (20 mM HEPES, pH 7.9, 175 mM NaCl, 10% glycerol, 1 mM EDTA, 1 mM DTT, 0.1% NP-40, 0.1 mg ml⁻¹ BSA). Before incubation, beads were blocked with 1 mg ml⁻¹ BSA in pull-down buffer. Nucleosome-bound beads were washed three times with pull-down buffer before incubation with recombinant SPIN1 and SPOCD1 proteins for 2 h. His-tagged SPIN1 (49–262) and His-tagged SPOCD1 fragment 1b were expressed and purified as above. SPIN1–SPOCD1 fragment 1b complexes were purified by size-exclusion chromatography on an S200 increase column (Cytiva) as above. For the experiment shown in Fig. 1j, 23 pmol of protein was used. After incubation with recombinant proteins, beads were washed three times with high-salt pull-down buffer (as above but with 350 mM NaCl) for 5 min. Nucleosomes and bound proteins were eluted by boiling in 1.5× SDS sample buffer (95 mM Tris HCl, pH 6.8, 15% glycerol, 3% SDS, 75 mM DTT, 0.15% bromophenol blue). Binding was analysed by western blotting with antibodies against His tag (Sigma H1029, lot 033m4785) 1:1,000. Antibodies against histone H3 (Abcam ab176842, lot GR1494741-36) 1:2,500, H3K4me3 (Cell Signaling) 1:2,000 and H3K9me3 (Abcam ab176916) 1:1,000 were used to verify nucleosome loading and modification state.

Analytical size-exclusion chromatography

For analytical size-exclusion chromatography, 125 μg SPIN1 and/or 500 μg mouse GST–SPOCD1-F1b were used for each run. Proteins were diluted in 250 μl size-exclusion chromatography buffer (20 mM HEPES, pH 7.5, 150 mM NaCl, 1 mM DTT) and injected on a Superdex 200 10/300 GL column. Peak fractions were collected, loaded on an SDS–PAGE gel and visualized by Coomassie staining.

Crosslinking mass-spectrometry analysis

Recombinant fragments (25 μg) of SPOCD1 (GST–F1b) and SPIN1 were incubated in 20 mM HEPES, pH 7.5, 150 mM NaCl, 1 mM DTT and crosslinked with BS3 (bis(sulfosuccinimidyl)suberate) (Thermo Fisher Scientific) at BS3:protein ratios of 1:1, 2:1 and 4:1 (w/w) for 2 h on ice. The crosslinking reaction was stopped by adding 2 μl ammonium bicarbonate (2.0 M). Crosslinking products were run on 4–12% bis-Tris NuPAGE (Invitrogen) for 15 min and briefly stained using Instant Blue (Expedeon). Bands at more than 150 kDa were excised and the proteins were reduced with 10 mM DTT for 30 min at room temperature, alkylated with 55 mM iodoacetamide for 20 min at room temperature and digested using 13 ng μl⁻¹ trypsin (Promega) overnight at 37 °C³⁷. The digested peptides were loaded onto C18-Stage-tips³⁸ for liquid chromatography with tandem mass spectrometry (LC-MS/MS) analysis. The LC-MS/MS analysis was performed using Orbitrap Fusion Lumos (Thermo Fisher Scientific) with a ‘high/high’ acquisition strategy. The peptide separation was done on an EASY-Spray column (50 cm × 75 μm internal diameter, PepMap C18, 2-μm particles, 100 Å pore size; Thermo Fisher Scientific). Mobile phase A consisted of water and 0.1% (v/v) formic acid. Mobile phase B consisted of 80% (v/v) acetonitrile and 0.1% (v/v) formic acid. Peptides were loaded at a flow rate of 0.3 μl min⁻¹ and eluted at 0.25 μl min⁻¹ using a linear gradient going from 2% mobile phase B to 40% mobile phase B over 102 or 132 min (each sample was run twice with different gradients), followed by a linear increase from 40% to 95% mobile phase B in 11 min. The eluted peptides were introduced directly into the mass spectrometer. MS data were acquired in the data-dependent mode with a 3 s acquisition cycle. Precursor spectra were recorded in the Orbitrap with a resolution of 120,000 and a mass-to-charge ratio (m/z) range of 350–1,700. Ions with a precursor charge state between 3+ and 8+ were isolated with a window size of m/z = 1.6 and fragmented using high-energy collision dissociation with a collision energy of 30. The fragmentation spectra were recorded in the Orbitrap with a resolution of 15,000. Dynamic exclusion was enabled with single repeat count and 60 s exclusion duration. The mass-spectrometric raw files were processed into peak lists using ProteoWizard (v.3.0)³⁹ and crosslinked peptides were matched to spectra using Xi software (v.1.7.6.4)⁴⁰ with in-search assignment of mono-isotopic peaks⁴¹. Search parameters were: MS accuracy, 3 ppm; MS/MS accuracy, 5 ppm; enzyme, trypsin; crosslinker, BS3; maximum missed cleavages, 4; fixed modification, carbamidomethylation on cysteine; variable modifications, oxidation on methionine; fragments b and y ions with loss of H₂O, NH₃ and CH₃SOH. The linkage specificity for BS3 was assumed to be at lysine, serine, threonine, tyrosine and protein N termini. Identified candidates of crosslinked peptides were validated by Xi software⁴⁰, and only auto-validated crosslinked peptides were used. Identified crosslinks underlying Fig. 2b are shown in Supplementary Table 1.
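The MS and MS/MS accuracy settings are relative mass tolerances expressed in parts per million (ppm). As a minimal illustration of what a 3 ppm or 5 ppm window means for a given m/z value, the Python sketch below computes the ppm deviation of an observed peak from its theoretical position. The m/z values are hypothetical and the code is not part of the Xi search itself.

```python
MS1_TOLERANCE_PPM = 3.0  # precursor (MS) accuracy stated above
MS2_TOLERANCE_PPM = 5.0  # fragment (MS/MS) accuracy stated above


def ppm_error(observed_mz: float, theoretical_mz: float) -> float:
    """Signed deviation of an observed m/z from theory, in ppm."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def within_tolerance(observed_mz: float, theoretical_mz: float,
                     tolerance_ppm: float) -> bool:
    """True if the observed m/z lies inside the symmetric ppm window."""
    return abs(ppm_error(observed_mz, theoretical_mz)) <= tolerance_ppm


# Hypothetical example: a peak expected at m/z 1000.000
theoretical = 1000.000
for observed in (1000.002, 1000.004):
    print(observed,
          round(ppm_error(observed, theoretical), 2), "ppm,",
          "MS ok" if within_tolerance(observed, theoretical, MS1_TOLERANCE_PPM) else "MS fail,",
          "MS/MS ok" if within_tolerance(observed, theoretical, MS2_TOLERANCE_PPM) else "MS/MS fail")
```

At m/z 1,000, a 3 ppm window corresponds to ±0.003 and a 5 ppm window to ±0.005.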

ChIP sequencing analysis

Raw fastq.gz sequencing files for ChIP-seq of H3K4me3 and H3K9me3 were downloaded from the Sequence Read Archive record SRP165187 (ref. ²⁴). Paired-end reads were preprocessed to remove adapter sequences and trim low-quality bases using Trimmomatic v.0.35 (ref. ⁴⁵). TruSeq adapter sequences were used in the case of ChIP-seq samples. Trimmed reads were aligned to the mouse mm10 genome with bwa mem v.0.7.16 (ref. ⁴⁶) using the -M parameter. Alignments were filtered to remove duplicate reads with Picard MarkDuplicates v.2.24.0 (http://broadinstitute.github.io/picard/) and improper alignments with Samtools view v.1.11 -F 260 -f 3 (ref. ⁴⁷). In the case of multi-mapping reads, a single alignment (marked as primary by bwa) was selected for downstream analysis. BAM files were converted to normalized bigWig files for visualization and plotting using deepTools⁴⁸ bamCoverage v.3.5.0 with the following parameters: -bs 1 --normalizeUsing BPM.
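The Samtools options -f 3 and -F 260 act on the bitwise FLAG field of each alignment: -f keeps only records with all of the given bits set, and -F discards records with any of the given bits set. The short Python sketch below decodes the two values using the bit definitions of the SAM format. It illustrates the filter logic only and is not a replacement for Samtools; the example FLAG values are chosen for illustration.

```python
# Bit definitions from the SAM format FLAG field
PAIRED = 0x1         # read is paired in sequencing
PROPER_PAIR = 0x2    # both mates aligned as a proper pair
UNMAPPED = 0x4       # read itself is unmapped
SECONDARY = 0x100    # secondary (non-primary) alignment

REQUIRED = PAIRED | PROPER_PAIR   # equals 3, as in "-f 3"
EXCLUDED = UNMAPPED | SECONDARY   # equals 260, as in "-F 260"


def passes_filter(flag: int) -> bool:
    """Mimic 'samtools view -f 3 -F 260' for a single FLAG value."""
    has_all_required = (flag & REQUIRED) == REQUIRED
    has_no_excluded = (flag & EXCLUDED) == 0
    return has_all_required and has_no_excluded


# Illustrative FLAG values
for flag in (99, 355, 77, 65):
    print(flag, "kept" if passes_filter(flag) else "removed")
```

Here 99 (paired, proper pair, mate reverse, first in pair) is kept; 355 is the same read flagged as a secondary alignment and is removed; 77 is an unmapped read and 65 is paired but not properly paired, so both are removed.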

ChIP heatmaps and average profile plots

Genomic annotations for repetitive elements L1Md_A, L1Md_T, L1Md_F (combining elements classified as L1Md_F, L1Md_F2, L1Md_F3), L1Md_Gf, IAPEy and MMERVK_10C were extracted from RepeatMasker using the UCSC table browser. Normalized read coverage was computed across these elements using deepTools v.3.5.0 computeMatrix. The central regions were length-normalized to 5 kb with flanking regions ±2 kb from the start and end positions. Heatmaps were drawn using deepTools v.3.5.0 plotHeatmap, separating each repetitive element and sorting rows in descending order of total signal. LINE1 elements (L1Md_A, L1Md_F and L1Md_T) were further separated into young LINE1 elements based on a divergence of 38 bases per kb or less from a consensus sequence⁴ or the presence of an intact functional promoter denoted by the presence of specific monomer annotations⁴⁹. Monomers associated with inert promoters (subtypes 6 and 2) were removed from the analysis. Average profiles were generated for each experiment and each category of repetitive element by calculating the mean signal between replicate samples. Computations were performed in R, with the seqplots package⁵⁰, using bins of 50 bases, flanking regions of 2 kb and a central-region length normalized to 5 kb. Final plots were drawn and formatted using the tidyverse packages⁵¹.
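Length normalization places elements of different sizes on a common axis: with 50-base bins, a central region normalized to 5 kb is represented by 100 bins regardless of the true element length. The Python sketch below shows one simple way to do this, by averaging a per-base signal within equal fractions of the element. It is an illustration under that assumption; deepTools computeMatrix and seqplots use their own binning and interpolation routines, and the function name is hypothetical.

```python
def scale_to_bins(signal, n_bins):
    """Average a per-base signal into n_bins bins of near-equal width.

    Each bin covers the same fraction of the element, so elements of
    different lengths end up with the same number of values.
    """
    length = len(signal)
    binned = []
    for i in range(n_bins):
        start = i * length // n_bins
        # Guarantee at least one base per bin for very short elements
        end = max(start + 1, (i + 1) * length // n_bins)
        window = signal[start:end]
        binned.append(sum(window) / len(window))
    return binned


# 5 kb normalized length at 50 bases per bin gives 100 bins
N_BINS = 5000 // 50

# Two hypothetical elements of different length with flat coverage
short_element = [2.0] * 6000
long_element = [2.0] * 7300

print(len(scale_to_bins(short_element, N_BINS)),
      len(scale_to_bins(long_element, N_BINS)))
```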

IP-MS

IP-MS of SPOCD1–HA from Spocd1^HA/+ E14.5 fetal testis using 50 μl of anti-HA beads (Pierce, 88837) was done as previously described⁴, with a reduced number of 25 testes per replicate. Wild-type fetal testes were used as controls.

Fluorescence-activated cell sorting (FACS)

To purify fetal germ cells for CUT&Tag analysis, E14.5 testes were dissected from embryos carrying the Oct4^eGFP allele³⁴. A single-cell suspension was obtained by sequential treatment with 100 µl collagenase solution at 37 °C for 8 min (10 units of collagenase A (Sigma-Aldrich 10103578001); 2× NEAAs (Gibco); 2× Na-pyruvate (Gibco); 25 mM HEPES–KOH, pH 7.5) and 200 µl TrypLE Express (Gibco) at 37 °C for 5 min with gentle flicking and pipetting of the solution to aid dissociation. Digestion was neutralized by 70 µl prewarmed FBS and cells were collected by spinning at 600g for 4 min at room temperature followed by two washes in FACS buffer (1× PBS; 2 mM EDTA, 25 mM HEPES-KOH, pH 7.5, 1.5% BSA, 10% FBS; 2 µg ml⁻¹ DAPI) and filtering (Corning, 352235) just before sorting. Cell sorting was done on an Invitrogen Bigfoot using a 100 μm nozzle and gating for DAPI-negative (live), OCT4–eGFP-positive (germ cells) populations into collection tubes containing 100 µl 1× PBS.

For EM-seq, CD9⁺ spermatogonia were sorted from P14 testes as described previously⁵² using Fc block (eBioscience, 14-0161-86, clone 93, lot 2297433) 1:50; biotin-conjugated anti-CD45 (eBioscience, 13-0451-85, clone 30-F11, lot 2349865) 1:400; biotin-conjugated anti-CD51 (Biolegend, 104104, clone RMV-7, lot B308465) 1:100; anti-CD9^APC (eBioscience, 17-0091-82, clone eBioKMC8, lot 2450733) 1:200; anti-cKit^PE-Cy7 (eBioscience, 25-1171-82, clone 2B8, lot 2191977) 1:1,600; streptavidin^V450 (BD bioscience, 560797, lot 1354158) 1:400; and 1 μg ml⁻¹ DAPI. Cells were sorted into DMEM media on a BD Aria II sorter, pelleted for 5 min at 500g and snap frozen in liquid nitrogen.

For gating strategies, see Supplementary Fig. 2.

CUT&Tag assays

CUT&Tag was done on FACS-isolated fetal germ cells as previously described²⁶, with some minor modifications. First, 10,000 to 20,000 germ cells were bound to 10 µl concanavalin A-coated beads (Polysciences, 86057-10). After binding to beads, cells were fixed with 0.2% formaldehyde for 2 min followed by quenching with glycine (125 mM) and washed with Dig-Wash buffer while separated on the magnet. The remaining steps were as previously described²⁶, using pA–Tn5 at a 1:400 dilution (Diagenode, C01070001) and 15 PCR cycles of library amplification. Libraries were cleaned up by magnetic bead-based solid-phase separation and assessed on a Tapestation (Agilent). Antibodies and dilutions used for CUT&Tag were rabbit IgG control (Abcam, ab37415, lot GR3219601-1) at 1:50, rabbit anti-SPIN1 (Cell Signaling, 89139S, lot 2) at 1:50, rabbit anti-H3K4me3 (Merck-Millipore, 07-473, lot 403371) at 1:50, rabbit anti-H3K9me3 (Abcam, ab8898, lot GR27111-1) at 1:50, and guinea pig anti-rabbit IgG (Antibodies Online, ABIN101961, lot NE-200-032309) at 1:100. Pooled libraries were sequenced using paired-end 150 bp on a NextSeq 2000 instrument (Illumina).

CUT&Tag analysis

First, 150b and 155b paired-end CUT&Tag sequencing reads were processed and aligned to the mouse-genome assembly (version GRCm38) using the NF-core (https://doi.org/10.5281/zenodo.7715959) CUT&RUN Nextflow pipeline version 3.1 (ref. ⁵³). The pipeline performed adapter trimming with Trim Galore (https://doi.org/10.5281/zenodo.5127898) and reference-genome alignment with Bowtie2 (ref. ⁵⁴). Multi-mapping reads were included using the parameter --minimum_alignment_q_score 0. The pipeline performed further filtering of reads to report only properly paired primary alignments and remove alignments to GRCm38 blacklisted regions. The default for the pipeline is to remove only duplicate reads (alignments that share common start and end points) from IgG controls. However, after further assessment of the sequence duplication rates in all samples, we decided to perform read deduplication on the SPIN1 replicate samples. Deduplication of SPIN1 samples was performed using Picard MarkDuplicates v.2.24.0 (http://broadinstitute.github.io/picard/) with the parameter --REMOVE_DUPLICATES. Individual replicates from each sample were then merged into a single BAM file using Samtools merge v.1.11 (ref. ⁴⁷) for downstream analysis. Normalized bigWig files of read coverage were generated with deepTools bamCoverage v.3.5.0 (ref. ⁴⁸), using the following parameters: -bs 1 --normalizeUsing CPM --exactScaling --ignoreForNormalization MT. Log₂ enrichment profiles of CUT&Tag samples over IgG controls were generated with deepTools bamCompare using the following parameters: -bs 1 --normalizeUsing CPM --exactScaling --ignoreForNormalization MT --scaleFactorsMethod None.
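The arithmetic behind a CPM-normalized log₂ enrichment value can be sketched as follows. The Python example scales the coverage in a bin by the library size (counts per million) and takes the log₂ ratio of sample over IgG control. The pseudocount of 1 and its placement after normalization are assumptions made here to avoid division by zero, not settings stated above, and all counts are hypothetical.

```python
import math


def cpm(count: float, total_reads: float) -> float:
    """Counts per million: coverage scaled to a library of 1e6 reads."""
    return count * 1e6 / total_reads


def log2_enrichment(sample_count: float, sample_total: float,
                    control_count: float, control_total: float,
                    pseudocount: float = 1.0) -> float:
    """log2 ratio of CPM-normalized sample over control coverage.

    Assumption: the pseudocount (1.0 here) is added to both normalized
    values; the value actually used by the tool may differ.
    """
    sample = cpm(sample_count, sample_total) + pseudocount
    control = cpm(control_count, control_total) + pseudocount
    return math.log2(sample / control)


# Hypothetical bin: 70 reads in the sample and 10 reads in the IgG
# control, each from a library of 10 million mapped reads
print(log2_enrichment(70, 10_000_000, 10, 10_000_000))
```

With these hypothetical numbers, the normalized values are 7 and 1 CPM, and the enrichment is log₂(8/2) = 2.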

Log₂ enrichment profiles of CUT&Tag versus IgG control over various classes of repetitive elements (L1Md_A, L1Md_F, L1Md_Gf, L1Md_T, IAPEy-int and IAPEz-int) were plotted as heatmaps and average profiles, using computeMatrix from the deepTools⁴⁸ package and the profilePlyr⁵⁵ R package to include annotations of peak overlaps. Positions of repetitive elements were extracted from a table of mouse mm10 RepeatMasker annotations downloaded from the UCSC table browser and filtered for elements greater than 5 kb in length. LINE1 elements (L1Md_A, L1Md_F, L1Md_T) were further separated into young LINE1 elements based on a divergence of 38 bases per kb or less from a consensus sequence⁴ or the presence of an intact functional promoter denoted by the presence of specific monomer annotations⁴⁹. Monomers associated with inert promoters (subtypes 6 and 2) were removed from the analysis. The central regions of repetitive elements were length-normalized to 5 kb with flanking regions ±2 kb from the start and end positions. Heatmaps and profile plots show data in consecutive 10b bins with regions subdivided by elements and arranged in descending order of total enrichment across all samples.
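The selection of young LINE1 elements can be expressed as a simple rule. The Python sketch below reads the two criteria as alternatives (low divergence, or at least one promoter monomer that is not of an inert subtype). That reading, the record layout, the representation of monomer subtypes as integers and the example elements are all assumptions for illustration and may differ from the actual analysis.

```python
from dataclasses import dataclass, field
from typing import List

MAX_DIVERGENCE_PER_KB = 38      # threshold stated above
INERT_MONOMER_SUBTYPES = {2, 6}  # subtypes removed from the analysis


@dataclass
class Line1Element:
    """Hypothetical record for one annotated LINE1 element."""
    name: str
    divergence_per_kb: float
    monomer_subtypes: List[int] = field(default_factory=list)


def has_active_promoter(element: Line1Element) -> bool:
    """True if any monomer remains after dropping inert subtypes."""
    return any(subtype not in INERT_MONOMER_SUBTYPES
               for subtype in element.monomer_subtypes)


def is_young(element: Line1Element) -> bool:
    """Assumed rule: low divergence OR an intact promoter."""
    return (element.divergence_per_kb <= MAX_DIVERGENCE_PER_KB
            or has_active_promoter(element))


# Hypothetical elements
elements = [
    Line1Element("low_divergence", 12.0),
    Line1Element("diverged_with_monomer", 60.0, [1]),
    Line1Element("diverged_inert_only", 60.0, [2, 6]),
]
for element in elements:
    print(element.name, is_young(element))
```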

Peak calling was done using MACS2 callpeak⁵⁶ on individual replicates as well as all replicates together, with IgG samples set as a control. The parameter --keep-dup all was used to include duplicate reads, when present, in the peak calling model. To obtain a set of high-confidence peaks, we selected peaks with a minimum coverage of 20 reads in the CUT&Tag sample and a peak score greater than the mean peak score. Peaks of co-localized H3K4me3 and H3K9me3 signal were obtained by finding the intersection of both peak sets using the GenomicRanges R package⁵⁷. Peak sets were overlapped with annotations to provide a breakdown of their intersection with specific genomic features, with each peak assigned to a single classification in the following hierarchy: LINEs, other repetitive elements, genes and intergenic. LINEs included all RepeatMasker annotations in the LINE class. Other repetitive elements included RepeatMasker annotations in the classes LTR, Simple_repeat, Satellite, ERVK and Retrotransposon. Genes were defined as any coding or non-coding transcriptional unit plus 500 bases upstream, based on the ENSEMBL gene annotations GRCm38 v.79. Overlaps of peaks with genomic features were computed using the GenomicRanges R package⁵⁷.
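The peak-selection thresholds and the hierarchical assignment of peaks to genomic features can be summarized in a few lines. The Python sketch below is a simplified illustration: it assumes that the mean peak score is taken over all called peaks before filtering, it represents each peak as a dictionary with hypothetical keys, and it takes the set of overlapping feature types as given rather than computing interval overlaps.

```python
# Priority order stated above; anything else is intergenic
HIERARCHY = ["LINE", "other_repeat", "gene"]
MIN_COVERAGE = 20


def high_confidence(peaks):
    """Keep peaks with coverage >= 20 and score above the mean score.

    Assumption: the mean is computed over all input peaks.
    """
    mean_score = sum(peak["score"] for peak in peaks) / len(peaks)
    return [peak for peak in peaks
            if peak["coverage"] >= MIN_COVERAGE and peak["score"] > mean_score]


def classify(overlapping_features):
    """Assign a peak to the first category in the hierarchy it overlaps."""
    for category in HIERARCHY:
        if category in overlapping_features:
            return category
    return "intergenic"


# Hypothetical peaks
peaks = [
    {"name": "p1", "coverage": 25, "score": 90, "features": {"gene", "LINE"}},
    {"name": "p2", "coverage": 30, "score": 10, "features": {"gene"}},
    {"name": "p3", "coverage": 5, "score": 80, "features": set()},
]
for peak in high_confidence(peaks):
    print(peak["name"], classify(peak["features"]))
```

In this example only p1 passes both thresholds, and it is classified as LINE because LINEs take priority over genes.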

Downstream data analysis and plotting were predominantly performed using the R programming language (R Core Team, 2021, https://www.R-project.org/) and the Tidyverse libraries⁵¹. Genome snapshots and data tracks were prepared using pyGenomeTracks⁵⁸.

Histology of mouse samples

Histology experiments on mouse samples were done as previously described⁴.

TUNEL assay

TUNEL assay experiments were done as previously described⁴.

RNA sequencing and analysis

RNA sequencing experiments and analysis were done as previously described⁴ with data for Spocd1^−/− downloaded from GSE131377 (ref. ⁴).

Whole-genome methylation sequencing and analysis

Whole-genome methylation sequencing of DNA derived from Spocd1^ΔSPIN1 and wild-type P14 spermatogonia was performed using the NEBNext Enzymatic Methyl-seq (EM-seq, New England Biolabs) as described⁴. Analysis of DNA methylation was done as described previously⁴. Data for Spocd1^−/− and corresponding wild-type P14 spermatogonia were retrieved from E-MTAB-7997 (ref. ⁴).

Statistical information

Data were plotted in RStudio (v.2022.07.1 build 554, running R v.4.0.3 (2020-10-10)) using the dplyr, ggplot2, tidyr, cowplot, reshape2, ggrepel, ggpubr, scales and RColorBrewer packages (versions dplyr_1.0.4, ggplot2_3.3.3, tidyr_1.1.2, cowplot_1.1.1, reshape2_1.4.4, ggrepel_0.9.1, ggpubr_0.4.0, scales_1.1.1, RColorBrewer_1.1-2) or Microsoft Excel for Mac (v.16). Statistical testing was done with R (v.4.0.3 (2020-10-10)) using RStudio software or with Perseus⁵⁹ (v.1.6.5.0) for the mass-spectrometry data and DESeq2 (ref. ⁶⁰) for the RNA-seq data. We used the regioneR package⁵⁵ in R to perform permutation tests to assess the statistical significance of overlaps of CUT&Tag peaks with LINE1 elements. Unpaired, two-tailed Student’s t-tests were used to compare the differences between groups and adjusted for multiple testing using Bonferroni correction where indicated, except for RNA-seq data analysis, where Wald tests were used. Averaged data are presented as mean ± s.e.m., unless otherwise indicated. No statistical methods were used to predetermine the sample size. The experiments were not randomized and the investigators were not blinded to allocation during experiments and outcome assessment.
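The Bonferroni correction mentioned above multiplies each P value by the number of tests performed and caps the result at 1. A minimal Python sketch of this adjustment follows; the P values are hypothetical and the function is illustrative rather than the code used for the analysis.

```python
def bonferroni(p_values):
    """Return Bonferroni-adjusted P values (each capped at 1.0)."""
    n_tests = len(p_values)
    return [min(1.0, p * n_tests) for p in p_values]


# Hypothetical P values from three comparisons
raw = [0.01, 0.04, 0.5]
adjusted = bonferroni(raw)
for p, p_adj in zip(raw, adjusted):
    print(f"raw P = {p:.2f}, adjusted P = {p_adj:.2f}")
```

With three tests, a raw P value of 0.04 is adjusted to 0.12 and no longer falls below a 0.05 threshold, whereas 0.01 is adjusted to 0.03.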

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.
