open access publication

Preprint, 2024

Estimating gene conversion tract length and rate from PacBio HiFi data

bioRxiv, Page 2024.07.05.601865, 10.1101/2024.07.05.601865

Contributors

  • Charmouh, Anders Poulsen (Corresponding author) [1]
  • Sørud, Peter Porsborg [1]
  • Bataillon, Thomas Martin Jean 0000-0002-4730-2538 [1]
  • Hobolth, Asger 0000-0003-4056-1286 [1]
  • Hansen, Lasse Thorup [1]
  • Besenbacher, Søren 0000-0003-1455-1738 [2]
  • Winge, Sofia Boeg 0000-0003-1666-1228 [3]
  • Almstrup, Kristian 0000-0002-1832-0307 [3]
  • Schierup, Mikkel Heide 0000-0002-5028-1790 [1]

Affiliations

  1. [1] Aarhus University [NORA names: AU Aarhus University; University; Denmark; Europe, EU; Nordic; OECD]
  2. [2] Department of Molecular Medicine (MOMA), Brendstrupgårdsvej 21A, 8200 Aarhus N, Denmark [NORA names: Denmark; Europe, EU; Nordic; OECD]
  3. [3] Copenhagen University Hospital [NORA names: Capital Region of Denmark; Hospital; Denmark; Europe, EU; Nordic; OECD]

Abstract

Gene conversions are broadly defined as the transfer of genetic material from a ‘donor’ to an ‘acceptor’ sequence and can occur in both meiosis and mitosis. They are a subset of non-crossover events and, like crossover events, can generate new combinations of alleles, erode linkage disequilibrium, and even counteract the mutation load by reverting germline mutations through GC-biased gene conversion. Estimating the rate of gene conversion and the distribution of gene conversion tract lengths remains challenging. Here, we present a new method for estimating the tract length, rate, and detection probability of non-crossover events directly from PacBio HiFi long-read data. The method can be applied to data from a single individual, is unbiased even at low single-nucleotide variant densities, and requires no demographic or evolutionary assumptions. We apply the method to gene conversion events observed directly in PacBio HiFi read data from a human sperm sample and find that human gene conversion tracts are shorter (mean of 50 base pairs) than estimates from yeast or Drosophila. We also estimate that a typical human male gamete undergoes on average 280 non-crossover events, of which approximately 7 are expected to become visible as gene conversions that move variants from a donor haplotype to an acceptor haplotype.

Keywords

Drosophila, GC-biased gene conversion, HIFI, HiFi data, PacBio, PacBio HiFi, PacBio HiFi data, PacBio long-read data, acceptor, alleles, assumptions, combination, combinations of alleles, conversion, conversion events, conversion tract length, conversion tracts, crossover, crossover events, data, density, detection, detection probability, disequilibrium, distribution, donor, donor haplotype, estimation, events, evolutionary assumptions, gametes, gene conversion, gene conversion events, gene conversion tract lengths, gene conversion tracts, genes, genetic material, germline, germline mutations, haplotypes, human male gametes, human sperm samples, individuals, length, linkage, linkage disequilibrium, load, long-read data, male gametes, materials, meiosis, method, mitosis, mutation load, mutations, non-crossover events, rate, rate of gene conversion, samples, sperm samples, tract, tract length, transfer, transfer of genetic material, variant density, variants, yeast

Funders

  • Novo Nordisk Foundation

Data Provider: Digital Science