Non-canonical DNA in bird telomere-to-telomere genomes
- bioRxiv
- DOI: 10.1101/2025.10.17.683159
Summary
Non-canonical (non-B) DNA motifs are genomic sequences capable of folding into three-dimensional structures distinct from the canonical right-handed helix. These structures regulate gene expression but also serve as mutation hotspots and are linked to cancer. Because non-B DNA is difficult to sequence, its annotations have been incomplete in most genome assemblies. Telomere-to-telomere (T2T) assemblies now overcome this limitation. Here, we provide a comprehensive analysis of eight types of non-B DNA motifs (e.g., G-quadruplexes and Z-DNA) in the zebra finch T2T genome. Motif content varied strongly by chromosome category: gene-rich dot chromosomes showed the highest motif levels (22.8-40.5%), microchromosomes intermediate levels (9.8-24.8%), and macrochromosomes the lowest (9.1-10.1%). Within chromosomes, Z-DNA was enriched at centromeres, and G-quadruplexes were enriched at promoters and 5′ UTRs. Low methylation at G-quadruplexes suggests they can form and contribute to gene regulation in these regions. Comparable patterns of non-B DNA distribution were observed in the near-T2T chicken genome, except that A-phased repeats, rather than Z-DNA, were enriched at chicken centromeres. Overall, our findings indicate that the non-B DNA distribution reflects the distinctive architecture of avian genomes, implicating non-canonical DNA in gene expression and centromere organization. The unusually high non-B DNA density on dot chromosomes is negatively correlated with PacBio sequencing depth and thus helps explain why these chromosomes have posed exceptional challenges for sequencing.
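To make the idea of "motif content" as a percentage of sequence more concrete, the following is a minimal sketch of how such a figure could be computed for one motif type. It is not the annotation pipeline used in the study, which this summary does not describe. It assumes the widely used G-quadruplex consensus pattern (four runs of at least three guanines separated by loops of 1 to 7 nucleotides), scans only the given strand, and uses a hypothetical helper name, `motif_coverage_percent`, together with a toy sequence rather than real genomic data.

```python
import re

# Widely used G-quadruplex consensus: four G-runs (>= 3 G each) separated by
# loops of 1-7 nt. ASSUMPTION: illustrative pattern only; the study's actual
# motif-calling method is not stated in this summary.
G4_PATTERN = re.compile(
    r"G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}"
)


def motif_coverage_percent(sequence: str, pattern: re.Pattern = G4_PATTERN) -> float:
    """Return the percentage of bases covered by non-overlapping pattern matches.

    Hypothetical helper: scans the given strand only; a fuller analysis would
    also scan the reverse complement and merge overlapping annotations.
    """
    seq = sequence.upper()
    if not seq:
        return 0.0
    # finditer yields non-overlapping matches, so their lengths can be summed.
    covered = sum(m.end() - m.start() for m in pattern.finditer(seq))
    return 100.0 * covered / len(seq)


# Toy sequence (not real data): a 15-nt G4 motif followed by 15 nt of AT repeat.
toy_sequence = "GGGAGGGAGGGAGGG" + "ATATATATATATATA"
toy_percent = motif_coverage_percent(toy_sequence)
print(f"G4 motif coverage: {toy_percent:.1f}%")
```

Applied per chromosome and summed across motif types, a measure of this kind would yield the sort of per-category percentages reported above, although the exact definition used by the authors may differ.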
Highlights
- We present the first analysis of sequences with the potential to adopt non-canonical (non-B) DNA conformations within a telomere-to-telomere (T2T) assembly of a bird genome, the zebra finch, and compare it with that in the near-T2T assembly of chicken.
- Non-B DNA, particularly G-quadruplexes, is markedly enriched at regulatory regions such as promoters and 5′ UTRs, suggesting a role in regulating gene expression in bird genomes.
- Z-DNA shows strong enrichment at centromeric regions, implying a contribution to centromere architecture and function in the zebra finch.
- The short, gene-rich, and highly recombining dot chromosomes show a strong overrepresentation of non-B DNA, which may act as a tunable regulator of euchromatin activity but is also correlated with low sequencing depth.