Project P3: Classifying and understanding Y-STR deletions and duplications in YHRD


Mark A Jobling, Jon H Wetton, Department of Genetics & Genome Biology, University of Leicester, UK


Sascha Willuweit and Lutz Roewer, Charité – Universitätsmedizin Berlin, Institute of Legal Medicine, Forensic Genetics Dept., Berlin, Germany


The human Y chromosome is particularly prone to segmental deletions and duplications [1, 2]. Some of these are sporadic, while others are recurrent and driven by non-allelic homologous recombination – examples are short-arm deletions sponsored by TSPY repeat recombination that remove the AmelY gene, or long-arm deletions and duplications of the AZFa, b and c regions.

Deletions and duplications have been detected in a number of different ways, including targeted analysis in infertile males and comparative genomic hybridization in studies of genome-wide copy-number variation. The application of Y-STRs in forensic and population genetics has also revealed deletions through null alleles, and duplications through peak height variation or additional alleles. In the forensic setting, understanding these phenomena is important because Y-STR haplotypes carrying such alleles could be misinterpreted as partial or mixed profiles. Deletions and duplications involving some STRs have been studied already; examples are DYS458 [3], DYS19 [4, 5] and DYS448 [6]. However, a systematic survey based on commonly typed Y-STRs has not yet been undertaken.

YHRD is the largest collection of Y-STR haplotype data in existence, with over 300,000 profiles from globally distributed populations. Every one of the 27 STR loci within the maximal haplotype shows examples in the database of both null and additional alleles. YHRD therefore presents an excellent opportunity for systematically analysing and understanding the characteristics of deletions and duplications.

Aims of the project

We aim to:

  • Catalogue all haplotypes containing null and duplicated alleles;
  • Group haplotypes by which STRs are deleted or duplicated;
  • Consider co-deleted or co-duplicated alleles within the framework of the Y chromosome reference sequence, or alternative sequence arrangements;
  • Classify deletions and duplications against the catalogue of rearrangements available from whole-genome sequence data (e.g. 1000 Genomes Project);
  • Use Y-SNP data, where available, and otherwise haplogroup prediction, to define haplogroups associated with deletions and duplications;
  • Delineate deletion or duplication sub-clusters within haplogroups, by considering all STR haplotypes;
  • Describe the population distribution of deletion and duplication lineages;
  • Publish the results of the above in a peer-reviewed journal.
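The first two aims (cataloguing affected haplotypes and grouping them by which STRs are deleted or duplicated) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the data layout (a dictionary mapping each locus to a list of observed alleles, with an empty list standing for a null allele), the table of expected copy numbers, and the toy haplotypes H1 to H4, which are invented and do not come from YHRD. It is not a description of the YHRD data format or of the method the project will actually use.

```python
from collections import defaultdict

# Assumed expected copy numbers for loci that normally show two alleles.
# Loci not listed here are treated as single-copy (an illustrative assumption).
EXPECTED_COPIES = {"DYS385": 2, "DYF387S1": 2}


def classify_locus(locus, alleles):
    """Return 'null', 'duplicated' or 'normal' for one locus.

    `alleles` is a list of allele strings; an empty list stands for a
    null allele (assumed encoding). A multi-copy locus showing a single
    allele is reported as 'normal', because identical copies cannot be
    told apart from a partial deletion on allele calls alone.
    """
    expected = EXPECTED_COPIES.get(locus, 1)
    if len(alleles) == 0:
        return "null"
    if len(alleles) > expected:
        return "duplicated"
    return "normal"


def group_by_affected_loci(haplotypes):
    """Group haplotype IDs by (deleted loci, duplicated loci).

    Haplotypes with no null or additional alleles are left out.
    """
    groups = defaultdict(list)
    for hap_id, loci in haplotypes.items():
        deleted = frozenset(
            name for name, alleles in loci.items()
            if classify_locus(name, alleles) == "null"
        )
        duplicated = frozenset(
            name for name, alleles in loci.items()
            if classify_locus(name, alleles) == "duplicated"
        )
        if deleted or duplicated:
            groups[(deleted, duplicated)].append(hap_id)
    return dict(groups)


# Invented toy haplotypes, for illustration only.
toy_haplotypes = {
    "H1": {"DYS19": ["14"], "DYS458": [], "DYS385": ["11", "14"]},
    "H2": {"DYS19": ["15", "16"], "DYS458": ["17"], "DYS385": ["11", "14"]},
    "H3": {"DYS19": ["14"], "DYS458": ["17"], "DYS385": ["11", "14"]},
    "H4": {"DYS19": ["13"], "DYS458": [], "DYS385": ["12", "13"]},
}

groups = group_by_affected_loci(toy_haplotypes)
for (deleted, duplicated), ids in groups.items():
    print(sorted(deleted), sorted(duplicated), ids)
```

Keying each group on the pair of deleted and duplicated locus sets means that haplotypes sharing the same pattern of affected STRs fall together, which is the starting point for the later aims of placing co-deleted or co-duplicated loci on the reference sequence and looking for sub-clusters within haplogroups.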

We believe that this analysis will be of benefit to the forensic community by allowing a rational interpretation of haplotypes containing deleted and duplicated alleles, set within the relevant population framework. It will also add considerably to what is known about sporadic and recurrent Y structural variation.

Data to be used from YHRD

We require all haplotypes containing null or duplicated alleles, together with their haplogroups; we expect these to come from a wide range of YHRD submissions. Where the data to be used are unpublished, we will contact submitters to ask for their collaboration and will offer acknowledgement or coauthorship, as appropriate, in resulting publications.

Date of objection

Expired on August 24th, 2020


Mark A Jobling; Jon H Wetton


[1] M.A. Jobling, Copy number variation on the human Y chromosome, Cytogenet. Genome Res. 123 (2008) 253-262.

[2] A. Massaia, Y. Xue, Human Y chromosome copy number variation in the next generation sequencing era and beyond, Hum. Genet. 136 (2017) 591-603.

[3] M.A. Jobling, I.C. Lo, D.J. Turner, G.R. Bowden, A.C. Lee, Y. Xue, D. Carvalho-Silva, M.E. Hurles, S.M. Adams, Y.M. Chang, T. Kraaijenbrink, J. Henke, G. Guanti, B. McKeown, R.A. van Oorschot, R.J. Mitchell, P. de Knijff, C. Tyler-Smith, E.J. Parkin, Structural variation on the short arm of the human Y chromosome: recurrent multigene deletions encompassing Amelogenin Y, Hum. Mol. Genet. 16 (2007) 307-316.

[4] C. Capelli, F. Brisighelli, F. Scarnicci, A. Blanco-Verea, M. Brion, V. Pascali, Phylogenetic evidence for multiple independent duplication events at the DYS19 locus, Forensic Sci. Int. Genet. 1 (2007) 287-290.

[5] P. Balaresque, E.J. Parkin, L. Roewer, D.R. Carvalho-Silva, R.J. Mitchell, R.A.H. van Oorschot, J. Henke, M. Stoneking, I. Nasidze, J. Wetton, P. de Knijff, C. Tyler-Smith, M.A. Jobling, Genomic complexity of the Y-STR DYS19: inversions, deletions and founder lineages carrying duplications, Int. J. Legal Med. 123 (2008) 15-23.

[6] P. Balaresque, G.R. Bowden, E.J. Parkin, G.A. Omran, E. Heyer, L. Quintana-Murci, L. Roewer, M. Stoneking, I. Nasidze, D.R. Carvalho-Silva, C. Tyler-Smith, P. de Knijff, M.A. Jobling, Dynamic nature of the proximal AZFc region of the human Y chromosome: multiple independent deletion and duplication events revealed by microsatellite analysis, Hum. Mutat. 29 (2008) 1171-1180.

* See FAQ/Glossary for further explanations of abbreviated terms used here.