Accounting for Copy Number Variation: A Hierarchical Database Structure
Sneddon TP, Lush MJ, Wright MW, Sneddon KMB, Povey S and Bruford EA
HUGO Gene Nomenclature Committee (HGNC), Department of Biology, University College London, Wolfson House, 4 Stephenson Way, London NW1 2HE, UK.
The work of the HGNC is supported by NHGRI grant P41 HG003345, the UK Medical Research Council and the Wellcome Trust.
Email: email@example.com
URL: http://www.gene.ucl.ac.uk/nomenclature/
Introduction
The HGNC has to date approved over 24,000 unique symbols and names, the majority of which are for ‘genes’, i.e. genomic segments that are transcribed and translated into functional proteins. However, an increasing number of genes initially thought to be single copy in the human genome are turning out to be copy number variant (CNV) between individuals [1-4]. This is predominantly the case for genes encoding secreted, olfactory and immunity-related proteins, such as the well-established amylase and defensin gene families [5]. Following community discussions of CNV nomenclature at the American Society of Human Genetics meeting 2005 and at the joint HGNC and HGVS (Human Genome Variation Society) satellite meeting at HGM2006, it was agreed that a method of naming CNV genes was required, and several suggestions were made for how this could be achieved. Based on these suggestions, the HGNC has now implemented a hierarchical database structure to capture and represent information concerning genomic variation such as copy number variant genes.
The defensin beta (DEFB) copy number variant genes
An example of copy number variation is the >300 kb segmentally duplicated region on 8p23.1 that contains the DEFB4, DEFB103-DEFB107, SPAG11 and DEFB109 genes.
[Figure: UCSC Genome Browser on Human Mar. 2006 Assembly, showing the >300 kb segmentally duplicated region]
As shown above, the current genome build includes two copies of the segmental duplication at >96% nucleotide identity, in opposite orientations, either side of a gap. This region has been shown to be copy number variant and present in 2-12 copies per diploid genome [9-11].
Searchgenes results for ‘DEFB103’
As an example of our hierarchical database structure, the result of searching for one of the copy number variant defensin genes, DEFB103, using Searchgenes [12] is shown below. By default only the DEFB103 gene record is returned. An option will be provided to display all variants.
[Figure: Searchgenes results, with a link to the DEFB103 gene record (Fig. 1)]
The DEFB103 gene record
The DEFB103 gene record shown below combines sequence and gene information for both the DEFB103A and DEFB103B copy number variant genes. The sub-entry field links to the individual DEFB103A (chr8: 7.8 Mb; Fig. 2) and DEFB103B (chr8: 7.3 Mb) gene records. There is also a link to the DEFB103 search result in the Database of Genomic Variants [6] (Fig. 3).
[Fig. 1: DEFB103 gene record, with links to the Database of Genomic Variants [6] and to the DEFB103A and DEFB103B CNV sub-entries]
The DEFB103A copy number variant gene record
The DEFB103A copy number variant sub-entry gene record shown below lists sequence and gene information for the defensin, beta 103 gene located at chr8: 7.8 Mb. There is also a link to the DEFB103 search result in the Database of Genomic Variants [6] (Fig. 3).
[Fig. 2: DEFB103A gene record, with links to the Database of Genomic Variants [6] and back to the DEFB103 gene record]
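The parent/sub-entry arrangement described for DEFB103 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class name `GeneRecord`, its fields and the `search` function are hypothetical and are not the HGNC database schema or the Searchgenes implementation. It shows a parent record that lists its CNV sub-entries, sub-entries that link back to the parent, and a search that returns only the parent record unless variants are requested.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class GeneRecord:
    """Hypothetical record type; field names are assumptions, not the HGNC schema."""
    symbol: str
    location: str
    parent: Optional[str] = None                          # link back to the parent record
    sub_entries: List[str] = field(default_factory=list)  # links to CNV sub-entries


def build_records() -> Dict[str, GeneRecord]:
    # Symbols and locations are those quoted on the poster.
    return {
        "DEFB103": GeneRecord("DEFB103", "8p23.1",
                              sub_entries=["DEFB103A", "DEFB103B"]),
        "DEFB103A": GeneRecord("DEFB103A", "chr8: 7.8 Mb", parent="DEFB103"),
        "DEFB103B": GeneRecord("DEFB103B", "chr8: 7.3 Mb", parent="DEFB103"),
    }


def search(records: Dict[str, GeneRecord], query: str,
           show_variants: bool = False) -> List[str]:
    """Return matching symbols; CNV sub-entries are hidden unless requested."""
    hits = [r for r in records.values() if r.symbol.startswith(query)]
    if not show_variants:
        hits = [r for r in hits if r.parent is None]
    return sorted(r.symbol for r in hits)


if __name__ == "__main__":
    records = build_records()
    print(search(records, "DEFB103"))                      # parent record only
    print(search(records, "DEFB103", show_variants=True))  # parent plus sub-entries
```

Storing the links as symbols rather than nested objects keeps each record flat, so a sub-entry such as DEFB103A can be displayed on its own while still pointing back to DEFB103.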
Fig. 3: Partial screenshots showing some of the information available from the Database of Genomic Variants [6] on Human Genome Assembly Build 36 for the DEFB103 gene (circled in red).
Criteria for naming copy number variant genes
HGNC will provide a gene symbol for a copy number variant gene upon request when the following criteria are met:
- The CNV gene is published (or to be submitted) and is listed in the Database of Genomic Variants [6]. If your gene is not already listed, please submit your data to the Database of Genomic Variants [6] before contacting us.
- NCBI [7] and VEGA [8] (if the gene is annotated by VEGA) agree upon the co-ordinates for each of the CNV copies in a reference sequence. This can include alternate assemblies (e.g. based on the Celera assembly) and/or haplotypes (e.g. c5_H2 and c22_H2).
These points will be added to our official Guidelines [13], and our hierarchical database structure will be made public once it is populated with >100 copy number variants.
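As a rough illustration, the two naming criteria can be written as a simple pre-submission check. Everything in this sketch is an assumption made for the example: the `CnvRequest` type, its field names and the messages are hypothetical, and the check is not part of any HGNC tool. A request passes when the gene is published (or to be submitted) and listed in the Database of Genomic Variants, and when the NCBI co-ordinates for each copy match the VEGA co-ordinates wherever VEGA has annotated the gene.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

Coords = Tuple[str, int, int]  # (sequence name, start, end); hypothetical layout


@dataclass
class CnvRequest:
    """Hypothetical summary of a CNV gene symbol request."""
    published_or_to_be_submitted: bool
    listed_in_dgv: bool
    ncbi_coords: Dict[str, Coords]                  # one entry per CNV copy
    vega_coords: Optional[Dict[str, Coords]] = None  # None if not annotated by VEGA


def unmet_criteria(req: CnvRequest) -> List[str]:
    """Return a description of each criterion the request does not yet meet."""
    problems: List[str] = []
    if not (req.published_or_to_be_submitted and req.listed_in_dgv):
        problems.append("not published/to be submitted and listed in the "
                        "Database of Genomic Variants")
    vega_disagrees = (req.vega_coords is not None
                      and req.vega_coords != req.ncbi_coords)
    if not req.ncbi_coords or vega_disagrees:
        problems.append("co-ordinates missing, or NCBI and VEGA disagree")
    return problems


if __name__ == "__main__":
    # Placeholder co-ordinates, invented purely for the example.
    copies = {"copyA": ("ref", 100, 200), "copyB": ("ref", 900, 1000)}
    ready = CnvRequest(True, True, ncbi_coords=copies, vega_coords=dict(copies))
    print(unmet_criteria(ready))
```

Treating `vega_coords=None` as "not annotated by VEGA" mirrors the conditional wording of the second criterion, so a gene without a VEGA annotation is judged on its NCBI co-ordinates alone.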
Summary
If you have a copy number variant gene submission, please complete our online gene symbol request form, specifying the CNV status in the additional comments and information field: http://www.gene.ucl.ac.uk/cgi-bin/nomenclature/request.pl
Please visit us at Booth 509 in the Exhibition Hall or email firstname.lastname@example.org to discuss your views.

References
1. Iafrate AJ et al. (2004) Nat. Genet. 36(9):949-51.
2. Sebat J et al. (2004) Science. 305(5683):525-8.
3. Sharp AJ et al. (2005) Am. J. Hum. Genet. 77(1):78-88.
4. Redon R et al. (2006) Nature. 444(7118):444-54.
5. Nguyen D-C, Webber C and Ponting CP (2006) PLoS Genet. 2(2):198-207.
6. http://projects.tcag.ca/variation/
7. http://www.ncbi.nlm.nih.gov
8. http://vega.sanger.ac.uk/
9. Hollox EJ, Armour JA and Barber JC (2003) Am. J. Hum. Genet. 73(3):591-600.
10. Taudien S et al. (2004) BMC Genomics. 5(1):92.
11. Linzmeier RM and Ganz T (2005) Genomics. 86(4):423-30.
12. http://www.gene.ucl.ac.uk/cgi-bin/nomenclature/searchgenes.pl
13. http://www.gene.ucl.ac.uk/nomenclature/guidelines.html