25 April 2006 HPI meeting The chimp and us
Swiss-Prot ID suffixes: _PANTR (Pan troglodytes, chimpanzee) and _PANPA (Pan paniscus, bonobo)
Complete chimp (PANTR) genome publication: Nature, September 2005
- Genome derived from one individual, ‘Clint’ (a male from West Africa)
- Inter-species vs intra-species (polymorphism) differences must be kept apart!
- Individual human genome variation: 1 bp per 1000
- Individual chimp genome variation: 1 bp per 250 (estimate from Varki (2000))
Cheeta has been recognized by the Guinness Book of World Records as the world's oldest chimp. Chimps rarely live past the age of 40 in the wild, but can reach 60 in captivity.
- The chimpanzee genome was sequenced to approximately four-fold coverage (error rate < 10^-4).
- WGS (whole-genome shotgun) sequencing approach (-> problem for the assembly of regions with segmental duplications): ~22.5 million sequence reads to assemble, using 2 assembly approaches (PCAP* and ARACHNE).
- In one* of the 2 approaches, contigs were assembled using the human genome as a guide, so they are "humanized" in their construction. Some sequences, such as insertions, deletions, and gene duplications, may therefore not be accurately represented by the current chimpanzee assembly.
- NCBI has adopted the new chimpanzee chromosome naming system proposed by McConkey (2004).
- The UCSC Genome Browser currently uses the original chimpanzee chromosome naming system.
Humanness
- Bipedalism
- Large cranial capacity (brain size)
- Advanced brain development (language capability)
- A long generation time
- and some other ‘biomedical’ differences…
Olson et al. (2002)
- Chimps express the apoE4 allele
- Chimps: no acne; rhinitis but no asthma; no rheumatoid arthritis
- The last common ancestor of humans and chimps is believed to have walked on 4 legs.
- The oldest fossils that resemble bipedal humans are 6 to 7 million years old.
- DNA sequence analyses suggest the 2 lineages separated about 5.4 million years ago.
Short time since the human-chimp split: it is likely that a few mutations of large effect are responsible for part of the differences.
Comparative genomic analysis
- Human vs mouse, chick…: focus on similarities
- Human vs chimp: focus on differences
Hypothesis to account for the evolution of humanness traits
Quantifying the sequence divergence:
- Single-nucleotide substitutions: 1.23 % (1.78 % for chromosome Y; 0.8 % in protein-coding regions)
- Indels: ~1.5 %
- Transposable elements: 3 %
- Recent duplication of DNA segments: 2.7 %
- ~35 million nucleotide differences
- ~5 million indels
- Many chromosomal rearrangements
- Human: 3.4 × 10^9 bp; Chimp: 3.6 × 10^9 bp
~35 million nucleotide differences
‘Since we apparently diverged from a common ancestor 6 million years ago, that is roughly 6 mutations per year that get fixed within the genome (or 3 per year if you divide them equally amongst the 2 branching species). Given a conservative estimate of average generational time of 10 years, this means that 30 new mutations had to be fixed within the population every generation. The current human mutation rate is around 3 or 4 mutations per organism.’
http://www.uncommondescent.com/index.php/archives/875
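The arithmetic in the quoted passage can be reproduced with a short sketch. The inputs (35 million differences, 6 million years, 10-year generations) are the figures quoted above; the function name `fixation_rates` and the equal split between the two lineages are illustrative assumptions, not part of the original slides.

```python
def fixation_rates(differences, years_since_split, generation_years, lineages=2):
    """Return fixed differences per year (both lineages together),
    per year per lineage, and per generation per lineage."""
    per_year_total = differences / years_since_split
    # Assumption: differences accumulated equally on both branches.
    per_year_lineage = per_year_total / lineages
    per_generation_lineage = per_year_lineage * generation_years
    return per_year_total, per_year_lineage, per_generation_lineage


per_year, per_lineage, per_generation = fixation_rates(35e6, 6e6, 10)
print(f"{per_year:.1f} per year, {per_lineage:.1f} per lineage per year, "
      f"{per_generation:.0f} per lineage per generation")
```

The unrounded values are about 5.8, 2.9 and 29; the quote rounds them to 6, 3 and 30.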
A genome-wide survey of structural variation between human and chimpanzee (Newman et al. (2005))
- Approach: mapping chimp fosmid paired-end sequences against the human reference sequence and identifying regions that are discordant in size or orientation
- Limitations:
  - The human genome is not complete
  - The chimp genome = 1 individual (inter- vs intraspecific differences!)
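The discordance test described above can be sketched as follows. This is a minimal illustration, not the pipeline of Newman et al.: the expected insert size, the tolerance, the `EndPair` record and the label names are all hypothetical, and both ends are assumed to map to the same human chromosome.

```python
from dataclasses import dataclass

# Assumed values for illustration only (not taken from the paper).
EXPECTED_INSERT_BP = 40_000   # typical fosmid insert size
TOLERANCE_BP = 8_000          # allowed deviation before calling a discordance


@dataclass
class EndPair:
    left_pos: int       # human coordinate where the first fosmid end maps
    right_pos: int      # human coordinate where the second fosmid end maps
    left_strand: str    # '+' or '-'
    right_strand: str   # '+' or '-'


def classify(pair: EndPair) -> str:
    """Classify one chimp fosmid end pair placed on the human reference."""
    # Concordant ends face each other: '+' on the left, '-' on the right.
    if (pair.left_strand, pair.right_strand) != ("+", "-"):
        return "inversion_candidate"
    span = pair.right_pos - pair.left_pos
    if span > EXPECTED_INSERT_BP + TOLERANCE_BP:
        # Human has extra sequence between the ends: deleted in chimp.
        return "chimp_deletion"
    if span < EXPECTED_INSERT_BP - TOLERANCE_BP:
        # Chimp clone carries sequence missing from human: inserted in chimp.
        return "chimp_insertion"
    return "concordant"


print(classify(EndPair(100_000, 140_000, "+", "-")))
print(classify(EndPair(100_000, 190_000, "+", "-")))
```

In practice a site would only be reported when several independent clones support the same discordance; that clustering step is left out here.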
- Identification of 651 regions of putative structural variation between the human genome assembly and the single chimp individual (293 chimp deletions, 184 chimp insertions and 174 inversions/duplicative transpositions).
- Chromosome Y is the most rearranged chromosome between human and chimp (caution: repetitive regions!)
- 245 (RefSeq) genes were identified that may be affected by the structural differences between chimp and human (drug detoxification, receptors, reproduction)
(Newman et al. (2005))
At the genome level: 1) Structural variations 2) Segmental duplication
Segmental duplication (impact: 2.7 %)
- Longer than 20 kilobases (-> 300 kb), greater than 94 % sequence identity
- 33 % of human duplicated segments are human-specific
- 17 % of chimp duplications are chimp-specific
- Half of the genes in the human-specific duplicated regions exhibit significant differences in gene expression relative to chimp and are most often upregulated.
Cheng et al., Nature (2005)
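The two thresholds in the definition above (length > 20 kb, identity > 94 %) can be written as a simple filter. This is only a sketch: the function name and the example alignments are hypothetical, and a real detection pipeline involves far more than this test.

```python
MIN_LENGTH_BP = 20_000   # "longer than 20 kilobases"
MIN_IDENTITY = 0.94      # "greater than 94 % sequence identity"


def is_segmental_duplication(length_bp: int, identity: float) -> bool:
    """Apply the length and identity thresholds quoted on the slide."""
    return length_bp > MIN_LENGTH_BP and identity > MIN_IDENTITY


# Hypothetical pairwise alignments: (length in bp, fractional identity).
alignments = [(25_000, 0.97), (15_000, 0.99), (300_000, 0.93), (120_000, 0.992)]
kept = [a for a in alignments if is_segmental_duplication(*a)]
print(kept)
```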
- About 300 regions were identified where the human genome showed a significant increase in copy number compared to chimp.
- ‘Only’ 92 regions where the chimp genome showed an increase in copy number compared to human (but with a higher rate of duplication)
Cheng et al., Nature (2005)
Example: 4 human regions represented ~ 400 x in chimp genome (99.2% identity) Cheng et al., Nature (2005)
At the genome level: 1) Structural variations 2) Segmental duplication 3) Interspersed/Transposable repeats
- The human genome is composed of ~45 % interspersed elements, including:
  - Long interspersed elements (LINEs); these encode a reverse transcriptase
  - Short interspersed elements (SINEs); these include Alu repeats
- The human genome contains about 1,000,000 Alu elements.
- Alu elements are found only in primates.
Interspersed/Transposable element insertions (impact: 3 %)
- Endogenous mutagens which can alter genes, promote genomic rearrangements…
- May help to drive the speciation of organisms
- Particular interest in recently mobilized transposons
- The transposons that inserted into the human or chimp genome during the past 6 million years would be expected to be present in only one of the 2 genomes.
- ~11’000 ‘recent’ transposon copies are differentially present in human/chimp: 73 % found in human and 27 % found in chimp
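As a rough check, the rounded percentages above can be turned into approximate copy counts. The inputs are the slide's rounded figures, so the results are only indicative; the variable names are illustrative.

```python
total_insertions = 11_000   # ~11'000 species-specific copies (rounded)
human_percent = 73
chimp_percent = 27

human_copies = total_insertions * human_percent // 100
chimp_copies = total_insertions * chimp_percent // 100
excess_in_human = human_copies - chimp_copies

print(human_copies, chimp_copies, excess_in_human)
```

With these rounded inputs the human excess comes out at roughly 5,000 insertions, the same order as the "at least 4’800" figure given in the conclusions.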
Interspersed/Transposable element insertions: endogenous retroviruses (Mills et al., Am. J. Hum. Genet., 78:671-679, 2006)
Interspersed/Transposable element insertions
- Alu, L1 and SVA insertions accounted for > 95 % of the insertions in both species
- Human and chimp have amplified different subfamilies of these elements.
- SVA: composite element (1.5-2.5 kb) made of 2 Alu, a tandem repeat and a region derived from HERV-K
Humans have supported higher levels of transposition than chimps during the past several million years (but… this is not the case for the baboon, which shows an activity 1.6-fold higher than human -> general decline in Alu activity in chimp)
BLAT of human DNA vs chimp DNA: AJ271736 (Xq pseudoautosomal region)
Interspersed/Transposable element insertions
- 34 % of the insertions that occurred during the evolution of human and chimp were located within known genes
Interspersed/Transposable element insertions: conclusions
- The original set of transposons in the common ancestor of human and chimp behaved differently during the subsequent evolution of the 2 organisms
- Human received at least 4’800 additional transposon insertions compared to chimp -> the impact of transposon mutagenesis is likely to have been greatest in human during the past several million years.
- Human and chimp have amplified different subfamilies of these elements.
- Factors such as differences in population size may also have influenced the pattern of transposon insertion.
Nucleotide divergence: 1.23 %
- 14-22 % of these differences are due to polymorphism -> fixed divergence rate = ~1.06 %
- Chromosome X: ~0.94 %
- Chromosome Y: ~1.9 %
- Higher mutation rate in the male than in the female germ line (5- to 6-fold higher number of cell divisions)
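The step from total to fixed divergence is a single subtraction of the polymorphic share. The sketch below applies both ends of the 14-22 % range quoted above; the function name is illustrative.

```python
def fixed_divergence(total_percent: float, polymorphic_fraction: float) -> float:
    """Remove the share of differences that are polymorphisms, not fixed changes."""
    return total_percent * (1.0 - polymorphic_fraction)


low_polymorphism = fixed_divergence(1.23, 0.14)    # 14 % polymorphic
high_polymorphism = fixed_divergence(1.23, 0.22)   # 22 % polymorphic
print(f"{low_polymorphism:.2f} % to {high_polymorphism:.2f} %")
```

The ~1.06 % on the slide corresponds to the 14 % end of the range; taking 22 % instead would give about 0.96 %.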
At the gene level: 13’454 pairs of orthologous genes (507 Swiss-Prot, 1134 TrEMBL: 1641) (NCBI: 3111)
- 29 % are 100 % identical
- 5 % have in-frame indels (mainly in repetitive regions)
A classical measure of the overall evolutionary constraint on a gene
- K_A: non-synonymous substitution rate in coding sequence
- K_B (or K_S): synonymous substitution rate in coding sequence
- K_I: substitution rate in non-coding sequence
- K_A/K_B << 1: typical of most proteins, where change is detrimental (negative selection)
- K_A/K_B > 1: the rare proteins under positive selection
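The interpretation of the ratio can be sketched as a small classifier. Assumptions: K_A and K_B have already been estimated elsewhere, the `tolerance` band around 1 is a hypothetical choice for illustration, and the example rates are made up; a real analysis would use a statistical test rather than a fixed cut-off.

```python
def classify_selection(k_a: float, k_b: float, tolerance: float = 0.05) -> str:
    """Interpret K_A/K_B for one gene (illustrative thresholds only)."""
    if k_b <= 0:
        raise ValueError("K_B must be positive to form the ratio")
    ratio = k_a / k_b
    if ratio > 1 + tolerance:
        return "positive selection"
    if ratio < 1 - tolerance:
        return "negative (purifying) selection"
    return "approximately neutral"


# Hypothetical substitution rates per site.
print(classify_selection(0.002, 0.012))   # ratio well below 1
print(classify_selection(0.015, 0.010))   # ratio above 1
```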
- About 500 genes have a K_A/K_B > 1
- Most of the genes with a K_A/K_B > 1 are not involved in processes related to supposed humanness.
- Genes with the highest K_A/K_B ratios are mostly related to host-pathogen interaction, immunity and reproduction (a pattern also found in other mammals; cf. Valeria’s work on human/mouse orthologs)
In fact, genes related to brain function and neuronal activities show lower-than-average K_A/K_B ratios
- Neural genes, as a group, have a much lower average K_A/K_B ratio than genes expressed outside of the brain.
- Hypothesis: only a small subset of genes may be the target of positive selection, which is not visible in this type of study.
(Hill, Walsh (2005))
Example 1: FOXP2
- Gene relevant for the human ability to develop language
- Among the 5 % most conserved proteins
CC   -!- DISEASE: Defects in FOXP2 are the cause of speech-language
CC       disorder 1 (SPCH1) [MIM:602081]; also known as autosomal dominant
CC       speech and language disorder with orofacial dyspraxia. Affected
CC       individuals have a severe impairment in the selection and
CC       sequencing of fine orofacial movements, which are necessary for
CC       articulation. They also show deficits in several facets of
CC       language processing (such as the ability to break up words into
CC       their constituent phonemes) and grammatical skills.
- Extremely conserved among mammals
- Acquired 2 aa changes in the human lineage (T303N and N325S), including one potential/functional phosphorylation site (N325S)
- Estimation: fixation of these mutations occurred during the last 200’000 years of human history, concomitant with or subsequent to the emergence of anatomically modern humans.
Enard et al., Nature (2002)
BUT:
- No aa substitutions are shared between song-learning birds, vocal-learning whales, dolphins and bats, and human…
AND…
- During times of song plasticity, FoxP2 is upregulated in a striatal region essential for song learning.
- Selection acted on large non-coding regulatory regions of FoxP2 ???
- Duplication of the chromosomal region (27 genes including FoxP2) may be another cause of speech and language disturbance ???
Less-is-more hypothesis
- Loss-of-function changes (lack of body hair, preservation of juvenile traits, expansion of the cranium) could be caused by non-synonymous substitutions, indels, loss of coding regions and deletions of entire genes.
- -> 53 human genes with disruptive indels in the coding regions (compared to chimp)
Well-documented examples of human-specific pseudogenization
- MYH16, CMAH, CASP12, ELN, T2R62P (bitter taste receptor), MBL1
- Microcephalin (MCPH1)
Challenge: dating the event!
MYH16
- Myosin gene mutation (MYH16) correlates with anatomical changes in the human lineage
- Inactivated by a frameshifting mutation after the lineages leading to humans and chimpanzees diverged (~2.4 Myr ago).
- The gene is transcribed (-> the coding sequence deletion was not preceded by a mutation in a transcriptional control domain).
- Expressed only in masticatory muscles in other mammals.
- Loss of this protein isoform is associated with marked size reductions in individual muscle fibres and entire masticatory muscles.
Nature 428, 415-418 (2004)
Phylogenetic reconstruction for all human sarcomeric myosin genes (heavy chain), showing early divergence of MYH16 from others. Nature 428, 415-418 (2004)
Aligned DNA sequences for MYH16 exon 18 representing seven non-human primate species and six geographically dispersed human populations, revealing the effect of frameshift on reading frame and deduced amino acid sequence. Note stop codon at position 72−74. Nature 428, 415-418 (2004)
The findings on the age of the inactivating mutation in the MYH16 gene raise the intriguing possibility that the decrement in masticatory muscle size removed an evolutionary constraint on encephalization, as suggested by the anatomy of the muscle attachments relative to the sutures -> marked increase in cranial capacity. Nature 428, 415-418 (2004)
But: human encephalization -> obstetric constraints associated with pelvic dimensions for bipedality
Importance of the genes that control the development of brain size in mammals: ASPM, MCPH1, CASP3…
- Have undergone accelerated rates of protein evolution
- Strong positive selection at several loci
McCollum et al. (2006), J. of Human Evolution, 50, 232-236
MCPH locus (microcephalin): MCPH1 -> MCPH6
MCPH5/ASPM locus (abnormal spindle microcephaly)
- High K_A/K_B ratio
- Patients with loss of function in microcephalin have cranial capacities about 4 SD below the mean at birth and ~1/3 of the normal size as adults.
- May control the proliferation and/or differentiation of neuroblasts during neurogenesis.
- Continues its trend of adaptive evolution
- Ex: ASPM acquires an advantageous aa change every 350’000 years.
Evans et al., Nature 2005
Pseudogenization: CMAH
- Alu-mediated sequence replacement -> inactivation of the enzyme CMP-N-acetylneuraminic acid hydroxylase in human
- This mutation occurred after our last common ancestor with bonobos and chimpanzees, and before the origin of present-day humans (~2.8 mya)
- -> susceptibility or resistance to certain microbial pathogens (host receptors).
Chou et al., 2005
Pseudogenization: CASP12
- Functional gene in all mammals except human.
- Mediator of apoptosis in response to perturbed calcium homeostasis
- -> loss of this gene in mice increases resistance to amyloid-induced neuronal apoptosis (-> Alzheimer in human?)
- -> loss of this gene also seems to confer resistance to severe sepsis
Last but not least…
Comparative analysis of cancer genes in the human and chimpanzee genomes
- The incidence of cancer in non-human primates is very low.
- All examined human cancer genes (n=333) are present in chimpanzee, contain intact open reading frames and show a high degree of conservation between both species (99.38 %)
- Sequencing of the BRCA1 gene has shown an 8 kb deletion in the chimpanzee sequence that prematurely truncates the co-regulated NBR2 gene.
Puente et al. (2006)
Ex: BLAT of P53_Human vs chimp: Pro-72 is polymorphic only in human
Transcriptome evolution (and epigenetic events)
- Changes in gene usage may be a primary contributor to the differences between chimp and human brains.
- 10 % of all genes expressed in the brain differ in their expression levels between humans and chimps… but no causative connection has been found…
- Several studies, but none with convincing results
Heissig et al., 2005; Khaitovich et al., 2005
Swiss-Prot annotation
- Existence of orthologs: DEF7_PANTR MISCELLANEOUS: The human orthologous protein seems not to exist, its coding region does not have a start codon.
- Pseudogenization (when documented): T2R64_PANPA MISCELLANEOUS: The human and chimpanzee orthologous proteins do not exist, their genes are pseudogenes.