Presentation is loading. Please wait.

Presentation is loading. Please wait.

Introduction to Protein Structure

Similar presentations


Presentation on theme: "Introduction to Protein Structure"— Presentation transcript:

1 Introduction to Protein Structure
Paolo Marcatili

2 Learning Objectives Outline the basic levels of protein structure.
Evaluate the quality of protein structures Navigate a protein structure using the program PyMOL. Generate publication-ready protein figures

3 Learning Objectives Predict protein features from its sequence
Build a model by homology and predict its accuracy Refine your model Predict the effects of mutations on stability and binding affinity

4 Activity - Predict the structure of a pathogenic protein
- Predict the role of mutations in antibody affinity

5 Levels of Protein Structure
Primary to Quaternary Structure

6 Amino Acids Proteins are built from amino acids
Amino group and acid group Side chain at Ca Chiral, only one enantiomer found in proteins (L-amino acids) 20 natural amino acids Ce Sd Cg Cb C Ca N O Methionine

7 The Amino Acids Thr (T) Phe (F) Val (V) Ala (A) His (H) Arg (R)
Ser (S) Leu (L) Cys (C) Met (M) Asp (D) Lys (K) Asn (N) Ile (I) Trp (W) Gln (Q) Glu (E) Tyr (Y) Pro (P) Gly (G)

8 How to Group Them? Many features Charge +/- Polarity (polar/non-polar)
Acidic vs. basic (pKa) Polarity (polar/non-polar) Type, distribution Size Length, weight, volume, surface area Type (Aromatic/aliphatic)

9 Grouping Amino Acids Acid deriv.

10 The Simple Aliphatic Val (V) Ala (A) Leu (L) Met (M) Ile (I)

11 The Small Polar Ser (S) Cys (C) Cysteine Thr (T) Cystine

12 The Unusual P, G Also aliphatic Structural impact
Strictly speaking, proline is an imino acid Gly (G) Pro (P)

13 The Acidic and Their Derivatives
Asp (D) Asn (N) Glu (E) Gln (Q)

14 The Basic Arg (R) Lys (K) His (H)

15 The Aromatic Phe (F) His (H) Trp (W) Tyr (Y)

16 Hypothesis i) Structure <=> Function
ii) A protein has a single stable structure iii) This structure is determined by the sequence alone iv) The protein can reach this fold autonomously (Anfinsen)

17 Hypothesis i) Structure <=> Function convergent and divergent evolution ii) A protein has a single stable structure Metastability, serpins, prions iii) This structure is determined by the sequence alone PTMs, epigenetics iv) The protein can reach this fold autonomously (Anfinsen) Enzymes, Chaperones, boiled eggs

18 Native State The proteins reach the conformation in which they minimize their Free Energy

19 Native State The proteins reach the conformation in which they minimize their Free Energy They just visit random conformations until they find the best one?

20 Native State The proteins reach the conformation in which they minimize their Free Energy They just visit random conformations until they find the best one? NO! it would take more than the age of universe (Levinthal's Paradox)

21 Folding Funnel

22 Proteins Are Polypeptides
The peptide bond A polypeptide chain

23 Ramachandran Plot Allowed backbone torsion angles in proteins N H

24 Structure Levels Primary structure = Sequence (of amino acids)
Secondary Structure = Helix, sheets/strands, bends, loops & turns (all defined by H-bond pattern in backbone) Structural Motif = Small, recurrent arrangement of secondary structure, e.g. Helix-loop-helix Beta hairpins EF hand (calcium binding motif) Many others… Tertiary structure = Arrangement of Secondary structure elements within one protein chain MSSVLLGHIKKLEMGHS…

25 Quaternary Structure Assembly of monomers/subunits into protein complex Backbone-backbone, backbone-side-chain & side-chain-side-chain interactions: Intramolecular vs. intermolecular contacts. For ligand binding side chains may or may not contribute. For the latter, mutations have little effect. Myoglobin Haemoglobin a a b

26 Hydrophobic Core Hydrophobic side chains go into the core of the molecule – but the main chain is highly polar. The polar groups (C=O and NH) are neutralized through formation of H-bonds. Myoglobin Surface Interior

27 Hydrophobic vs. Hydrophilic
Globular protein (in solution) Membrane protein (in membrane) Myoglobin Aquaporin

28 Hydrophobic vs. Hydrophilic
Globular protein (in solution) Membrane protein (in membrane) Cross-section Cross-section Myoglobin Aquaporin

29 Helix Types

30 b-Sheets Multiple strands  sheet Flexibility
Parallel vs. antiparallel Twist Flexibility Vs. helices Folding Structure propagation (amyloids) Other… Thioredoxin

31 b-Sheets Multiple strands  sheet Flexibility
Parallel vs. antiparallel Twist Flexibility Vs. helices Folding Structure propagation (amyloids) Other…

32 b-Sheets Multiple strands  sheet Flexibility
Parallel vs. antiparallel Twist Flexibility Vs. helices Folding Structure propagation (amyloids) Other…

33 b-Sheets Multiple strands  sheet Flexibility
Parallel vs. antiparallel Twist Flexibility Vs. helices Folding Structure propagation (amyloids) Other…

34 b-Sheets Multiple strands  sheet Strand interactions are non-local
Parallel vs. antiparallel Twist Strand interactions are non-local Flexibility Vs. helices Folding Antiparallel Parallel

35 Turns, Loops & Bends Between helices and sheets On protein surface
Intrinsically “unstructured” proteins

36 Summary The backbone of polypeptides form regular secondary structures. Helices, sheets, turns, bends & loops. These are the result of local as well as non-local interactions. Secondary structure elements are associated with specific residue patterns.

37 evolution, structure or function… ?
Domain Definition(s) From wikipedia protein domain is a part of protein sequence and structure that can evolve, function, and exist independently of the rest of the protein chain. evolution, structure or function… ?

38 evolution, structure or function… ?
Domain Definition(s) From wikipedia protein domain is a part of protein sequence and structure that can evolve, function, and exist independently of the rest of the protein chain. evolution, structure or function… ? Working definition: A structural domain is a compact, globular sub-structure with more interactions within it than with the rest of the protein.

39 SCOP & CATH Similar definitions, but
SCOP ~ human (good accuracy, less systematic, less data) CATH ~ automatic (more data, more systematic)

40 SCOP All alpha proteins (151) All beta proteins (111)
Alpha and beta proteins (a/b) (117) Mainly parallel beta sheets (beta-alpha-beta units) Alpha and beta proteins (a+b) (212) Mainly antiparallel beta sheets (segregated alpha and beta regions)

41 Fold: Major structural similarity
Proteins are defined as having a common fold if they have the same major secondary structures in the same arrangement and with the same topological connections. Different proteins with the same fold often have peripheral elements of secondary structure and turn regions that differ in size and conformation. In some cases, these differing peripheral regions may comprise half the structure. Proteins placed together in the same fold category may not have a common evolutionary origin: the structural similarities could arise just from the physics and chemistry of proteins favoring certain packing arrangements and chain topologies.

42 Superfamily: Probable common evolutionary origin
Proteins that have low sequence identities, but whose structural and functional features suggest that a common evolutionary origin is probable are placed together in superfamilies. For example, actin, the ATPase domain of the heat shock protein, and hexokinase together form a superfamily.

43 Family: Clear evolutionarily relationship
Proteins clustered together into families are clearly evolutionarily related. Generally, this means that pairwise residue identities between the proteins are 30% and greater. However, in some cases similar functions and structures provide definitive evidence of common descent in the absense of high sequence identity; for example, many globins form a family though some members have sequence identities of only 15%.

44 Some Folds

45 CATH hierarchy Class Architecture Topology Homologous superfamily

46 Pfam If we dont have structure, how can we know
domain arrangment or protein family of our target?

47 How to detect domains in a protein?
SCOP / CATH PFAM transmembrane regions Blocks in the MSA low-complexity regions

48 Protein Data Bank http://www.rcsb.org/ Contents File structure
Types of structures Structure reports & summaries Quality check Searching Molecule of the Month

49 PDB Growth

50 PDB File Fields COLUMNS DATA TYPE FIELD DEFINITION – 6 Record name "ATOM" Integer serial Atom serial number. 13 – 16 Atom name Atom name. 17 Character altLoc Alternate location indicator. 18 – 20 Residue name resName Residue name. 22 Character chainid Chain identifier. 23 – 26 Integer resSeq Residue sequence number. 27 AChar iCode Code for insertion of residues. 31 – 38 Real(8.3) x Orthogonal coordinates for X in Angstroms 39 – 46 Real(8.3) y Orthogonal coordinates for Y in Angstroms 47 – 54 Real(8.3) z Orthogonal coordinates for Z in Angstroms 55 – 60 Real(6.2) occupancy Occupancy. 61 – 66 Real(6.2) tempFactor Temperature factor. 77 – 78 LString(2) element Element symbol, right-justified. 79 – 80 LString(2) charge Charge on the atom.


Download ppt "Introduction to Protein Structure"

Similar presentations


Ads by Google