Why bacteria run Linux while eukaryotes run Windows? Sergei Maslov Brookhaven National Laboratory New York.

1 Why bacteria run Linux while eukaryotes run Windows? Sergei Maslov Brookhaven National Laboratory New York

2 Physical vs. Biological Laws. Physical laws are often discovered by finding a simple common explanation for very different phenomena. Newton's law: apples fall to the ground; planets revolve around the Sun. The discovery of biological laws is slowed down by our having a cookie-cutter explanation in terms of natural selection.

3 Drawing from Facebook group: Trust me, I'm a "Biologist"

4 Genes encoded in bacterial genomes ~ Packages installed on Linux computers

5 Complex systems have many components: genes (bacteria), software packages (Linux OS). Components do not work alone: they need to be assembled to work. In individual systems only a subset of components is installed: a genome (bacteria) is a collection of genes; a computer (Linux OS) is a collection of software packages. Components have vastly different frequencies of installation.

6 IKEA kits have many components. Image credit: Justin Pollard

7 They need to be assembled to work. Image credit: Justin Pollard

8 Different frequencies of use: common vs. rare

9 What determines the frequency of installation/use of a gene/package? Popularity (AKA preferential attachment): frequency ~ self-amplifying popularity. Relevant for social systems: WWW links, Facebook friendships, scientific citations. Functional role: frequency ~ breadth or importance of the functional role. Relevant for biological and technological systems, where selection adjusts undeserved popularity.

10 Empirical data on component frequencies. Bacterial genomes: 500 sequenced prokaryotic genomes, 44,000 orthologous gene families. Linux packages: 200,000 Linux packages installed on 2,000,000 individual computers. Binary tables: a component is either present or not in a given system.
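
The binary presence/absence table described on this slide can be turned into component frequencies with a few lines of code. The following is a minimal sketch, not the authors' pipeline: the function name `component_frequencies` and the toy table are hypothetical, and frequency is assumed to mean the fraction of systems (genomes or computers) that contain a component.

```python
def component_frequencies(table):
    """Return, for each component (column), the fraction of systems (rows)
    in which it is present.

    `table` is a list of rows; each row is a list of 0/1 flags, one per
    component. This layout is an assumption made for illustration.
    """
    n_systems = len(table)
    n_components = len(table[0])
    return [
        sum(row[j] for row in table) / n_systems
        for j in range(n_components)
    ]


# Toy example: 4 systems (rows) and 4 components (columns).
toy_table = [
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 1],
]
frequencies = component_frequencies(toy_table)
```

In the toy table the first component is present in every system (a "universal" component with f = 1), while the last two each appear in a single system.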

11 Frequency distributions: $P(f) \sim f^{-1.5}$, except the top $\sqrt{N}$ "universal" components with $f \sim 1$. Figure labels: Cloud, Shell, Core, ORFans. TY Pang, S. Maslov, PNAS (2013)

12 How to quantify functional importance? We want to check that frequency ~ importance. Usefulness = importance ~ the component is needed for proper functioning of other components. Dependency network: A → B means A depends on B for its function. Formalized for Linux software packages; for metabolic enzymes, given by upstream-downstream positions in pathways. Frequency ~ dependency degree, $K_{dep}$, where $K_{dep}$ = the total number of components that directly or indirectly depend on the selected one.
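
The definition of the dependency degree can be made concrete with a small graph traversal. This is a sketch under stated assumptions: the network is given as a dictionary mapping each component to the components it directly depends on, and the function name `dependency_degrees` and the package names in the example are hypothetical.

```python
from collections import defaultdict


def dependency_degrees(depends_on):
    """Compute K_dep for every component.

    `depends_on[a]` lists the components that `a` directly depends on
    (edges A -> B in the slide's notation). K_dep of a component is the
    number of other components that depend on it directly or indirectly.
    """
    # Reverse the edges: dependents[b] = components that directly depend on b.
    dependents = defaultdict(set)
    nodes = set(depends_on)
    for a, deps in depends_on.items():
        for b in deps:
            dependents[b].add(a)
            nodes.add(b)

    k_dep = {}
    for node in nodes:
        # Depth-first search over reversed edges collects all dependents.
        seen = set()
        stack = [node]
        while stack:
            current = stack.pop()
            for d in dependents[current]:
                if d not in seen:
                    seen.add(d)
                    stack.append(d)
        seen.discard(node)  # do not count the node itself if cycles exist
        k_dep[node] = len(seen)
    return k_dep


# Toy example: an application depends on two libraries, one of which
# depends on the other.
toy_network = {
    "app": ["libssl", "libc"],
    "libssl": ["libc"],
    "libc": [],
}
toy_k_dep = dependency_degrees(toy_network)
```

Here `libc` ends up with the largest dependency degree because both other components need it, directly or through `libssl`.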

13 TY Pang, S. Maslov, PNAS (2013)

14 Correlation coefficient ~0.4 for both Linux and genes. Could be improved by using a weighted dependency degree. Frequency is positively correlated with functional importance. TY Pang, S. Maslov, PNAS (2013)

15 Warm-up: tree-like metabolic network. Figure labels: $K_{dep} = 5$, $K_{dep} = 15$, TCA cycle. TY Pang, S. Maslov, PNAS (2013)

16 Dependency degree distribution on a critical branching tree: $P(K) \sim K^{-1.5}$. Paradox: $K_{max}^{-0.5} \sim 1/N \Rightarrow K_{max} = N^{2} > N$. Answer: the parent tree size imposes a cutoff: there will be $\sqrt{N}$ "core" nodes with $K_{max} = N$ present in almost all systems (ribosomal genes or core metabolic enzymes). Need a new model: in a tree $D = 1$, while in real systems $D \sim 2 > 1$.
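
The cutoff argument on this slide can be written out as a short scaling estimate. This is a sketch, assuming $N$ is the total number of components in the network and that the exponent 1.5 holds all the way up to the cutoff.

```latex
\begin{align*}
P(K) \sim K^{-1.5}
  &\;\Rightarrow\; \Pr(K' \ge K) \sim K^{-0.5}, \\
N \, K_{\max}^{-0.5} \sim 1
  &\;\Rightarrow\; K_{\max} \sim N^{2} > N
  \quad \text{(impossible, since } K \le N \text{)}, \\
N \cdot N^{-0.5} &= \sqrt{N}
  \quad \text{(expected number of nodes pushed to the cutoff } K \approx N \text{)}.
\end{align*}
```

The last line is where the $\sqrt{N}$ count of "core" nodes comes from under these assumptions: it is the number of nodes that the unbounded power law would place at or beyond $K = N$.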

17 Bottom-down model of dependency network evolution. Components are added gradually over evolutionary time. A new component directly depends on D previously existing components selected randomly. Versions: D is drawn from some distribution (same as above); recent components are preferentially selected (citations); there is a fixed probability to connect to any previously existing component (food webs).
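
The basic version of the growth rule on this slide can be simulated directly. The following is a minimal sketch, not the authors' code: the function names, the single seed component, and the rule that the earliest components link to all available predecessors when fewer than D exist are assumptions made for illustration.

```python
import random


def grow_dependency_network(n, d=2, seed=0):
    """Add components one at a time; each new component directly depends
    on `d` distinct, uniformly chosen earlier components.

    Assumptions (not from the slides): the network starts from one seed
    component (index 0), and a component with fewer than `d` predecessors
    depends on all of them.
    """
    rng = random.Random(seed)
    depends_on = {0: []}
    for new in range(1, n):
        depends_on[new] = rng.sample(range(new), min(d, new))
    return depends_on


def k_dep_by_age(depends_on):
    """Return a list whose i-th entry is K_dep of component i: the number of
    later components that depend on it directly or indirectly."""
    n = len(depends_on)
    direct = [[] for _ in range(n)]  # direct[b] = direct dependents of b
    for a, deps in depends_on.items():
        for b in deps:
            direct[b].append(a)
    # Dependents are always newer, so process from newest to oldest and
    # reuse the already completed sets of the direct dependents.
    reach = [set() for _ in range(n)]
    for b in range(n - 1, -1, -1):
        for a in direct[b]:
            reach[b].add(a)
            reach[b] |= reach[a]
    return [len(r) for r in reach]


network = grow_dependency_network(200, d=2, seed=1)
k_dep = k_dep_by_age(network)
```

Because every component depends on at least one older one, the seed component ends up with every other component among its dependents, while the most recently added component has none.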

18 p(t,T): probability that a component added at time T directly or indirectly depends on one added at time t

20 $K_{dep}$ and $K_{out}$ degree distributions

21 $K_{dep}$ decreases with layer number. Panels: Linux; model with D = 2. TY Pang, S. Maslov, PNAS (2013)

22 Zipf plot for $K_{dep}$ distributions. Panels: metabolic enzymes vs. model; Linux vs. model. TY Pang, S. Maslov, PNAS (2013)

23 Frequency distributions: $P(f) \sim f^{-1.5}$, except the top $\sqrt{N}$ "universal" components with $f \sim 1$. Figure labels: Shell, Core, ORFans, Cloud. TY Pang, S. Maslov, PNAS (2013)

24 What experiments does P(f) help to interpret?

25 Pan-genome of E. coli strains. M Touchon et al., PLoS Genetics (2009)

26 Metagenomes. The Human Microbiome Project Consortium, Nature (2012)

27 Pan-genome scaling

28 Pan-genome of all bacteria. Slope = -0.4, compared with the prediction of the toolbox model (-0.5). P. Lapierre, JP Gogarten, TIG (2009). (# of genes in pan-genome) ~ (# of sequenced genomes)$^{0.5}$ $\Rightarrow$ (# of new genes added to pan-genome) ~ (# of sequenced genomes)$^{-0.5}$
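
The two quantities in this scaling relation can be measured from a set of genomes by accumulating gene families one genome at a time. This is a minimal sketch assuming each genome is given as a set of gene-family identifiers; the function name `pan_genome_curve` and the toy genomes are hypothetical. Note that if the pan-genome size grows as $n^{0.5}$, its increment per added genome scales as $n^{-0.5}$, which is why the two exponents on the slide go together.

```python
def pan_genome_curve(genomes):
    """For genomes added in the given order, return two lists:
    the pan-genome size after each genome, and the number of new gene
    families contributed by each genome."""
    seen = set()
    sizes = []
    new_counts = []
    for genes in genomes:
        new = set(genes) - seen  # families not observed in earlier genomes
        seen |= new
        new_counts.append(len(new))
        sizes.append(len(seen))
    return sizes, new_counts


# Toy example with four genomes sharing a core family "a".
toy_genomes = [
    {"a", "b", "c"},
    {"a", "b", "d"},
    {"a", "c", "d"},
    {"a", "b", "e"},
]
sizes, new_counts = pan_genome_curve(toy_genomes)
```

With real data the curve depends on the order in which genomes are added, so in practice one would likely average over many random orderings before fitting a slope on a log-log plot.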

29 Bacterial genome evolution happens in cooperation with phages

30 Comparative genomics of E. coli implicates phages for BitTorrent. Figure labels: phage capacity: 20 kb; other strains: up to 40 kb; K-12 to B comparison; 1 kb: gene length.

31 Phage-Bacteria Infection Network. Data from Flores et al. (2011); experiments by Moebus, Nattkemper (1981). WWW from AT&T website circa 1996, visualized by Mark Newman.

32 Why do eukaryotes run Windows? Dependency network = reuse of components. Bacteria do not keep redundant genes after HGT; Linux developers rely on previous efforts. Pros: smaller genomes, open source, economies of scale. Cons: less specialized, potentially unstable, "dependency hell". Eukaryotes are like Windows or Mac OS X: they keep redundant components (proprietary software).

33 Figure adapted from S. Maslov, TY Pang, K. Sneppen, S. Krishna, PNAS (2009). Axis labels: # of genes; # of pathways (or their regulators).

34 Software packages for Linux: $N_{\text{selected packages}} \sim N_{\text{installed packages}}^{1.7}$

35 Collaborators: Tin Yau Pang, Stony Brook University. Support: Office of Biological and Environmental Research.

36 Thank you!
