Deep-Time Data Infrastructure: A DCO Legacy Program Robert M. Hazen—Geophysical Lab, Carnegie Institution DCO Data Science Day—RPI—June 5, 2014.

1 Deep-Time Data Infrastructure: A DCO Legacy Program Robert M. Hazen—Geophysical Lab, Carnegie Institution DCO Data Science Day—RPI—June 5, 2014

2 Conclusions: Vast, largely untapped data resources inform our view of Earth’s dynamic history over 4.5 billion years. Combining those deep-time data resources into a single infrastructure represents an opportunity for accelerated “abductive” discovery.

3 Deep-Time Data Collaborators. Carnegie Institution: Robert Hazen, Xiaoming Liu, Anat Shahar; Rutgers: Paul Falkowski; RPI: Peter Fox; Univ. of Arizona: Robert Downs, Mihei Ducea, Grethe Hystad, Barbara Lafuente, Hexiong Yang, Alex Pires, Joaquin Ruiz, Joshua Golden, Melissa McMillan, Shaunna Morrison; CalTech: Ralph Milliken; Univ. of Maine: Edward Grew; Smithsonian Inst.: Timothy McCoy; Univ. of Manitoba: Andrey Bekker; MINDAT.ORG: Jolyon Ralph; Colorado State: Holly Stein, Aaron Zimmerman; Univ. of Tennessee: Linda Kah; Univ College London: Dominic Papineau; George Mason Univ.: Stephen Elmore; Johns Hopkins Univ.: Dimitri Sverjensky, Charlene Estrada, John Ferry, Namhey Lee; Harvard University: Andrew Knoll; Indiana University: David Bish; Univ. of Michigan: Rodney Ewing; Univ. of Maryland: James Farquhar, John Nance; Univ. of Wisconsin: John Valley; Geol. Survey Canada: Wouter Bleeker

4 Deep-Time Data Resources Mineralogy and petrology data: Mineral species and assemblages Compositions (including isotopes) Age (ages) Geographic location; tectonic setting Crystal size; morphology; twinning Solid and fluid inclusions; defects; Magnetic domains; zoning; exsolution Surface properties; grain boundaries

5 Mineralogy and petrology data Paleobiology data Fossil species and assemblages Age Biominerals; isotopic composition Molecular biomarkers Host lithology Geological/tectonic context Deep-Time Data Resources

6 Mineralogy and petrology data Paleobiology data Proteomics data Enzyme structure and function Age (from phylogenetics) Active site composition Microbial context Deep-Time Data Resources

7 Mineralogy and petrology data Paleobiology data Proteomics data Geochemistry data and modeling Thermochemical data Equilibrium and reaction path models Deep-Time Data Resources

8 Mineralogy and petrology data Paleobiology data Proteomics data Geochemistry data and modeling Paleotectonic & Paleomagnetic Data Age Deep-Time Data Resources

9 This is the IMA Mineral Database website, with a direct link to the Mineral Evolution Database.

10 This map displays the localities. The popup demonstrates metadata for a given locality.

11 The Premise: Rocks, minerals, fossils, and life’s biochemistry hold clues to significant changes in Earth’s near-surface environment through 4.5 billion years of history. The Potential of Deep-Time Data

12 The Rise of Atmospheric Oxygen Lyons et al. (2014) Nature 506, D.E.Canfield (2014) Oxygen. Princeton Univ. Press

13 The Rise of Atmospheric Oxygen Kump (2008) Nature 451, ?

14 The Rise of Atmospheric Oxygen D.E.Canfield (2014) Oxygen. Princeton Univ. Press. Lyons et al. (2014) Nature 506,

15 = Major metal element = Major non-metal element = Trace element The Rise of Oxygen: Evidence from redox-sensitive elements

16 log fO2 ~ -72 Geochemical modeling is key. The Rise of Subsurface Oxygen

17 Siderite FeCO3 log fO2 < -68 The Rise of Subsurface Oxygen

18 Azurite & Malachite log fO2 > -43 The Rise of Subsurface Oxygen

19 Reaction path calculations reveal changes in mineralogy as fluids and rocks not in equilibrium react with each other. Data from Sverjensky et al. (in prep) The Rise of Subsurface Oxygen: Basalt weathering before/after the GOE

20 Reaction path calculations reveal changes in mineralogy as fluids and rocks not in equilibrium react with each other. Data from Sverjensky et al. (in prep) The Rise of Subsurface Oxygen: Basalt weathering before/after the GOE

21 What minerals won’t form before the Great Oxidation Event? 598 of 643 Cu minerals 202 of 220 U minerals 319 of 451 Mn minerals 47 of 56 Ni minerals 582 of 790 Fe minerals Piemontite Garnierite Xanthoxenite Chrysocolla

22 Co-evolution of the geosphere and biosphere Biologically mediated changes in Earth’s atmospheric composition at ~2.4 to 2.2 Ga represent the single most significant factor in Earth’s mineralogical diversity.

23 Enzymes reveal Earth’s geochemical history. Ferredoxin (before the GOE)

24 Nitrogenase (after the GOE) Enzymes reveal Earth’s geochemical history.

25 The Rise of Subsurface Oxygen

26 Golden et al. (2013), EPSL GOE HERE SE HERE The Rise of Subsurface Oxygen

27 Kump (2008) Nature 451, The Rise of Subsurface Oxygen

28 Hypothesis: There was a protracted “Great Subsurface Oxidation Interval” that postdated the GOE by a billion years. This interval was the single most significant factor in Earth’s mineralogical diversification. The Rise of Subsurface Oxygen

29 Data-Driven Discovery: Most of what scientists do most of the time is start with a known phenomenon, then collect relevant data and develop explanatory hypotheses.

30 Deduction: Earth’s atmospheric oxidation influenced the partitioning of redox-sensitive elements. Mo, Re, Ni, and Co are redox-sensitive elements. Therefore, we deduce that atmospheric oxidation influenced the partitioning of Mo, Re, Ni, and Co.

31 RESULTS: Molybdenite (MoS2) through Time GOE HERE SE HERE Golden et al. (2013) EPSL 366:1-5.

32 RESULTS: Cu/Ni in carbonates vs. time SE HERE GOE HERE Xiaoming Liu et al. (2013)

33 Induction: Each of the last 5 supercontinent cycles led to episodes of enhanced mineralization during intervals of continental convergence. Mo, Be, B, and Hg are mineral-forming elements. Therefore, we predict by induction that Mo, Be, B, and Hg minerals will display enhanced mineralization during intervals of continental convergence.

34 The Supercontinent Cycle

35 SUPERCONTINENT | STAGE | INTERVAL | DURATION. Kenorland (Superia): Assembly, Stable, Breakup. Columbia (Nuna): Assembly, Stable, Breakup. Rodinia: Assembly, Stable, Breakup. Pannotia: Assembly, Stable, Breakup. Pangaea: Assembly, Stable, Breakup (interval 0.175-present; duration 175).

36 RESULTS: The Supercontinent Cycle. The distribution of zircon crystals through time correlates with the supercontinent cycle over the past 3 billion years. (Condie & Aster 2010; Hawkesworth et al. 2010)

37 RESULTS: Mo Mineral Evolution Temporal distribution of molybdenite (MoS2) Golden et al. (2013) EPSL 366:1-5.

38 Hg Mineral Evolution: The distribution of mercury (Hg) minerals through time correlates with the supercontinent (SC) cycle over the past 3 billion years, but there is a gap during Rodinia assembly. Hazen et al. (2012) Amer. Mineral. 97:1013.

39 Abduction: Abduction is a form of logical inference that goes from reliable data (i.e., observations) to a hypothesis that seeks to explain those data. (Paraphrased from Wikipedia)

40 Abduction: Observations lead to new hypotheses. We have vast amounts of data on mineral species, compositions, isotopes, petrologic context, thermochemical parameters, tectonic settings, and the co-evolving biosphere through deep time. Previously unrecognized patterns and correlations will emerge from the integration and evaluation of those data.

41 THE CHALLENGE: Recognizing statistically meaningful patterns in large data resources: 1. Correlations among many variables Data-Driven Discovery

42 Data-Driven Discovery: Large integrated data resources can be explored with multivariate techniques (e.g., principal component analysis), which search for highly correlated patterns among linear combinations of many different variables.
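
As a rough illustration of the multivariate idea on this slide, the sketch below finds the first principal component of a small table by power iteration on the correlation matrix. The data are synthetic and the function name `first_principal_component` is hypothetical; nothing here comes from the talk's actual datasets, and a real analysis would more likely use a numerical library.

```python
import math
import random


def first_principal_component(rows, iterations=200, seed=0):
    """Return (unit loading vector, fraction of variance explained) for PC1."""
    n, p = len(rows), len(rows[0])
    # Standardize every variable to zero mean and unit variance.
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    stds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / (n - 1))
            for j in range(p)]
    z = [[(r[j] - means[j]) / stds[j] for j in range(p)] for r in rows]
    # Correlation matrix of the standardized variables.
    corr = [[sum(zr[a] * zr[b] for zr in z) / (n - 1) for b in range(p)]
            for a in range(p)]
    # Power iteration from a random start converges to the leading eigenvector.
    rng = random.Random(seed)
    v = [rng.uniform(0.5, 1.0) for _ in range(p)]
    for _ in range(iterations):
        w = [sum(corr[a][b] * v[b] for b in range(p)) for a in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient gives the eigenvalue; the trace of corr equals p.
    eigenvalue = sum(v[a] * sum(corr[a][b] * v[b] for b in range(p))
                     for a in range(p))
    return v, eigenvalue / p


# Synthetic table (assumption): two variables share a latent factor, one is noise.
data_rng = random.Random(42)
rows = []
for _ in range(200):
    factor = data_rng.gauss(0.0, 1.0)
    rows.append([factor + 0.1 * data_rng.gauss(0.0, 1.0),
                 2.0 * factor + 0.1 * data_rng.gauss(0.0, 1.0),
                 data_rng.gauss(0.0, 1.0)])

loadings, explained = first_principal_component(rows)
print("PC1 loadings:", [round(x, 2) for x in loadings])
print("fraction of variance explained:", round(explained, 2))
```

The two correlated variables should dominate the loadings, which is the kind of "linear combination of many variables" the slide refers to.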

43 THE CHALLENGE: Recognizing statistically meaningful patterns in large data resources: 2. Meaningful trends in data vs. time Data-Driven Discovery

44 RESULTS: Molybdenite (MoS2) through Time Golden et al. (2013) EPSL 366: molybdenite samples

45 Are these trends statistically significant? Analyze equal-sized bins. Apply statistical tests: linear regression of log Re content vs. time (Montgomery et al. 2006).
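
The two steps named on this slide can be sketched as follows, under stated assumptions: "equal-sized" is read here as an equal number of samples per bin, the regression is ordinary least squares with a t-statistic on the slope, and the (age, log Re) pairs are synthetic. The helper names are hypothetical and are not taken from Montgomery et al. or from the talk.

```python
import math
import random


def equal_count_bins(pairs, n_bins):
    """Sort (age, value) pairs by age and split them into equal-count bins.

    Any remainder samples beyond n_bins * size are dropped.
    """
    ordered = sorted(pairs)
    size = len(ordered) // n_bins
    return [ordered[i * size:(i + 1) * size] for i in range(n_bins)]


def ols_fit(xs, ys):
    """Return slope, intercept, slope standard error, and slope t-statistic."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    rss = sum((y - intercept - slope * x) ** 2 for x, y in zip(xs, ys))
    slope_se = math.sqrt(rss / (n - 2) / sxx)
    return slope, intercept, slope_se, slope / slope_se


# Synthetic samples (assumption): log Re drifts linearly with age plus noise.
rng = random.Random(1)
samples = []
for _ in range(120):
    age = rng.uniform(0.0, 3000.0)  # age in Ma, illustrative only
    samples.append((age, 2.0 - 0.0005 * age + rng.gauss(0.0, 0.3)))

bins = equal_count_bins(samples, 4)
bin_means = [sum(value for _, value in b) / len(b) for b in bins]
slope, intercept, slope_se, t_stat = ols_fit([a for a, _ in samples],
                                             [v for _, v in samples])
print("bin means:", [round(m, 2) for m in bin_means])
print("slope:", slope, "t-statistic:", round(t_stat, 1))
```

A slope whose t-statistic is large in magnitude relative to the chosen significance threshold would count as a statistically meaningful trend.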

46 THE CHALLENGE: Recognizing statistically meaningful patterns in large data resources: 3. Peak-to-noise problem Data-Driven Discovery

47 Peaks in ages of ~40,000 zircon crystals Condie & Aster (2010) Precambrian Research 180:

48 Monte Carlo Mean Kernel Density Analysis
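
The slide gives only the name of the technique, so the following is one plausible reading rather than the procedure actually used: each age is perturbed within an assumed analytical uncertainty, a Gaussian kernel density estimate is computed on a fixed grid, and the densities are averaged over many trials. The ages, uncertainties, bandwidth, and function names are all hypothetical.

```python
import math
import random


def gaussian_kde(ages, grid, bandwidth):
    """Gaussian kernel density estimate of ages, evaluated at each grid point."""
    norm = 1.0 / (len(ages) * bandwidth * math.sqrt(2.0 * math.pi))
    return [norm * sum(math.exp(-0.5 * ((g - a) / bandwidth) ** 2) for a in ages)
            for g in grid]


def monte_carlo_mean_kde(ages, sigmas, grid, bandwidth, trials, rng):
    """Average the KDE over trials in which each age is redrawn within its error."""
    total = [0.0] * len(grid)
    for _ in range(trials):
        perturbed = [rng.gauss(a, s) for a, s in zip(ages, sigmas)]
        for i, density in enumerate(gaussian_kde(perturbed, grid, bandwidth)):
            total[i] += density
    return [t / trials for t in total]


# Synthetic ages in Ma (assumption): two clusters standing in for age peaks.
rng = random.Random(7)
ages = ([rng.gauss(1000.0, 40.0) for _ in range(100)]
        + [rng.gauss(2700.0, 40.0) for _ in range(100)])
sigmas = [20.0] * len(ages)  # assumed 1-sigma uncertainty per age
grid = list(range(0, 4001, 50))

mean_density = monte_carlo_mean_kde(ages, sigmas, grid,
                                    bandwidth=50.0, trials=20, rng=rng)
peak_age = grid[mean_density.index(max(mean_density))]
print("highest mean density near", peak_age, "Ma")
```

Averaging over perturbed replicates is meant to damp peaks that depend on a few imprecise ages, which is the peak-to-noise concern raised on the preceding slides.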

49 THE CHALLENGE: Recognizing statistically meaningful patterns in large data resources: 4. Visualization opportunities Data-Driven Discovery

50 Element abundances versus numbers of mineral species (Hazen, Grew, Downs et al.) Why Do We See the Minerals We See? Too few species: Ga, Rb, Hf Too many species: As, Hg, Sb, U

51 Island area versus numbers of biological species (MacArthur and Wilson, 1967) Why Do We See the Minerals We See?

52 What percentage of minerals incorporating element X, also incorporates element Y? (Hazen, Fox, Downs et al.) Cobalt minerals that also incorporate arsenic Why Do We See the Minerals We See?
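
The question on this slide reduces to a conditional percentage over a mineral-to-elements table. The sketch below uses a handful of well-known minerals as a tiny illustrative subset, not the database behind the talk, and the function name `percent_also_containing` is hypothetical.

```python
# Tiny illustrative subset of minerals and their essential elements
# (assumption: this is not the database used in the talk).
MINERALS = {
    "cobaltite": {"Co", "As", "S"},
    "erythrite": {"Co", "As", "O", "H"},
    "skutterudite": {"Co", "As"},
    "linnaeite": {"Co", "S"},
    "spherocobaltite": {"Co", "C", "O"},
    "molybdenite": {"Mo", "S"},
    "arsenopyrite": {"Fe", "As", "S"},
}


def percent_also_containing(minerals, x, y):
    """Percentage of minerals containing element x that also contain element y."""
    with_x = [elements for elements in minerals.values() if x in elements]
    if not with_x:
        return None  # no mineral in the table contains x
    both = sum(1 for elements in with_x if y in elements)
    return 100.0 * both / len(with_x)


co_with_as = percent_also_containing(MINERALS, "Co", "As")
as_with_co = percent_also_containing(MINERALS, "As", "Co")
print("Co minerals that also contain As:", co_with_as, "%")
print("As minerals that also contain Co:", as_with_co, "%")
```

The measure is asymmetric: the share of Co minerals containing As generally differs from the share of As minerals containing Co, so a full element-by-element matrix carries more information than a single correlation value.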

53 Frequency distributions of 4933 mineral species: 22% of mineral species are known from only one locality. Why Do We See the Minerals We See?

54 Frequency distributions of 4933 mineral species: 22% of mineral species are known from only one locality. Therefore: (1)Numerous additional minerals exist on Earth but as yet remain undescribed. (2) Numerous other plausible minerals do not now exist on Earth, but might have in the past, or might occur on other Earth-like planets. (3) If we “played the tape over again,” then the first 4933 minerals to be found would likely differ by ~1000 mineral species. Why Do We See the Minerals We See?

55 Conclusions: Vast, largely untapped data resources inform our view of Earth’s dynamic history over 4.5 billion years. Combining those deep-time data resources into a single infrastructure represents an opportunity for accelerated “abductive” discovery.

56 CONCLUSIONS We are poised to make fundamental discoveries about our planetary home through development, integration, and exploration of deep-time data resources. Data-Driven Discovery

57 Please join this effort: Archive your data. Release “dark data.” Help us build this resource.

58 Are these trends statistically significant? Statistical tests: linear regression of log Re content vs. time (Montgomery et al. 2006): $\log(\mathrm{Re}) = \beta_0 + \beta_1 t + \beta_2 x_2 + \beta_3 x_3 + \beta_4 x_4 + \beta_5 x_5 + \beta_6 x_6$ [$t$ = time; $\beta_i$ = regression parameters; $x_i$ = indicator variables]. $\beta_0 = 0$; $\beta_1 = 0.0059(8)$; $\beta_2 = 4.6(7)$; $\beta_3 = 12(2)$; $\beta_4 = 15(2)$; $\beta_5 = 18(2)$; $\beta_6 = 19(2)$.
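
A regression of this shape, with one continuous time term plus 0/1 indicator variables, can be fit by least squares on a design matrix. The slide does not say what the indicator variables encode, so the sketch below assumes they flag membership in hypothetical sample groups, uses only two indicators instead of five, and runs on synthetic data. It does not attempt to reproduce the coefficients quoted on the slide.

```python
import random


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def fit_least_squares(design, y):
    """Least-squares coefficients from the normal equations (X^T X) beta = X^T y."""
    p = len(design[0])
    xtx = [[sum(row[a] * row[b] for row in design) for b in range(p)]
           for a in range(p)]
    xty = [sum(row[a] * yi for row, yi in zip(design, y)) for a in range(p)]
    return solve_linear_system(xtx, xty)


# Synthetic data (assumption): three hypothetical groups; group 0 is the baseline.
true_beta = [1.0, 0.5, 2.0, -1.0]  # intercept, time, indicator x2, indicator x3
rng = random.Random(3)
design, log_re = [], []
for i in range(90):
    t = rng.uniform(0.0, 3.0)  # time, arbitrary units
    group = i % 3
    row = [1.0, t, 1.0 if group == 1 else 0.0, 1.0 if group == 2 else 0.0]
    design.append(row)
    log_re.append(sum(b * x for b, x in zip(true_beta, row)) + rng.gauss(0.0, 0.05))

beta_hat = fit_least_squares(design, log_re)
print("estimated coefficients:", [round(b, 2) for b in beta_hat])
```

Each indicator coefficient is the offset of its group relative to the baseline group at the same time value, which is how such a model separates a shared time trend from group-level shifts.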

59 Enzymes reveal Earth’s geochemical history. David & Alm (2011) “Rapid evolutionary innovation during an Archean genetic expansion.” Nature 469,93-96.

