CIBIO/InBIO, IICT. Miguel Porto, Pedro Beja, Rui Figueira.

• Ecological systems are highly intricate networks: every species may be related, in different ways, to every other species
• Much of the knowledge on ecological networks is at the conceptual level, not at the factual level
Changes in the abundance of any one species may affect, directly or indirectly, N other species

• Biodiversity data has traditionally been based on species occurrences
• Biodiversity databases, as a norm, fail to document relations
• Ecological relations, like species, have a spatial and temporal dimension, i.e. they occur

Why not store relationship occurrences rather than species occurrences? (The former includes the latter, anyway.)
Example: Cytinus ruber parasitizing rockrose at ºN ºW
It might look like a detail, but it makes a huge difference in the amount of fundamental ecological information that is recorded.
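
The claim that a relationship occurrence already contains the species occurrences can be sketched as follows. This is only an illustration, not the project's data model: the class name, its fields and the helper function are hypothetical, the output keys borrow Darwin Core term names, and the coordinates are placeholders because the values on the slide are not recoverable.

```python
from typing import NamedTuple


class RelationshipOccurrence(NamedTuple):
    """One observed relation between two organisms (hypothetical structure)."""
    subject: str             # organism performing the relation
    relation: str            # e.g. "parasitizing"
    obj: str                 # organism receiving the relation
    decimal_latitude: float
    decimal_longitude: float


def species_occurrences(rel):
    """Return the two species occurrences implied by one relationship occurrence."""
    return [
        {
            "scientificName": name,
            "decimalLatitude": rel.decimal_latitude,
            "decimalLongitude": rel.decimal_longitude,
        }
        for name in (rel.subject, rel.obj)
    ]


# Placeholder coordinates (0.0, 0.0): NOT the values from the slide.
# "rockrose" is kept as the vernacular name used on the slide.
record = RelationshipOccurrence("Cytinus ruber", "parasitizing", "rockrose", 0.0, 0.0)
occurrences = species_occurrences(record)
```

One relationship record yields two species occurrence records, while the reverse derivation is not possible: two co-located species occurrences say nothing about how the species interact.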

• An infrastructure to store and manage occurrences of ecological relations that:
  ▪ is connected bidirectionally to existing species occurrence databases
  ▪ strictly follows the data standards for existing types of data (e.g. Darwin Core for species occurrence data)
  ▪ proposes new standards for describing ecological/biological relationship data
  ▪ provides an array of relationship-based web services to allow interoperability with existing platforms

• Raw data: published relationship data comes in an immense array of formats and with varying levels of detail and aggregation
  ▪ Orobanche gracilis parasitizing Retama sphaerocarpa
  ▪ Blackbird feeding on the fruits of ivy in Portugal
  ▪ Fish of genus Barbus feeding on filamentous algae in Tagus river, in Spring
  ▪ Beetle pollinating an unidentified red flower at ºN ºW in May-2007
Needed: a data model able to accommodate all kinds of raw data without loss of information or generalization

• Computational resources: for example, a small country like Portugal may have ca species which can all potentially interact
• relations are directional and may be of several types and subtypes:
  ▪ Feeding on
  ▪ Parasitizing
  ▪ Dispersing
  ▪ Pollinating
  ▪ Co-occurring
  ▪ ...
• … and relations may have different weights/strengths (e.g. species A is more frequently found feeding on species B than on C)
• … and occur at different places and dates
• … and pertain to different body parts
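
A minimal sketch of how directed, typed relations and their weights might be represented, assuming that the weight of a relation is simply the number of times it was observed. The observations, the function names and the species count used in the usage note are made up for illustration; the slides do not specify any of them.

```python
from collections import Counter

# Hypothetical observations as (source, relation type, target).
# Direction matters: "A feeding on B" is not "B feeding on A".
observations = [
    ("species A", "feeding on", "species B"),
    ("species A", "feeding on", "species B"),
    ("species A", "feeding on", "species B"),
    ("species A", "feeding on", "species C"),
    ("species D", "pollinating", "species B"),
]


def edge_weights(obs):
    """Count how often each directed, typed relation was observed."""
    return Counter(obs)


def potential_directed_pairs(n_species):
    """Ordered pairs of distinct species: an upper bound per relation type."""
    return n_species * (n_species - 1)


weights = edge_weights(observations)
```

Here species A is recorded feeding on B three times and on C once, so the A-to-B edge carries the larger weight. `potential_directed_pairs` shows why the scale is a concern: the number of possible directed pairs grows quadratically with the number of species (4 species already give 12 ordered pairs), and that bound applies to each relation type separately.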

 Computational resources  The network easily attains great complexity because ▪ different data facets are covered – taxonomy, morphology, ontology,... ▪ data is highly structured in each facet ▪ relationship occurrences are stored – not “conceptual” relationships – which leads to large amounts of data accumulating over time  but it needs to be efficiently traversed and summarized in an intelligible and meaningful way

Aim: to build a virtual lab infrastructure for
• storing ecological relationship data
• conducting network-based analyses
• testing ecological network-based hypotheses

• Data is either compiled from published studies or from direct observation (citizen science platform)
• Provides services and interfaces for querying, visualizing, summarizing and analyzing the network

• Highly flexible as to the nature of underlying data:
  ▪ relations may be solely “conceptual” without further details, as obtained from classical bibliography (e.g. species A parasitizes species B), but this is far from ideal
  ▪ relations may have precise geographical coordinates and timestamp
  ▪ relations may connect any two entities of any taxonomic rank (e.g. species, genus, family, order...)
  ▪ relations may connect entities which are not necessarily taxonomic (e.g. arbitrary trait-based entities)
  ▪ relations may refer to precise organs, structures or life stages (e.g. caterpillar feeding on the leaves of species A)
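
The flexibility listed above could be approximated with a record type in which everything beyond the two entities and the relation type is optional. This is a hedged sketch: the class names `Entity` and `Relation`, all field names and the `is_conceptual` helper are hypothetical and not taken from the presentation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Entity:
    """One end of a relation: a taxon of any rank or a trait-based entity."""
    name: str
    kind: str = "taxon"          # "taxon" or "trait"
    rank: Optional[str] = None   # e.g. "species", "genus"; None if not taxonomic
    part: Optional[str] = None   # organ, structure or life stage, if known


@dataclass(frozen=True)
class Relation:
    """A directed relation; place and time are optional."""
    source: Entity
    relation: str
    target: Entity
    latitude: Optional[float] = None
    longitude: Optional[float] = None
    locality: Optional[str] = None
    date: Optional[str] = None   # free text, e.g. "Spring" or "2007-05"

    def is_conceptual(self):
        """True when no place or time is attached to the relation."""
        return (self.latitude is None and self.longitude is None
                and self.locality is None and self.date is None)


# Purely conceptual relation, as found in classical bibliography.
conceptual = Relation(
    Entity("Orobanche gracilis", rank="species"),
    "parasitizing",
    Entity("Retama sphaerocarpa", rank="species"),
)

# Genus-level source, trait-based target, coarse place and season.
barbus = Relation(
    Entity("Barbus", rank="genus"),
    "feeding on",
    Entity("filamentous algae", kind="trait"),
    locality="Tagus river",
    date="Spring",
)

# Life stage and organ recorded on either end.
caterpillar = Relation(
    Entity("unidentified Lepidoptera", part="caterpillar"),
    "feeding on",
    Entity("species A", rank="species", part="leaves"),
)
```

Because missing details are stored as `None` rather than guessed, a record keeps exactly the level of detail of its source, and analyses can filter on it (for instance, excluding conceptual relations from spatial analyses).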

• A virtual lab infrastructure for conducting ecological network-based analyses
  ▪ Analyze the spatial patterns of relations and their relationship with environmental drivers, and predict network-level changes upon environmental change
  ▪ Infer functional relationship patterns from documented relationship occurrences
  ▪ Predict the n-th order impacts of removing nodes (e.g. species) on the integrity of ecological networks
  ▪ Test the ecological significance of observed relationship patterns using simulated random networks
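
As an illustration of the "n-th order impacts" item, the sketch below uses a breadth-first search with a depth limit to find which nodes could be reached, and at what order, when one node is removed. It is a simplification under stated assumptions: the `affects` mapping (node to the nodes that directly depend on it) and the toy network are hypothetical, and the search only follows topology, ignoring relation types, weights and any actual population dynamics.

```python
from collections import deque


def removal_impacts(affects, removed, max_order):
    """Map each potentially affected node to the order (1, 2, ...) of the impact.

    affects: dict mapping a node to the nodes directly affected by its removal.
    """
    order = {removed: 0}
    queue = deque([removed])
    while queue:
        node = queue.popleft()
        if order[node] == max_order:
            continue  # do not expand beyond the requested order
        for neighbour in affects.get(node, ()):
            if neighbour not in order:
                order[neighbour] = order[node] + 1
                queue.append(neighbour)
    del order[removed]  # the removed node itself is not an "impact"
    return order


# Hypothetical toy network.
affects = {
    "plant": ["herbivore", "pollinator"],
    "herbivore": ["predator"],
    "predator": ["parasite"],
}
impacts = removal_impacts(affects, "plant", max_order=2)
```

With `max_order=2`, removing the plant flags the herbivore and pollinator as first-order impacts and the predator as second-order, while the parasite (third order) is left out.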