User scenario on Marine Biodiversity: AquaMaps
Pasquale Pagano, National Research Council (CNR) - ISTI, Italy


Marine Biodiversity AquaMaps
The AquaMaps VRE is a virtual environment providing a set of services for the generation, standardized dissemination, and mapped visualization of model-based, large-scale predictions of the currently known occurrence of marine species.

Standardized range maps of marine species

Ecological Niche Modelling
- Extrapolates known species occurrence data to determine environmental envelopes (species tolerances)
- Predicts future distributions by matching species tolerances against local environmental conditions (e.g. under climate change and sea pollution)

AquaMaps*
- Maps large-scale species distributions based on existing but fragmented and potentially non-representative occurrence data
- Uses knowledge of the geographic extents of commercial species available from FAO
- Uses information about habitat usage available from online species databases

* Initially defined to predict global distributions of marine mammals (by Kaschner et al.) and then generalised to marine species.

AquaMaps is the only species distribution modelling approach that combines numerical algorithms with expert knowledge.

Species Range Maps Production Workflow
Output: a color-coded species range map on a grid of half-degree latitude and longitude cells.
Steps:
1. Defining Environmental Envelopes
2. Generating Species Occurrence Probability
3. Plotting Range Maps
[Workflow diagram. Data sets shown: Biological Species, Good Cells, HCAF, HSPEN, HSPEC]
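The first two workflow steps can be illustrated with a minimal Python sketch. It rests on assumptions the slides do not confirm: HCAF is treated as a table of environmental conditions per cell, HSPEN as the per-species envelopes, and HSPEC as the per-cell probabilities. The trapezoidal envelope (absolute and preferred ranges taken from deciles) and the product of per-parameter scores are illustrative choices, and all function names and sample values are hypothetical.

```python
from statistics import quantiles


def build_envelope(values):
    """Derive (min, preferred_min, preferred_max, max) from observed values.

    Assumption: the preferred range spans the 10th to 90th percentile.
    """
    ordered = sorted(values)
    deciles = quantiles(ordered, n=10, method="inclusive")
    return (ordered[0], deciles[0], deciles[-1], ordered[-1])


def suitability(value, envelope):
    """Trapezoidal score: 1 inside the preferred range, 0 outside the envelope."""
    lo, pref_lo, pref_hi, hi = envelope
    if value < lo or value > hi:
        return 0.0
    if pref_lo <= value <= pref_hi:
        return 1.0
    if value < pref_lo:
        return (value - lo) / (pref_lo - lo)
    return (hi - value) / (hi - pref_hi)


def define_envelopes(hcaf, good_cells, parameters):
    """Step 1: build one envelope per parameter from the species' good cells."""
    return {
        name: build_envelope([hcaf[cell][name] for cell in good_cells])
        for name in parameters
    }


def predict(hcaf, hspen):
    """Step 2: per-cell probability as the product of per-parameter scores."""
    hspec = {}
    for cell_id, conditions in hcaf.items():
        probability = 1.0
        for name, envelope in hspen.items():
            probability *= suitability(conditions[name], envelope)
        hspec[cell_id] = probability
    return hspec


# Hypothetical cell table: cell id -> environmental conditions.
hcaf = {
    "c1": {"temperature": 10.0, "depth": 20.0},
    "c2": {"temperature": 14.0, "depth": 40.0},
    "c3": {"temperature": 18.0, "depth": 60.0},
    "c4": {"temperature": 30.0, "depth": 500.0},
}
good_cells = ["c1", "c2", "c3"]  # cells with trusted occurrence records
hspen = define_envelopes(hcaf, good_cells, ["temperature", "depth"])
hspec = predict(hcaf, hspen)
```

Step 3 (plotting) would then color each cell by its value in `hspec`. In this sample, cell `c4` lies outside both envelopes and receives probability 0.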

Species Range Maps Complexity
- 0.5° latitude and longitude cells (approx. 35 km² at the equator), > 170k marine cells

Large volume of input and output data
- Fewer than 7,000 species:
  - native range = 56,468,301
  - suitable range = 114,989,360
- Estimation for 50,000 species:
  - native range = 350,000,000
  - suitable range = 715,000,000
[Eli E. Agbayani, FishBase Project/INCOFISH]

Very large number of computations
- One multispecies map computed on 3.5% of the marine areas and 25% of the species requires 125 million computations
- One global map (extended to all species and marine cells around the world) requires about 400 billion computations
[N. Bailly, WorldFish Center]

AquaMaps is a production environment
- Produces range maps, multispecies maps, plus a variety of thematic maps: taxonomic, climatologic, invasiveness, …
- Allows researchers to evaluate the impact of climate change (e.g. 2050)
- Distributes maps via different services
  - FishBase, the most widely used biological information system, with over one million visitors per month
  - Species information systems such as SeaLifeBase and GBIF

Current Limitations
- To evaluate marine biodiversity changes in the presence of human-caused disasters
- To provide useful information to mitigate the impact of natural disasters on marine biodiversity
- 0.5° latitude and longitude cells -> increase the precision

Increasing the precision means …
- Fewer than 13,000 species
- Increase resolution 0.5 deg -> 0.1 deg
  - Environment DBs => 4.5 million rows
  - Iterations => 55 billion iterations
  - Species Prediction DBs => 2.5 billion rows
- Scientists may tweak parameters
  - Species Prediction up to 55 billion rows (870 Gb)
- Roadmap to increase the number of species 13,000 -> 50,000
  - 4 times more rows
=> Change the current technologies
  - Relational databases
  - Execution models
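The orders of magnitude on this slide can be checked with a short calculation. The sketch below assumes that the number of cells grows with the square of the linear refinement and that one iteration corresponds to one (cell, species) pair. Neither assumption is stated on the slide, but both are consistent with its figures. The function names are hypothetical.

```python
COARSE_CELLS_PER_DEGREE = 2   # 0.5 deg cells
FINE_CELLS_PER_DEGREE = 10    # 0.1 deg cells


def cell_scale_factor(coarse_per_degree, fine_per_degree):
    """Cells multiply by the square of the linear refinement (lat x lon)."""
    linear = fine_per_degree / coarse_per_degree
    return linear ** 2


def iterations(environment_rows, species):
    """Assumption: one iteration per (cell, species) pair."""
    return environment_rows * species


factor = cell_scale_factor(COARSE_CELLS_PER_DEGREE, FINE_CELLS_PER_DEGREE)

# ">170k marine cells" at 0.5 deg gives a lower bound at 0.1 deg.
fine_cells_lower_bound = 170_000 * factor

# Upper bound, since the slide says "fewer than 13,000 species".
iterations_upper_bound = iterations(4_500_000, 13_000)

# Growth in rows when moving from 13,000 to 50,000 species.
species_growth = 50_000 / 13_000

print(f"cell scale factor: {factor:.0f}x")
print(f"fine cells (lower bound): {fine_cells_lower_bound:,.0f}")
print(f"iterations (upper bound): {iterations_upper_bound:,}")
print(f"species growth: {species_growth:.2f}x")
```

Under these assumptions the refinement multiplies the cell count by 25, which gives at least 4.25 million cells against the 4.5 million rows quoted. The iteration bound of 58.5 billion is compatible with the quoted 55 billion, and the species growth of about 3.85 matches the "4 times more rows" estimate.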

Expected benefits of VENUS-C

Execution models
- Sequential
- Multithreaded
- Batch
+ MapReduce: the overall problem is split into smaller chunks which can be processed in parallel. All partial solutions are then combined / reduced into the overall result
+ COMPSs: the COMPSs runtime is responsible for processing a single application as remote tasks, i.e. checking their data dependencies and scheduling their concurrent execution on distributed parallel resources
+ Generic Worker: distributes the load over multiple machines

IO
- Relational
+ Hadoop VFS (local fs, http, ftp, s3, kfs, hdfs)
+ Cassandra
+ CDMI
+ Elasticity Management
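The MapReduce idea described above (split into chunks, process each chunk independently, combine the partial results) can be sketched in plain Python. This is not Hadoop or VENUS-C code. The example task, a multispecies richness count per cell, along with the 0.5 probability threshold, the sample data, and the function names, are all hypothetical illustrations.

```python
from collections import Counter
from functools import reduce


def chunked(items, size):
    """Split a list into consecutive chunks of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def map_chunk(species_chunk, threshold=0.5):
    """Map step: count, per cell, the species in this chunk predicted present."""
    partial = Counter()
    for cell_probabilities in species_chunk:
        for cell_id, probability in cell_probabilities.items():
            if probability >= threshold:
                partial[cell_id] += 1
    return partial


def reduce_partials(left, right):
    """Reduce step: merge two partial richness counts."""
    return left + right


def richness_map(all_species, chunk_size=2):
    """Run map over each chunk, then reduce the partial results into one map.

    The built-in map() runs sequentially here; on a cluster each chunk
    would be handled by a separate worker.
    """
    partials = map(map_chunk, chunked(all_species, chunk_size))
    return reduce(reduce_partials, partials, Counter())


# Hypothetical per-species predictions: cell id -> probability.
species_predictions = [
    {"A": 0.9, "B": 0.2},
    {"A": 0.6, "B": 0.7},
    {"A": 0.1, "C": 0.8},
]
richness = richness_map(species_predictions)
```

Because the reduce step is an associative sum, the result does not depend on how the species are chunked, which is what allows the chunks to be processed in parallel.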

Conclusions
VENUS-C makes it possible to
- Perform analyses that otherwise could not be undertaken
- Increase the efficiency of research
- Share results within a community in real time
- Maximize the data production
- Accelerate the research