Cyberinfrastructure. Geoffrey Fox, Indiana University, with Linda Hayden, Elizabeth City State University. April 5, 2011, virtual meeting.

Presentation transcript:

Cyberinfrastructure. Geoffrey Fox, Indiana University, with Linda Hayden, Elizabeth City State University. April 5, 2011, virtual meeting

Cyberinfrastructure
Supports the expeditions with a lightweight field system – hardware and system support
Then perform offline processing at Kansas, Indiana, and ECSU
– Indiana and ECSU facilities and initial field work funded by the NSF PolarGrid MRI, which is now (essentially) completed
Initial basic processing to Level 1B
Extension to L3 with image processing and a data exploration environment
Data are archived at NSIDC
Prasad Gogineni: "With the on-site processing capabilities provided by PolarGrid, we are able to quickly identify Radio Frequency Interference (RFI) related problems and develop appropriate mitigation techniques. Also, the on-site processing capability allows us to process and post data to our website within 24 hours after a flight is completed. This enables scientific and technical personnel in the continental United States to evaluate the results and provide the field team with near real-time feedback on the quality of the data. The review of results also allows us to re-plan and re-fly critical areas of interest in a timely manner."

IU Field Support Efforts 2010
OIB Greenland 2010
– RAID-based data backup solution
– Second server to handle processing needs
– Over 50 TB collected on-site
– Copying to the Data Capacitor completed at IU in Feb 2011
OIB Punta Arenas 2010
– 20 TB using the same backup solution

IU Field Support, Spring 2011
OIB and Twin Otter flights simultaneously, two engineers in the field
The most equipment IU has sent to the field in any season:
– Processing and data transfer server at each site
– Two arrays at each field site
Largest set of data capture/backup jobs yet between CReSIS/IU

Field Equipment in Detail
OIB Thule: 3 2U, 8-core servers, 3 24 TB SATA arrays, 12 cases of TB drives
Ilulissat: 2 2U, 8-core servers, 2 24 TB SATA arrays, 6 cases of TB drives
2010 Chile: 3 2U, 8-core servers, 3 24 TB SATA arrays, 6 cases of drives
2010 Thule-to-Kanger: 1 2U, 8-core server, 1 24 TB SATA array, 6 cases of drives in Thule, 5 in Kanger
Drives in Thule-to-Kanger were re-used drives from earlier Antarctic work; 3 cases failed in Thule.
Note: 100 drives have failed in total so far (it's harsh out there).

IU Lower 48 Support
2010 data now on the Data Capacitor
Able to route around local issues if necessary by substituting other local hardware temporarily
Turnaround/management of IU affiliate accounts for CReSIS researchers and students
Some tuning of Crevasse (major PolarGrid system at IU) nodes for better job execution/turnaround complete

Education and Cyberinfrastructure

Summer 2010 Cyberinfrastructure REU
Joyce Bevins – Data Point Visualization and Clustering Analysis. Mentors: Jong Youl Choi, Ruan Yang, and Seung-Hee Bae (IUB)
Jean Bevins – Creating a Security Model for the SALSA HPC Portal. Mentors: Adam Hughes, Saliya Ekanayake (IUB)
JerNettie Burney and Nadirah Cogbill – Evaluation of Cloud Storage for Preservation and Distribution of Polar Data. Mentors: Marlon Pierce, Yu (Marie) Ma, Xiaoming Gao, and Jun Wang (IUB)
Constance Williams – Health Data Analysis. Mentor: Jong Youl Choi (IUB)
Robyn Evans and Michael Austin – Visualization of Ice Sheet Elevation Data Using Google Earth & Python Plotting Libraries. Mentors: Marlon Pierce, Yu (Marie) Ma, Xiaoming Gao, and Jun Wang (IUB)

Academic Year Student Projects
A Comparison of Job Duration Utilizing High Performance Computing on a Distributed Grid. Members: Michael Austin, JerNettie Burney, and Robyn Evans. Mentor: Je'aime Powell
Research and Implementation of Data Submission Technologies in Support of CReSIS Polar and Cyberinfrastructure Research Projects at Elizabeth City State University. Team Members: Nadirah Cogbill, Matravia Seymore. Team Mentor: Jeff Wood, with mentors Xiaoming Gao, Yu "Marie" Ma, Marlon Pierce, and Jun Wang at IU
A Study on the Viability of Hadoop Usage on the Umfort Cluster for the Processing and Storage of CReSIS Polar Data. Members: JerNettie Burney, Glenn Koch, Jean Bevins, Cedric Hall. Mentor: Je'aime Powell

Other Education Activities
Two ADMI faculty, one graduate student, and one undergraduate student participated in the Cloud Computing Conference CloudCom2010 in Indianapolis, December 2010
Fox presented at the ADMI Cloud Computing workshop for faculty in December
Jerome Mitchell (IU PhD, ECSU UG, Kansas Masters) will describe A Cloudy View on Computing Workshop at ECSU, June 2011

Supporting Higher Level Data Products
Image processing
Data browsing portal from cloud
Standalone data access in the field
Visualization

Hidden Markov Method Based Layer Finding
P. Felzenszwalb and O. Veksler, "Tiered Scene Labeling with Dynamic Programming," IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2010
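For illustration only, the sketch below shows the general flavor of dynamic-programming layer tracing in an echogram: pick one depth bin per along-track column that is bright and changes smoothly between columns. This is a minimal assumption-based example in the spirit of the cited work, not the CReSIS or Felzenszwalb/Veksler implementation; the cost terms and weights are made up.

import numpy as np

def trace_layer(echogram, smooth_weight=1.0):
    """echogram: 2D array (rows = depth bins, columns = along-track traces)."""
    n_rows, n_cols = echogram.shape
    unary = -echogram                      # prefer bright pixels (lower cost)
    rows = np.arange(n_rows)
    jump = np.abs(rows[:, None] - rows[None, :])   # penalty for depth jumps
    cost = np.empty((n_rows, n_cols))
    back = np.zeros((n_rows, n_cols), dtype=int)
    cost[:, 0] = unary[:, 0]
    for c in range(1, n_cols):
        # For each candidate row, find the cheapest previous row plus jump penalty.
        total = cost[:, c - 1][None, :] + smooth_weight * jump
        back[:, c] = np.argmin(total, axis=1)
        cost[:, c] = unary[:, c] + total[rows, back[:, c]]
    # Backtrack the cheapest path from the last column.
    layer = np.zeros(n_cols, dtype=int)
    layer[-1] = int(np.argmin(cost[:, -1]))
    for c in range(n_cols - 1, 0, -1):
        layer[c - 1] = back[layer[c], c]
    return layer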

Current CReSIS Data Organization
The data are organized by season. Seasons are broken into data segments, which are contiguous blocks of data where the radar parameters do not change. Data segments are broken into frames (typically 50 km in length).
Associated data for each frame are stored in different file formats: CSV (flight path), MAT (depth sounder data), and PDF (image products).
The CReSIS data products website lists direct download links for individual files.
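As a rough sketch of working with that season/segment/frame hierarchy, the snippet below groups per-frame files by base name. The directory layout and file naming here are assumptions for illustration, not the actual CReSIS archive structure.

from pathlib import Path

def index_season(season_dir):
    """Group frame files (CSV flight path, MAT depth-sounder data, PDF images) by base name."""
    frames = {}
    for segment in sorted(Path(season_dir).iterdir()):   # hypothetical segment folders
        if not segment.is_dir():
            continue
        for f in segment.iterdir():
            if f.suffix.lower() in {".csv", ".mat", ".pdf"}:
                frames.setdefault(f.stem, []).append(f)   # assumes a shared frame base name
    return frames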

PolarGrid Data Browser Goals
Organize the data files by their spatial attributes
Support multiple protocols for different user groups, such as a KML service and direct spatial database access
Support efficient access methods in different computing and network environments
– Cloud and field (standalone) versions
Support high-level spatial analysis functions powered by a spatial database

PolarGrid Data Browser Architecture
Two main components: a cloud distribution service and a special service for the PolarGrid field crew. Data synchronization is supported among multiple spatial databases.
[Architecture diagram: cloud access – Google Earth, Matlab/GIS, and a data portal reach the GIS cloud service (GeoServer plus spatial database, backed by a virtual storage service) via WMS and KML; field access – a SpatiaLite/SQLite database serves a single user, and a spatial database virtual appliance serves multiple users on a local network.]
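One plausible way to move data between the two components is to export a subset of the cloud spatial database into a SpatiaLite file for the field appliance with GDAL's ogr2ogr, as sketched below. The connection string, table name, and output file are placeholders, and this is not necessarily how the project's synchronization works.

import subprocess

# Export one layer from a (hypothetical) PostGIS database into a SpatiaLite file.
subprocess.run([
    "ogr2ogr", "-f", "SQLite", "-dsco", "SPATIALITE=YES",
    "field_subset.sqlite",
    "PG:host=cloud-db dbname=polargrid user=reader",
    "flight_frames",
], check=True)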

PolarGrid Data Browser: Cloud GIS Distribution Service
Google Earth example: 2009 Antarctica season
Left image: overview of 2009 flight paths
Right image: data access for a single frame

Technologies in the Cloud GIS Distribution Service
The geospatial server is based on GeoServer and PostgreSQL (spatial database), configured inside an Ubuntu virtual machine.
A virtual storage service attaches terabyte-scale storage to the virtual machine.
The Web Map Service (WMS) protocol enables users to access the original data set from Matlab and GIS software.
KML distribution is aimed at general users.
The data portal is built with Google Maps and can be embedded into any website.
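To show what WMS access looks like in practice, here is a minimal GetMap request in Python. The host name and layer name ("polargrid:flightlines") are placeholders, not the project's published endpoint or layers; the WMS parameters themselves are the standard ones.

import requests

params = {
    "service": "WMS", "version": "1.1.1", "request": "GetMap",
    "layers": "polargrid:flightlines", "styles": "",
    "bbox": "-180,-90,180,-60", "srs": "EPSG:4326",
    "width": 1024, "height": 512, "format": "image/png",
}
# Fetch a rendered map image from the (hypothetical) GeoServer WMS endpoint.
resp = requests.get("http://example-polargrid-server/geoserver/wms", params=params)
with open("flightlines.png", "wb") as out:
    out.write(resp.content)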

PolarGrid Data Distribution on Google Earth
Processed in the cloud using MapReduce
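The snippet below sketches the kind of per-frame KML a bulk processing job could emit for Google Earth. The coordinates, frame id, and product link are made-up examples; this is not the project's actual MapReduce output format.

def frame_placemark(frame_id, lon, lat, product_url):
    """Return one KML Placemark linking a frame location to its data products."""
    return (
        "<Placemark>"
        f"<name>{frame_id}</name>"
        f"<description><![CDATA[<a href='{product_url}'>products</a>]]></description>"
        f"<Point><coordinates>{lon},{lat},0</coordinates></Point>"
        "</Placemark>"
    )

kml = (
    '<?xml version="1.0" encoding="UTF-8"?>'
    '<kml xmlns="http://www.opengis.net/kml/2.2"><Document>'
    + frame_placemark("Data_20091102_01_005", -68.5, -71.2, "http://example.org/frame/005")
    + "</Document></kml>"
)
with open("frames.kml", "w") as f:
    f.write(kml)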

PolarGrid Field Access Service
The field crew has limited computing resources and internet connectivity.
The essential data set is downloaded from the cloud GIS distribution service and packed as a spatial database virtual appliance with SpatiaLite. The whole system can be carried around on a USB flash drive.
The virtual appliance is built on Ubuntu JeOS (just enough operating system); it has almost identical functions to the GIS cloud service and works on a local network with VirtualBox. The virtual appliance runs with 256 MB of virtual memory.
The SpatiaLite database is a lightweight spatial database based on SQLite. It aims at a single user:
– the data can be accessed through GIS software, and a native API for Matlab has also been developed.
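For a sense of single-user access to such a SpatiaLite file from Python, here is a small query sketch. The database file name, table, and column names are assumptions; the extension module name may also differ by platform and SpatiaLite version.

import sqlite3

conn = sqlite3.connect("polargrid_field.sqlite")
conn.enable_load_extension(True)
conn.load_extension("mod_spatialite")   # assumes the SpatiaLite extension is installed
# Find frames whose bounding box intersects a (hypothetical) area of interest.
cur = conn.execute(
    "SELECT frame_id, AsText(geometry) "
    "FROM flight_frames "
    "WHERE MbrIntersects(geometry, BuildMbr(-70, -75, -60, -70))"
)
for frame_id, wkt in cur.fetchall():
    print(frame_id, wkt)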

PolarGrid Field Access Service
SpatiaLite data access with the Quantum GIS interface
Left image: 2009 Antarctica season vector data, originally stored in 828 separate files
Right image: visual crossover analysis for quality control (work in progress)

Use of Tiled Screens

URL References
CReSIS data products:
GeoServer:
PostgreSQL:
VirtualBox:
SpatiaLite:
Quantum GIS: