BXGrid: A Data Repository and Computing Grid for Biometrics Research
Hoang Bui
University of Notre Dame

Presentation transcript:


Overview
Biometrics Research
What is BXGrid?
BXGrid & Condor
Future Work
Questions

Biometric Research
Facial recognition
Iris recognition

Acquisition process
Computer Vision Research Laboratory


Biometric Research
Now what?
– I have collected 100,000 irises.
– I have an algorithm to compare two irises.
– I want to evaluate my algorithm by comparing only brown irises.
– First, I need to convert the raw iris images to iris codes.
– But I need to find all the brown irises.

BXGrid
How do I search for brown irises fast?
Where do I store iris images?
How do I evaluate my algorithm?
[Diagram labels: DBMS; Relational Database (2x); Active Storage Cluster (16x); CPU; Condor Pool (500x)]
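The first question on this slide, finding brown irises quickly, is the kind of lookup a relational database handles well. The sketch below is only an illustration using Python's built-in sqlite3 module; the table name `irises`, its columns, and the sample rows are hypothetical and are not taken from BXGrid's actual schema.

```python
import sqlite3

# Hypothetical schema: one metadata row per stored iris image.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE irises (fileid INTEGER PRIMARY KEY, subject TEXT, eye TEXT, color TEXT)"
)
conn.executemany(
    "INSERT INTO irises (fileid, subject, eye, color) VALUES (?, ?, ?, ?)",
    [
        (1, "S1", "L", "brown"),
        (2, "S1", "R", "brown"),
        (3, "S2", "L", "blue"),
        (4, "S3", "R", "brown"),
    ],
)
# An index on the searched attribute lets the database avoid scanning every row.
conn.execute("CREATE INDEX idx_irises_color ON irises (color)")


def select_by_color(connection, color):
    """Return the file ids of all irises with the given eye color."""
    rows = connection.execute(
        "SELECT fileid FROM irises WHERE color = ? ORDER BY fileid", (color,)
    )
    return [row[0] for row in rows]


brown_ids = select_by_color(conn, "brown")
print(brown_ids)
```

Only the metadata lives in the database here; the image files themselves would be fetched by id from wherever they are stored.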


Workflow Abstractions
S = Select( color=“brown” )
B = Transform( S, F )
M = AllPairs( A, B, F )
[Diagram labels: Lbrown, Lblue, Rbrown, R; eyecolor; S1, S2, S3; F; A1, A2, A3; B1, B2, B3; ROC Curve]
Bui, Thomas, Kelly, Lyon, Flynn, Thain. BXGrid: A Repository and Experimental Abstraction… Poster at IEEE eScience.
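The three abstractions named on this slide chain together naturally. The following is a minimal single-machine sketch of that chain, not BXGrid's implementation: the record fields, the `to_code` stand-in for iris-code generation, and the `distance` comparison are all hypothetical, and the sketch compares the transformed set against itself because the slide does not say where set A comes from.

```python
def select(records, **criteria):
    """Keep the records whose metadata matches every given criterion."""
    return [r for r in records if all(r.get(k) == v for k, v in criteria.items())]


def transform(items, f):
    """Apply f to every item of a set, producing a new set."""
    return [f(item) for item in items]


def all_pairs(a, b, f):
    """Build the matrix M where M[i][j] = f(a[i], b[j])."""
    return [[f(x, y) for y in b] for x in a]


# Hypothetical toy records; real raw iris images are far larger.
records = [
    {"id": "S1-L", "color": "brown", "raw": "0110"},
    {"id": "S1-R", "color": "brown", "raw": "0111"},
    {"id": "S2-L", "color": "blue", "raw": "1000"},
]


def to_code(record):
    """Stand-in for the raw-image-to-iris-code conversion."""
    return record["raw"]


def distance(x, y):
    """Fraction of differing positions between two equal-length codes."""
    return sum(c1 != c2 for c1, c2 in zip(x, y)) / len(x)


S = select(records, color="brown")
B = transform(S, to_code)
M = all_pairs(B, B, distance)  # assumption: A is the same set as B
print(M)
```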

Transform Abstraction
B = Transform( S, F ): transform set S into set B using function F
Single PC and 100,000 iris images
– Core 2 Duo 1.8 GHz PC with 1 GB RAM
– 6 seconds/transform → about 170 hours
– Storage: 30 GB
Let’s use Condor
You want to:
– Do it faster
– Manage resources properly
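The single-PC estimate follows directly from the two figures on the slide. A quick check, using only those numbers:

```python
images = 100_000           # iris images to transform (from the slide)
seconds_per_transform = 6  # single-PC cost per image (from the slide)

total_seconds = images * seconds_per_transform
total_hours = total_seconds / 3600

print(total_seconds)          # 600000
print(round(total_hours, 1))  # 166.7, which the slide rounds to 170 hours
```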

[Diagram labels: User; Local Machine; Condor pool with jobs J1, J2, J3, ..., JN; Fileservers]

[Diagram labels: User; Local Machine; Condor pool with jobs J1, J2, J3, ..., JN; Fileservers; Wait(); J2; JN+1]
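One reading of the Wait() and JN+1 labels in this diagram is that the local machine keeps a fixed number of jobs in flight and submits the next one whenever a running job finishes. That reading is an assumption, since the slide gives no text. The sketch below illustrates the pattern with local threads standing in for Condor jobs; it does not use any Condor API, and `run_bounded` and `max_in_flight` are hypothetical names.

```python
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait


def run_bounded(tasks, func, max_in_flight):
    """Run func over tasks, keeping at most max_in_flight jobs active at once."""
    results = {}
    pending = {}  # future -> index of the task it is running
    remaining = iter(enumerate(tasks))
    with ThreadPoolExecutor(max_workers=max_in_flight) as pool:
        # Submit the initial batch J1..JN.
        for index, task in remaining:
            pending[pool.submit(func, task)] = index
            if len(pending) >= max_in_flight:
                break
        while pending:
            # Block until at least one job completes (the Wait() step).
            done, _ = wait(pending, return_when=FIRST_COMPLETED)
            for future in done:
                results[pending.pop(future)] = future.result()
                # Refill the freed slot with the next job (JN+1), if any.
                nxt = next(remaining, None)
                if nxt is not None:
                    index, task = nxt
                    pending[pool.submit(func, task)] = index
    return [results[i] for i in range(len(results))]


squares = run_bounded(range(10), lambda x: x * x, max_in_flight=3)
print(squares)
```

Capping the number of active jobs is one way to bound the local storage and bookkeeping a large batch needs, at the cost of a little idle time between completions.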

Result


Transform Summary
Use up to 1 GB local storage
Transform 10,000 irises
– Single PC: 60,000 seconds
– Condor: 1,400 seconds
Speedup: ~43 times
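The speedup figure can be reproduced from the two timings on the slide, and the single-PC timing agrees with the 6 seconds per transform quoted earlier in the talk:

```python
irises = 10_000
single_pc_seconds = 60_000  # from the slide
condor_seconds = 1_400      # from the slide

speedup = single_pc_seconds / condor_seconds
seconds_per_iris_single_pc = single_pc_seconds / irises

print(round(speedup, 1))           # 42.9, reported on the slide as ~43
print(seconds_per_iris_single_pc)  # 6.0
```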

AllPairs Abstraction
AllPairs( set A, set B, function F ) returns matrix M where M[i][j] = F( A[i], B[j] ) for all i, j
[Diagram labels: A1 ... An; B1 ... Bn; F; AllPairs(A,B,F)]
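Because every entry M[i][j] depends only on A[i] and B[j], the matrix can be computed in independent pieces. The sketch below shows the direct definition next to a row-at-a-time variant that runs rows as separate tasks. The row split and the use of local threads are illustrative assumptions, not a description of how work is actually partitioned across the Condor pool, and the `hamming` comparison is a hypothetical stand-in for F.

```python
from concurrent.futures import ThreadPoolExecutor


def all_pairs(a, b, f):
    """Direct definition: M[i][j] = f(a[i], b[j])."""
    return [[f(x, y) for y in b] for x in a]


def all_pairs_by_rows(a, b, f, workers=4):
    """Same matrix, with each row computed as an independent task."""

    def row(x):
        return [f(x, y) for y in b]

    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Executor.map returns results in input order, so row i stays row i.
        return list(pool.map(row, a))


def hamming(x, y):
    """Number of positions at which two equal-length codes differ."""
    return sum(c1 != c2 for c1, c2 in zip(x, y))


A = ["0101", "1111", "0000"]
B = ["0101", "0011"]
M = all_pairs(A, B, hamming)
M_rows = all_pairs_by_rows(A, B, hamming)
print(M)
```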

AllPairs Result
10,000 irises vs. 10,000 irises
Condor pool: 32 nodes
AllPairs took 150 minutes to complete 100,000,000 comparisons
Speedup: ~7 times
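A few derived quantities help put these numbers in context. The comparison count and throughput follow directly from the slide. The efficiency figure assumes that the ~7x speedup is measured against a single node and that all 32 nodes were in use, neither of which the slide states.

```python
set_size = 10_000
nodes = 32      # from the slide
minutes = 150   # from the slide
speedup = 7     # from the slide (approximate)

comparisons = set_size * set_size
comparisons_per_second = comparisons / (minutes * 60)

# Assumption: speedup is relative to one node, and all 32 nodes were used.
parallel_efficiency = speedup / nodes
implied_sequential_hours = speedup * minutes / 60

print(comparisons)                    # 100000000
print(round(comparisons_per_second))  # 11111
print(parallel_efficiency)            # 0.21875
print(implied_sequential_hours)       # 17.5
```

Under those assumptions each node contributes roughly a fifth of its ideal share, which is consistent with a workload where data movement, not only computation, limits scaling.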

ROC Curve
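An ROC curve summarizes how a matcher trades true accepts against false accepts as the decision threshold moves. The sketch below is a generic illustration, not the evaluation code behind this slide: it assumes the comparison function returns a distance (lower means more similar) and that each pair is labelled as same-eye or different-eye. The scores and labels are made up.

```python
def roc_points(scores, labels, thresholds):
    """Return (false accept rate, true accept rate) for each threshold.

    scores: distances, where lower means more similar (assumption).
    labels: True when the pair comes from the same eye.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    points = []
    for t in thresholds:
        accepted = [s <= t for s in scores]
        true_accepts = sum(a and l for a, l in zip(accepted, labels))
        false_accepts = sum(a and not l for a, l in zip(accepted, labels))
        points.append((false_accepts / negatives, true_accepts / positives))
    return points


# Made-up example data: six comparisons, three genuine and three impostor.
scores = [0.10, 0.20, 0.35, 0.40, 0.45, 0.50]
labels = [True, True, False, True, False, False]
points = roc_points(scores, labels, [0.15, 0.30, 0.42, 0.60])
for far, tar in points:
    print(f"FAR={far:.2f} TAR={tar:.2f}")
```

In the workflow described in this talk, the scores would come from the AllPairs result matrix.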

Workflow Summary
[Diagram labels: Iris; Transform; Iris Code; AllPairs; Result Matrix; Condor; Storage Cluster]

Future Work
Run bigger Transform and All-Pairs experiments
Use Condor to perform automated validation
Extend the repository to other types of data


Acknowledgments
Cooperative Computing Lab – BXGrid
Grad Students:
– Chris Moretti
– Li Yu
– Deborah Thomas
– Karen Hollingsworth
– Tanya Peters
Faculty:
– Douglas Thain
– Patrick Flynn
Undergrads & Staff:
– Mike Kelly
– Rory Carmichael
– Mark Pasquier
– Christopher Lyon
– Diane Wright

Questions