The Top 10 Reasons Why Federated Can’t Succeed And Why it Will Anyway.

1 The Top 10 Reasons Why Federated Can’t Succeed And Why it Will Anyway

2 But First…
- What is our purpose as a community?
  - Produce (wonderful) new ideas
  - Structure the field
  - Educate the workforce

3 A Brief History of Federation
- Multibase @1980
- Many attempts since
  - Functional
  - Relational
  - Object-oriented
  - Logic-based
  - XML
- Still not solved (think of last night)
- And never will be?

4 Number 10: Robustness
- Systems fail
- Sources slow or unavailable
- In a distributed system, more pieces => more failures
- Users don’t like failures
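The "more pieces => more failures" point can be made concrete with a little arithmetic. This is a minimal sketch, assuming each source fails independently and a federated query needs every source it touches to be up. The function name and the 99% figure are illustrative assumptions, not numbers from the talk.

```python
from math import prod


def federated_availability(source_availabilities):
    """Probability that every source is up at the same time.

    Assumes failures are independent, which real systems only approximate.
    """
    return prod(source_availabilities)


# Ten hypothetical sources, each available 99% of the time.
print(round(federated_availability([0.99] * 10), 3))  # prints 0.904
```

Under these assumptions, ten individually reliable sources give a federated query that fails roughly one time in ten.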

5 Number 9: Security
- Different systems have different security mechanisms
  - Hard to create a single coherent view of permissions
- Distributed systems are more vulnerable
  - More points of failure
  - Hard to make security guarantees
- Data is often the corporate jewels
  - It must be protected

6 Number 8: Updates
- Recording change isn’t always an UPDATE
  - Application semantics must be accounted for
  - Application APIs must be reckoned with
- ACIDity isn’t always achievable
  - Not all data sources display ACID properties
    - Varying degrees of support
  - Strong transaction semantics not always possible or appropriate
    - And always painful
  - Changes to multiple sources must be coordinated
    - Requirements for consistency vary
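One common way to coordinate changes to multiple sources is a two-phase commit, which is one example of a commit protocol. The sketch below is an illustration under assumptions, not a method described in the talk: the `Participant` class and its `can_commit` flag are hypothetical stand-ins for real data sources, and timeouts, logging, and coordinator failure are left out.

```python
class Participant:
    """Hypothetical wrapper around one data source."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit  # assumption: stands in for a real readiness check
        self.state = "idle"

    def prepare(self):
        # Phase 1 vote: yes only if the source can guarantee the change.
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self):
        self.state = "committed"

    def rollback(self):
        self.state = "aborted"


def two_phase_commit(participants):
    """Return True if all participants committed, False if all rolled back."""
    # Phase 1: ask every participant to vote.
    votes = [p.prepare() for p in participants]
    # Phase 2: commit only on a unanimous yes; otherwise undo everywhere.
    if all(votes):
        for p in participants:
            p.commit()
        return True
    for p in participants:
        p.rollback()
    return False
```

A source that cannot offer a prepare step cannot take part in this scheme at all, which is the difficulty the slide raises about sources without ACID properties.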

7 Number 7: Configurability
- Many architectures possible
  - Even with pre-existing sources, many choices
  - Little or no guidance on tradeoffs
- Lots of code to install
  - Federation engine, data source clients
  - Often choices here
- Lots of connections to define
  - Need tooling to support

8 Number 6: Administration
- Monitoring is hard
  - Not all sources have facilities to track events
  - Variety of mechanisms for different events, and different sources
  - Not always APIs
- Tuning is difficult
  - Need to understand what must change
  - Need to take appropriate actions
- Repairing is painful
  - Distributed debugging
  - Different vendors to deal with for fixes

9 Number 5: Semantic heterogeneity
- Hard to identify commonalities
  - Same terms, different meanings
  - Different terms, same meaning
  - Different structures representing different interpretations
- Can’t integrate data effectively without them
  - Can’t make sensible queries
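The two naming problems above can be illustrated with a small mapping table. This is a sketch under assumptions: the source names (`crm`, `billing`), the field names, and the shared vocabulary are all invented for illustration, and a real federation would also have to reconcile structure, units, and types.

```python
# Hypothetical mapping from (source, field) to a shared vocabulary.
FIELD_MAP = {
    # Different terms, same meaning.
    ("crm", "client_name"): "customer_name",
    ("billing", "cust"): "customer_name",
    # Same term, different meanings (here: different currencies).
    ("crm", "revenue"): "revenue_usd",
    ("billing", "revenue"): "revenue_eur",
}


def to_shared(source, record):
    """Rename one source record's fields into the shared vocabulary."""
    out = {}
    for field, value in record.items():
        key = FIELD_MAP.get((source, field))
        if key is None:
            # Refuse to guess: an unmapped field is exactly the missing commonality.
            raise KeyError(f"no mapping for {source}.{field}")
        out[key] = value
    return out
```

Building and maintaining a table like `FIELD_MAP` is the hard part; the code that applies it is trivial by comparison.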

10 Number 4: Insufficient Metadata
- Need metadata to integrate, configure, administer and query
- Every data source has different metadata
  - No uniform standard
  - Not always collected
- Tools to examine and exploit missing

11 Number 3: Performance (Data Movement)
- Distributed queries involve moving data
- Geographic distribution is common
  - WAN is slow
- Large data volumes common
  - Large numbers of objects
  - Large objects
- Caching isn’t a complete answer
  - Changes can be frequent and hard to track
  - Storage is not unlimited

12 Number 2: Performance (Complexity)
- Decision-support applications do complex queries
  - Many choices for how to execute
  - Big differences in performance among choices
- Need data from diverse sources
  - May not have enough power in source
  - Performance at sources may vary
- Need expensive functions of data
  - Function may not be implemented everywhere
  - Flowing the data to the function is expensive
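How far apart two execution choices can be is easy to see with a toy cost model. The sketch below is an assumption-laden illustration, not a real optimizer: it counts only network transfer time, ignores compute cost at the source and at the federation engine, and all function names, plan names, and parameters are hypothetical.

```python
def transfer_seconds(rows, bytes_per_row, bandwidth_bytes_per_s, latency_s):
    """Time to ship `rows` rows over one link: fixed latency plus payload time."""
    return latency_s + rows * bytes_per_row / bandwidth_bytes_per_s


def choose_plan(rows, selectivity, bytes_per_row,
                bandwidth_bytes_per_s, latency_s, source_can_filter):
    """Return (name of cheapest plan, estimated seconds for each plan)."""
    # Plan A: ship every row, then apply the predicate at the federation engine.
    plans = {
        "filter_at_federation_engine": transfer_seconds(
            rows, bytes_per_row, bandwidth_bytes_per_s, latency_s),
    }
    # Plan B: only possible if the source implements the predicate itself.
    if source_can_filter:
        plans["filter_at_source"] = transfer_seconds(
            rows * selectivity, bytes_per_row, bandwidth_bytes_per_s, latency_s)
    best = min(plans, key=plans.get)
    return best, plans
```

For example, with one million 100-byte rows, a predicate that keeps 1% of them, and a 1 MB/s link, the model estimates about 100 seconds for shipping everything against about 1 second for filtering at the source. When the source cannot evaluate the predicate, only the expensive plan remains.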

13 Number 1: Performance (Pathlength)
- Simple queries (OLTP-like) incur huge overheads
  - Processing and networking costs
- Simple queries are common
  - Easier to write
  - Automatically produced
  - Workflows

14 So Why Will Federated Succeed?
- It has to
  - Integration one of the top IT issues
    - And it’s not going away
  - Alternatives are expensive and/or painful
    - Write it by hand
    - EAI/Workflow
    - Consolidation (warehouse, data marts…)

15 So Why Will Federated Succeed? (2)
- Simple scenarios exist
  - Don’t need OLTP, high security, great robustness, … for all applications
  - Customers know their data, or must learn anyway
  - Needs are so great, compromise is possible

16 So Why Will Federated Succeed? (3)
- Progress on technology being made
  - 20 years of distributed query processing
  - Plumbing in place
    - Commit protocols
    - Reliable messaging
    - Connectivity infrastructure
  - XML (basic community agreement)
    - XML data format
    - XML schema
    - Web services
  - We’re getting closer

17 What would we do if it ever did work?
- Retire
- Integrate the web?
  - Data grids
  - Data Google
- P2P database?

18 For Discussion
- Is research in this area warranted?
- What are the most important research topics?
  - Did we miss any?
