SilverLining. Stuff we're covering Hardware infrastructure and scaling Cloud platform as a service The SilverLining Project.

1 SilverLining

2 Stuff we're covering Hardware infrastructure and scaling Cloud platform as a service The SilverLining Project

3 Some context We work at a university Funding based on projects Biodiversity web apps and APIs Focus on software (not hardware)

4 Infrastructure Applications depend on infrastructure Infrastructure that "just works" is expensive More money for infrastructure means less money for application development Degenerates without long-term funding Unreliability is bad for applications Increasingly bad user experience over time

5 $1.6M USD total budget to 17 institutions $245k USD (30.6% of direct costs) for infrastructure

6 $1.6M USD total budget to 17 institutions $245k USD (30.6% of direct costs) for infrastructure $100k USD (12.6% of direct costs) for core application development o DiGIR provider, DiGIR portal

7 MaNIS, ORNIS, HerpNet, FishNet $7.6M USD combined budgets, 71 institutions $196k USD annual operating cost

8 MaNIS, ORNIS, HerpNet, FishNet $7.6M USD combined budgets, 71 institutions $196k USD annual operating cost $179k USD (92%) for infrastructure

9 Infrastructure as a Problem (IaaP)

10 Unsustainable Creates a barrier to innovation And this is all before scaling comes into play!

11 Scalability "The ability for infrastructure to reliably handle heavy request loads in a high performance way."

12 IaaP at scale

13 Scaling up Scale up vertically with a server upgrade Scale out horizontally with more servers

14 Scaling up

15 Scaling DiGIR networks MaNIS, ORNIS, HerpNet, FishNet ~85 million records ~100 servers

16 Scaling DiGIR networks MaNIS, ORNIS, HerpNet, FishNet ~85 million records ~100 servers

17 Query: All records with a point

18 Response: Error: IO problem

19 "Scaling is hard." - Alex Payne

20 "Scaling is hard." - Alex Payne al3x.net/2010/07/27/node.html

21 Scaling in the small Handling dozens of requests per second Scaling up vertically is sufficient Performance improvements are software related al3x.net/2010/07/27/node.html

22 Scaling in the large Billions of requests per week (Google) Millions of active users (Facebook) Data centers worldwide with millions of servers al3x.net/2010/07/27/node.html

23 Are we scaling large or small? GBIF ~220 million records eBird ~2 million new records per month Undigitized collections ~2.5 billion records

24 Scaling in the "small-ish" We're at the brink! IaaP is in the way, scaling is making it worse Where's the silver lining in all of this?

25 Platform as a Service (PaaS) en.wikipedia.org/wiki/Platform_as_a_service Conceptually quite simple: Computing power over the Internet No servers to maintain Pay for use Scales large (even if your application is small) Provided by companies such as Amazon, Microsoft, Google

26 SilverLining silver-lining.googlecode.com Experiments, metrics, prototypes (not products) Picked Google App Engine PaaS with biodiversity data Simple Darwin Core Bulk loading, storage MapReduce - indexes, validation, statistics Optimize for resource efficiency, search performance
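The MapReduce item above can be illustrated with a small, in-memory sketch. This is not the App Engine MapReduce library and not the project's code; it only shows the map, group, and reduce steps for one kind of statistic (records per year). The sample records and the function names are hypothetical; `year` and `scientificName` are Simple Darwin Core terms.

```python
from collections import defaultdict

# Hypothetical sample records using Simple Darwin Core term names.
SAMPLE_RECORDS = [
    {"scientificName": "Puma concolor", "year": 1980},
    {"scientificName": "Lynx rufus", "year": 1980},
    {"scientificName": "Ursus arctos", "year": 1981},
    {"scientificName": "Canis latrans"},  # no year: skipped by the mapper
]


def map_year(record):
    """Map step: emit (year, 1) for every record that has a year."""
    year = record.get("year")
    if year is not None:
        yield year, 1


def reduce_count(key, values):
    """Reduce step: sum the emitted counts for one key."""
    return key, sum(values)


def run_mapreduce(records, mapper, reducer):
    """Run mapper over all records, group values by key, then reduce each group."""
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)
    return dict(reducer(key, values) for key, values in sorted(groups.items()))


records_per_year = run_mapreduce(SAMPLE_RECORDS, map_year, reduce_count)
```

On a real platform the map and reduce steps would run in parallel across many workers; the single-process loop here only stands in for that.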

27 Cost comparison Total annual operating costs of vertebrate networks: Current architecture: USD $195,600 Projected App Engine: USD $19,540

28 Cost comparison Total annual operating costs of vertebrate networks: Current architecture: USD $195,600 Projected App Engine: USD $19,540 Total cost for SilverLining work to date: 50 cents
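The two annual figures in the cost comparison can be turned into a savings amount and a ratio. The sketch below uses only the numbers from the slide; the variable names are illustrative.

```python
# Annual operating costs from the cost comparison slide (USD).
CURRENT_ARCHITECTURE_USD = 195_600
PROJECTED_APP_ENGINE_USD = 19_540

# Absolute savings per year.
annual_savings_usd = CURRENT_ARCHITECTURE_USD - PROJECTED_APP_ENGINE_USD

# How many times cheaper the projected setup is.
cost_ratio = CURRENT_ARCHITECTURE_USD / PROJECTED_APP_ENGINE_USD

# Savings as a percentage of the current cost.
savings_percent = annual_savings_usd / CURRENT_ARCHITECTURE_USD * 100

print(f"Savings: ${annual_savings_usd:,} per year")
print(f"Ratio: about {cost_ratio:.1f}x cheaper ({savings_percent:.0f}% less)")
```

In other words, the projected App Engine cost is roughly one tenth of the current operating cost.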

29 App Engine code.google.com/appengine Develop scalable web apps on Google's infrastructure No servers or hardware to maintain and free quotas Standards based Java and Python SDKs IDE support for Eclipse, NetBeans, IntelliJ Local development server Integrated support for unit testing

30 App Engine constraints Practical constraints for performance and scalability The datastore is not a relational database Query can only use inequality filters on 1 property Fails: year >= 1980 and year <= 1982 and elevation > 10 Solution: Set membership queries

31 Set membership queries Before: year >= 1980 and year <= 1982 and elevation > 10 After: year "within 1 year" of 1981 and elevation > 10 List for "within 1 year" of 1980: [1979, 1980, 1981]
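One way to read the set membership idea is sketched below in plain Python, with no App Engine dependency. It assumes the "Before" query is a year range of 1980 to 1982 combined with elevation > 10, and that each record stores a precomputed list of years within 1 of its own year. The range test on year then becomes a membership test, leaving elevation as the only property with an inequality. The field name `year_within_1`, the helper names, and the sample records are hypothetical.

```python
def within(year, radius=1):
    """Return every year within `radius` of `year`, inclusive."""
    return list(range(year - radius, year + radius + 1))


def prepare(record):
    """Write-time step: store the precomputed membership list on the record."""
    stored = dict(record)
    stored["year_within_1"] = within(record["year"])
    return stored


def range_query(records):
    """The 'Before' form: inequality filters on two properties (year, elevation)."""
    return [r["id"] for r in records
            if 1980 <= r["year"] <= 1982 and r["elevation"] > 10]


def membership_query(stored_records):
    """The 'After' form: membership on year, inequality on elevation only."""
    return [r["id"] for r in stored_records
            if 1981 in r["year_within_1"] and r["elevation"] > 10]


# Hypothetical sample data.
RECORDS = [
    {"id": "a", "year": 1979, "elevation": 500},
    {"id": "b", "year": 1980, "elevation": 25},
    {"id": "c", "year": 1981, "elevation": 5},
    {"id": "d", "year": 1982, "elevation": 11},
    {"id": "e", "year": 1983, "elevation": 300},
]
STORED = [prepare(r) for r in RECORDS]
```

Both forms select the same records here. The trade-off is extra storage and write-time work per record in exchange for a query shape the datastore can serve.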

32 Aggregation and synchronization code.google.com/p/pubsubhubbub code.google.com/apis/feed/push Fast aggregation via API Subscribe to changes at the source Changes pushed automatically
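The subscribe-and-push flow above can be sketched from the subscriber's side. The snippet builds the form-encoded body of a PubSubHubbub subscription request and answers the hub's verification request by echoing `hub.challenge`. It makes no network calls. The URLs and function names are hypothetical, and some versions of the specification require extra parameters (for example `hub.verify`), so treat this as an outline rather than a complete client.

```python
from urllib.parse import parse_qs, urlencode

# Hypothetical URLs, for illustration only.
TOPIC_URL = "http://provider.example.org/records/feed"
CALLBACK_URL = "http://aggregator.example.org/push-callback"


def build_subscription_body(topic_url, callback_url):
    """Form-encoded body a subscriber would POST to the hub to subscribe."""
    return urlencode({
        "hub.mode": "subscribe",
        "hub.topic": topic_url,
        "hub.callback": callback_url,
    })


def verification_response(query_string, expected_topic):
    """Answer the hub's verification GET: echo the challenge for a known topic."""
    params = parse_qs(query_string)
    topic = params.get("hub.topic", [None])[0]
    challenge = params.get("hub.challenge", [None])[0]
    if topic != expected_topic or challenge is None:
        return 404, ""  # reject subscriptions we did not ask for
    return 200, challenge


body = build_subscription_body(TOPIC_URL, CALLBACK_URL)
```

Once the subscription is verified, the hub sends changed entries to the callback URL as they are published, so the aggregator does not need to poll each source.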

33 What's the end game? PaaS instead of IaaP SaaS (software as a solution) BaaS (biodiversity applications at scale) Aaron Steele asteele@berkeley.edu John Wieczorek tuco@berkeley.edu

34 What's the end game? PaaS instead of IaaP SaaS (software as a solution) BaaS (biodiversity applications at scale) Any QaaC? (Questions as a challenge) Aaron Steele asteele@berkeley.edu John Wieczorek tuco@berkeley.edu
