#1 DSC 2014 - Thessaloniki, Greece, 26 September 2014 Rustem South-East European Research Centre (SEERC) International Faculty.


1 DSC 2014 - Thessaloniki, Greece, 26 September 2014
Rustem Dautov (rdautov@seerc.org), Iraklis Paraskakis (iparaskakis@seerc.org)
South-East European Research Centre (SEERC), International Faculty of the University of Sheffield, CITY College, Thessaloniki, Greece
Mike Stannett (m.stannett@sheffield.ac.uk)
Department of Computer Science, University of Sheffield, Sheffield, United Kingdom

2 http://www.cellopoint.com/media_resources/blog/2010/04/CelloCloud

3
“A subclass of PaaS offerings which not only provision customers with an operating system and run-time environment, but additionally offer a complete support to develop and deploy service-based applications, including a range of generic, reliable, composable and reusable services”
Dautov, R., Paraskakis, I., & Stannett, M. (2014). Utilising stream reasoning techniques to underpin an autonomous framework for cloud application platforms. Journal of Cloud Computing, 3(1), 1-12.

4
- Service-oriented computing (SOC):
- Applications are composed by discovering and invoking network-available services to accomplish some task.
- Services are autonomous, platform-independent entities that can be described, published, discovered, and loosely coupled in novel ways.
- Data storage, caching and backup, logging, searching, queue messaging, e-billing, authentication, etc.
- It is possible to combine internal organisational software assets with external components.
- Service-based application systems

5
- A completely different level of complexity (several layers: hardware, virtualisation, software)
- IaaS: limited number of standardised metrics
- PaaS: a range of metrics related to services
- SaaS: infinite number of application-specific metrics

6
- IBM Bluemix
  - https://www.bluemix.net/
  - 34 built-in, 10 community and 15 third-party services
- Google App Engine
  - https://appengine.google.com/
  - 41 services (“features”)
- Heroku
  - https://www.heroku.com/
  - Over 100 services (“add-ons”)
- Pivotal CloudFoundry
  - http://www.pivotal.io/platform-as-a-service/pivotal-cf
  - 13 services
- Hybrid clouds and multi-clouds
- All this leads to the Big Data challenge!

7 Big Data!
- VOLUME: Large volumes of raw data, typically measured in TBs
- VELOCITY: Raw data is rapidly generated in real time at unpredictable rates
- VARIETY: Data is heterogeneous at semantic and syntactic levels
  - Representation heterogeneity: various data formats and encodings
  - Semantic heterogeneity: two applications may store data in the same database (that is, adopt the same format and structure) but belong to completely separate business domains
- VERACITY: Data is uncertain, flawed, outdated, and untrusted
  - Reacting to outdated observations may lead to unnecessary (or even harmful) adaptations
  - Presence of “noise”
Examples:
- PageLever is a Heroku-based analytics platform for measuring a brand’s presence on Facebook. It processes 500 M Facebook API requests each month
- QuizCreator reports activity peaks of over 10000 requests/minute
- Playtomic has 15-20 M gamers generating over a billion events daily
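To compare the three quoted workloads on a common scale, the figures can be converted to events per second. The sketch below is only back-of-the-envelope arithmetic; the 30-day month and the variable names are assumptions made for illustration, and the inputs are the rounded figures quoted on the slide.

```python
SECONDS_PER_MINUTE = 60
SECONDS_PER_DAY = 24 * 60 * 60
SECONDS_PER_MONTH = 30 * SECONDS_PER_DAY  # assumption: a 30-day month


def per_second(count: float, period_seconds: float) -> float:
    """Average number of events per second over the given period."""
    return count / period_seconds


# Figures as quoted on the slide (rounded; two are lower bounds or peaks).
pagelever_rate = per_second(500e6, SECONDS_PER_MONTH)     # 500 M requests/month
quizcreator_rate = per_second(10_000, SECONDS_PER_MINUTE)  # peak, per minute
playtomic_rate = per_second(1e9, SECONDS_PER_DAY)          # "over a billion" daily

for name, rate in [
    ("PageLever", pagelever_rate),
    ("QuizCreator (peak)", quizcreator_rate),
    ("Playtomic", playtomic_rate),
]:
    print(f"{name}: about {rate:,.0f} events/second")
```

Under these assumptions the averages come to roughly 193, 167 and 11,574 events per second respectively. The QuizCreator value is a peak rather than an average, and the Playtomic value is a lower bound.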

8
- The term was coined in 1997 by Cox and Ellsworth
- Datasets of 100 GB in size
- 2.5 EB generated every day
- Total annual Internet traffic was projected to reach 667 EB in 2013
- The so-called Four V’s of Big Data:
  - Volume
  - Velocity
  - Variety
  - Veracity

9
- Deployed apps and CAP components are treated as software sensors
- Streaming monitored data is uniformly represented in RDF triples
- Critical situations are detected using:
  - Stream reasoning (i.e. C-SPARQL)
  - Static reasoning (OWL/SWRL)
- Scalability issue!
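The idea of treating components as sensors that emit timestamped triples, and of checking a condition over a recent time window, can be sketched in plain Python. This is not C-SPARQL and not the authors' implementation: the class names, the `ex:` identifiers and the threshold are hypothetical, and the window only mimics the general behaviour of a continuous query over the most recent observations.

```python
from collections import deque
from typing import NamedTuple, Optional


class TimedTriple(NamedTuple):
    """A (subject, predicate, object) observation with a timestamp in seconds."""
    subject: str
    predicate: str
    obj: float
    timestamp: float


class SlidingWindow:
    """Keeps only the observations from the last `width` seconds."""

    def __init__(self, width: float) -> None:
        self.width = width
        self.items: deque = deque()

    def push(self, triple: TimedTriple) -> None:
        self.items.append(triple)
        cutoff = triple.timestamp - self.width
        # Evict observations that have fallen out of the window.
        while self.items and self.items[0].timestamp <= cutoff:
            self.items.popleft()

    def average(self, subject: str, predicate: str) -> Optional[float]:
        values = [t.obj for t in self.items
                  if t.subject == subject and t.predicate == predicate]
        return sum(values) / len(values) if values else None


def is_critical(window: SlidingWindow, subject: str,
                predicate: str, threshold: float) -> bool:
    """Hypothetical rule: the windowed average exceeds a threshold."""
    avg = window.average(subject, predicate)
    return avg is not None and avg > threshold


# Hypothetical sensor readings for a message queue.
window = SlidingWindow(width=10.0)
window.push(TimedTriple("ex:queue1", "ex:hasLength", 10.0, 0.0))
window.push(TimedTriple("ex:queue1", "ex:hasLength", 90.0, 5.0))
before = is_critical(window, "ex:queue1", "ex:hasLength", 80.0)
window.push(TimedTriple("ex:queue1", "ex:hasLength", 95.0, 12.0))
after = is_critical(window, "ex:queue1", "ex:hasLength", 80.0)
print(before, after)
```

The first reading drops out of the 10-second window once the third arrives, so the rule fires only on the more recent, higher values.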

10 IBM Streams
- A software platform for development and execution of applications that process continuously flowing data
- Supports dynamic analysis of massive volumes of incoming data to improve the speed of business insight and decision making
- Consists of:
  - Programming language (Streams Processing Language, SPL)
  - Integrated development environment (IDE)
  - Runtime environment that can execute SPL applications in stand-alone or distributed modes

11
- A wide selection of pre-compiled operators, ranging from simple utility operators to more complex ones
- Custom operators can be written in SPL, Java or C++
- Development via a drag-and-drop mechanism or by coding directly in SPL
- Support for scalability: can run locally, in a distributed cluster of nodes, or in a cloud
- Automated “zoo keeping”
- Task parallelisation: built-in operators responsible for splitting/filtering/merging of data streams
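The split/filter/merge pattern mentioned in the last bullet can be illustrated with three small Python functions. These are plain-Python analogues written for illustration only, not SPL operators; the function names and the round-robin splitting policy are assumptions.

```python
from itertools import chain
from typing import Callable, Iterable, List, TypeVar

T = TypeVar("T")


def split(stream: Iterable[T], n: int) -> List[List[T]]:
    """Distribute items over n branches in round-robin order."""
    branches: List[List[T]] = [[] for _ in range(n)]
    for i, item in enumerate(stream):
        branches[i % n].append(item)
    return branches


def keep(stream: Iterable[T], predicate: Callable[[T], bool]) -> List[T]:
    """Retain only the items that satisfy the predicate."""
    return [item for item in stream if predicate(item)]


def merge(branches: Iterable[Iterable[T]]) -> List[T]:
    """Concatenate the branches back into a single stream."""
    return list(chain.from_iterable(branches))


readings = list(range(10))                  # stand-in for monitored values
branches = split(readings, 3)               # three branches
filtered = [keep(b, lambda x: x >= 5) for b in branches]
merged = merge(filtered)
print(sorted(merged))
```

Each branch could be handled by a separate worker. The merged output is not in arrival order, which is why the example sorts it before printing.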

12
- Minimal effort to integrate with our framework
- Support for processing streamed data
- Support for data stream fragmentation and task parallelisation
- Capacity to address the Big Data challenges
- Mature level of functionality, support and documentation
- Free access for research and academic purposes


14
- We split the incoming data stream into three and processed them in parallel
- We managed to keep the reaction time of the system within 1 second
- Experiments can be reproduced at a larger scale
- Partitioning has to be performed carefully:
  - Unbounded nature of streams
  - Semantic connections between RDF triples
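One way to respect the semantic connections between triples when partitioning is to route all triples that share a subject to the same partition. The sketch below shows that idea on a finite batch in plain Python. It is an illustrative assumption, not the partitioning scheme used in the experiments: the subject-based key, the CRC32 hash, the thread pool and the sample data are all hypothetical, and a real unbounded stream would be processed window by window.

```python
import zlib
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]


def partition_by_subject(triples: List[Triple],
                         n_partitions: int = 3) -> List[List[Triple]]:
    """Send every triple with the same subject to the same partition."""
    partitions: List[List[Triple]] = [[] for _ in range(n_partitions)]
    for triple in triples:
        # crc32 is stable across runs, unlike Python's built-in str hash.
        index = zlib.crc32(triple[0].encode("utf-8")) % n_partitions
        partitions[index].append(triple)
    return partitions


def count_per_subject(partition: List[Triple]) -> Dict[str, int]:
    """Stand-in for per-partition reasoning: count triples per subject."""
    counts: Dict[str, int] = defaultdict(int)
    for subject, _predicate, _obj in partition:
        counts[subject] += 1
    return dict(counts)


def process_in_parallel(triples: List[Triple],
                        n_partitions: int = 3) -> Dict[str, int]:
    partitions = partition_by_subject(triples, n_partitions)
    with ThreadPoolExecutor(max_workers=n_partitions) as pool:
        partials = list(pool.map(count_per_subject, partitions))
    merged: Dict[str, int] = {}
    for partial in partials:
        merged.update(partial)  # safe: subjects never span two partitions
    return merged


sample = [
    ("ex:app1", "ex:usesService", "ex:db1"),
    ("ex:app1", "ex:hasResponseTime", "120"),
    ("ex:app2", "ex:usesService", "ex:queue1"),
    ("ex:db1", "ex:hasConnections", "17"),
    ("ex:app2", "ex:hasResponseTime", "340"),
]
print(process_in_parallel(sample))
```

Keying on the subject keeps all statements about one resource together, but it does not keep together triples that are linked through their objects (for example `ex:app1` and `ex:db1` above), so a real partitioning scheme would need a richer key.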

15
- Cloud platforms are not just a convenient environment for running Big Data applications
- They are Big Data generators in their own right
- Various Big Data processing techniques have to be utilised
- Our approach utilises existing stream processing techniques
  - Stream Reasoning
  - IBM Streams


