Big Data: Big Challenges for Computer Science. Henri Bal, Vrije Universiteit Amsterdam.

Multiple types of data explosions ● High-volume data ● 10-100x global internet traffic per year (by 2018) ● Complex data

Graphics Processing Units (GPUs)

Differences between CPUs and GPUs ● CPU: minimize latency of 1 activity (thread) ● Must be good at everything ● Big on-chip caches ● Sophisticated control logic ● GPU: maximize throughput of all threads using large-scale parallelism [Figure labels: Control, ALU, Cache]

Example: NVIDIA Maxwell ● 16 independent streaming multiprocessors ● 2048 compute cores
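The two figures on this slide imply how many cores sit in each multiprocessor. A minimal sketch of that arithmetic follows; the function name is illustrative, and only the two constants come from the slide.

```python
# Figures taken from the slide (NVIDIA Maxwell example).
STREAMING_MULTIPROCESSORS = 16
COMPUTE_CORES = 2048


def cores_per_multiprocessor(total_cores: int, multiprocessors: int) -> int:
    """Return the number of cores per multiprocessor, assuming an even split."""
    if multiprocessors <= 0:
        raise ValueError("multiprocessors must be positive")
    if total_cores % multiprocessors != 0:
        raise ValueError("cores do not divide evenly over the multiprocessors")
    return total_cores // multiprocessors


if __name__ == "__main__":
    # 2048 cores spread over 16 multiprocessors gives 128 cores each.
    print(cores_per_multiprocessor(COMPUTE_CORES, STREAMING_MULTIPROCESSORS))
```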

Ongoing GPU work at VU ● Applications ● Multimedia data ● Digital forensics data ● Climate modelling ● Radio astronomy data ● Methodologies ● Hadoop on accelerators ● Programming methods for accelerators ● Teaching GPUs (with UvA) ● National ICT research infrastructure COMMIT/

Complex data ● Still smaller in volume than astronomy etc. ● Much more complicated, semantically rich data ● Growing fast ….

Semantic web ● Make the Web smarter by injecting meaning so that machines can reason about it ● Initial idea by Tim Berners-Lee in 2001 ● Has now attracted the interest of big IT companies

WebPIE: a Web-scale Parallel Inference Engine ● Web-scale parallel reasoner doing full materialization ● Orders of magnitude faster than previous work by using smart parallel algorithms ● Jacopo Urbani + Frank van Harmelen (VU) ● Christiaan Huygens nomination for Urbani's PhD thesis
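"Full materialization" means deriving every fact that the rules imply and storing it, so that later queries need no reasoning. The sketch below shows the idea on a toy scale with two RDFS-style rules and a naive fixpoint loop. It is a sequential illustration only, not WebPIE's parallel algorithm; the function names and the example triples are assumptions made for this sketch.

```python
from typing import Set, Tuple

Triple = Tuple[str, str, str]

SUBCLASS = "rdfs:subClassOf"
TYPE = "rdf:type"


def apply_rules(triples: Set[Triple]) -> Set[Triple]:
    """Apply two RDFS-style rules once over all triples."""
    derived: Set[Triple] = set()
    subclass_edges = [(s, o) for s, p, o in triples if p == SUBCLASS]
    for a, b in subclass_edges:
        # Rule 1: subClassOf is transitive (a < b and b < c gives a < c).
        for b2, c in subclass_edges:
            if b == b2:
                derived.add((a, SUBCLASS, c))
        # Rule 2: an instance of a class is an instance of its superclass.
        for s, p, o in triples:
            if p == TYPE and o == a:
                derived.add((s, TYPE, b))
    return derived


def materialize(triples: Set[Triple]) -> Set[Triple]:
    """Repeat the rules until no new triple appears (a fixpoint)."""
    closure = set(triples)
    while True:
        new = apply_rules(closure) - closure
        if not new:
            return closure
        closure |= new


if __name__ == "__main__":
    # Hypothetical input data for illustration.
    facts = {
        ("Cat", SUBCLASS, "Mammal"),
        ("Mammal", SUBCLASS, "Animal"),
        ("tom", TYPE, "Cat"),
    }
    for triple in sorted(materialize(facts)):
        print(triple)
```

Every pass rescans the whole data set, which is why a naive loop like this does not scale and why a Web-scale reasoner needs smarter parallel algorithms.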

Reasoning on changing data ● WebPIE must recompute everything if data changes ● Takes on the order of 1 day on a 64-node compute cluster ● Challenge: real-time incremental reasoning, combining new (streaming) data & historic data ● Nanopublications (http://nanopub.org) ● Handling 2 million news articles per day (Piek Vossen, VU) ● Data streams from (health) sensors & smart phones ● Exploit massively parallel computing and GPUs
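The incremental-reasoning challenge can be illustrated on a toy scale: when new triples arrive, only combinations that involve at least one new triple need to be examined, and the historic closure is reused as-is. The sketch below uses the same two RDFS-style rules (subclass transitivity and type inheritance) and handles additions only. It assumes the existing closure is already complete, it does not handle deletions, and it is not how WebPIE or any successor system works; all names and data are hypothetical.

```python
from typing import Iterable, Set, Tuple

Triple = Tuple[str, str, str]

SUBCLASS = "rdfs:subClassOf"
TYPE = "rdf:type"


def add_incrementally(closure: Set[Triple], new_triples: Iterable[Triple]) -> Set[Triple]:
    """Extend an already materialized closure with new triples.

    Only joins that involve at least one newly added triple are evaluated.
    """
    closure = set(closure)
    delta = set(new_triples) - closure
    while delta:
        closure |= delta
        derived: Set[Triple] = set()
        for s, p, o in delta:
            for s2, p2, o2 in closure:
                if p == SUBCLASS and p2 == SUBCLASS:
                    if s2 == o:  # new edge followed by a known edge
                        derived.add((s, SUBCLASS, o2))
                    if o2 == s:  # known edge followed by the new edge
                        derived.add((s2, SUBCLASS, o))
                elif p == SUBCLASS and p2 == TYPE and o2 == s:
                    # known instance of the new edge's subclass
                    derived.add((s2, TYPE, o))
                elif p == TYPE and p2 == SUBCLASS and s2 == o:
                    # new instance inherits the known superclass
                    derived.add((s, TYPE, o2))
        delta = derived - closure
    return closure


if __name__ == "__main__":
    # Hypothetical historic data, already fully materialized.
    historic = {
        ("Cat", SUBCLASS, "Mammal"),
        ("Mammal", SUBCLASS, "Animal"),
        ("Cat", SUBCLASS, "Animal"),
        ("tom", TYPE, "Cat"),
        ("tom", TYPE, "Mammal"),
        ("tom", TYPE, "Animal"),
    }
    updated = add_incrementally(historic, [("felix", TYPE, "Cat")])
    for triple in sorted(updated - historic):
        print(triple)
```

The work done here depends on the size of the update rather than on a full recomputation, which is the property a real-time reasoner over streaming data would need.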

Other work on complex data ● Use semantic web to describe and reason about computer infrastructure (Cees de Laat, UvA) ● Machine learning using GPUs (Hadoop) ● Joint work with Max Welling (UvA) ● Business applications ● With Frans Feldberg (VU, Economy)

Discussion ● We can process peta-scale (10^15, LHC) simple data with cluster and grid technology ● Exascale (10^18, SKA) may be feasible with GPUs, but requires new parallel programming methodologies ● Processing complex data is vastly more complicated, even at smaller scales ● Complex data is also escalating in size ● Dynamic (streaming) data will be next ● Processing exa-scale dynamic complex data?
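To get a feel for the factor of 1000 between peta-scale and exascale, the back-of-the-envelope sketch below estimates how long a single pass over the data would take. The 10^15 and 10^18 figures come from the slide; reading them as bytes, the 64-node cluster size, and the 1 GB/s per-node read rate are assumptions chosen purely for illustration, not measurements.

```python
PETA = 10**15  # peta-scale figure from the slide, read here as bytes (assumption)
EXA = 10**18   # exascale figure from the slide, read here as bytes (assumption)

# Hypothetical cluster parameters, for illustration only.
ASSUMED_NODES = 64
ASSUMED_BYTES_PER_SECOND_PER_NODE = 10**9  # 1 GB/s per node


def scan_time_seconds(data_bytes: int, nodes: int, bytes_per_second_per_node: int) -> float:
    """Time for one perfectly parallel pass over the data, ignoring all overheads."""
    return data_bytes / (nodes * bytes_per_second_per_node)


if __name__ == "__main__":
    peta = scan_time_seconds(PETA, ASSUMED_NODES, ASSUMED_BYTES_PER_SECOND_PER_NODE)
    exa = scan_time_seconds(EXA, ASSUMED_NODES, ASSUMED_BYTES_PER_SECOND_PER_NODE)
    print(f"peta-scale: {peta / 3600:.1f} hours")
    print(f"exascale:   {exa / 86400:.1f} days")
```

Under these assumed numbers a single pass goes from hours to months, which is one way to see why exascale is tied to accelerators and new programming methods.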
