CC5212-1 Procesamiento Masivo de Datos (Massive Data Processing), Otoño (Autumn) 2014. Aidan Hogan. Wrap-Up.


1 CC5212-1 Procesamiento Masivo de Datos, Otoño 2014
Aidan Hogan, aidhog@gmail.com
Wrap-Up

2 Course Marking
45% for Weekly Labs (~3% a lab!)
35% for Final Exam
20% for Small Class Project

3 Final Exam (35%)
Next Tuesday, 9am
– Room to be confirmed
Goal: test your understanding of concepts
– Coding covered by labs/project
– No syntax-writing questions, but there will be design and syntax-reading questions
Max. three hours (it won’t take that long!)
– Not marking you on English
– If really stuck, write in Spanish!
Four questions (marked on best three)
…

4 The following is not a legally binding agreement. It is just a helpful guide to what’s important.

5 Q1: Distributed Systems

6 Q1: Distributed Systems (Slides)
Slides:
– [02] MDP-02-Intro-Dist-Sys-20140317.pptx
– [03] MDP-03-Fallacies-CAP-20140324.pptx
– [04] MDP-04-Consensus-Paxos-20140331.pptx
e.g., [02, S. 3–7] = Slide deck [02], slides 3 to 7
Names per the homepage: http://aidanhogan.com/teaching/cc5212-1/
Slides indicated are only a guide!

7 Q1: Distributed Systems (Topics)
Possible Topics:
– Advantages/disadvantages of a distributed system [02, S. 3–7]
– Five distributed system design goals [02, S. 9–11]
– Distributed architectures (P2P vs. C–S, Fat/Thin, n-Tier, etc.) [02, S. 14–31]
– Java RMI (high-level) [02, S. 47–59]
– Eight fallacies of distributed computing [03, S. 12–20]
– Consensus basics (fail-stop vs. Byzantine, synchronous vs. asynchronous, goals) [04, S. 04–22]
– Consensus protocols (2PC, 3PC, Paxos) [04, S. 25–55]
CAP theorem will appear in Q4
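As a revision aid for the consensus-protocols topic, here is a minimal sketch of the two-phase commit (2PC) decision rule. It is not taken from the course slides: the `Participant` class and `two_phase_commit` function are hypothetical names, and the sketch assumes no crashes, timeouts or lost messages, which are exactly the cases that make 2PC blocking in practice.

```python
class Participant:
    """Hypothetical participant that votes on whether it can commit."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "init"

    def prepare(self):
        # Phase 1: vote yes (ready) or no (abort locally).
        self.state = "ready" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self):
        self.state = "committed"

    def abort(self):
        self.state = "aborted"


def two_phase_commit(participants):
    """Coordinator logic: commit only if every participant votes yes."""
    # Phase 1 (voting): collect a vote from every participant.
    votes = [p.prepare() for p in participants]
    # Phase 2 (decision): broadcast the global outcome.
    if all(votes):
        for p in participants:
            p.commit()
        return "commit"
    for p in participants:
        p.abort()
    return "abort"
```

A single "no" vote is enough to abort the whole transaction, which is the property worth being able to explain in a design question.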

8 GFS (HDFS) / MapReduce (Hadoop)

9 Q2: GFS (HDFS) / MapReduce (Hadoop)
Slides:
– [05] MDP-05-DFS-MapReduce-20140407.pptx
– [06] MDP-06-Hadoop-20140414.pptx
– [07] MDP-07-Pig-20140421.pptx
e.g., [02, S. 3–7] = Slide deck [02], slides 3 to 7
Names per the homepage: http://aidanhogan.com/teaching/cc5212-1/
Slides indicated are only a guide!

10 Q2: GFS (HDFS) / MapReduce (Hadoop)
Possible Topics:
– Google File System (reads, writes, fault-tolerance) [05, S. 11–27]
– MapReduce (incl. design question) [05, S. 36–46; 06, S. 11–16]
– HDFS/Hadoop (architecture) [06, S. 18–20]
– Pig (high-level, give result from input and script) [07]
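For the MapReduce design question, it may help to recall the shape of a job as three steps: map, shuffle (group by key), reduce. The sketch below simulates word count in plain Python on one machine. It is an illustration only, not Hadoop code, and the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are made up for this example.

```python
from collections import defaultdict


def map_phase(doc_id, text):
    """Map: emit a (word, 1) pair for every word in one input record."""
    for word in text.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle: group all emitted values by key, as the framework would."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(word, counts):
    """Reduce: sum the partial counts for one word."""
    return word, sum(counts)


def word_count(docs):
    """Run the three steps in sequence over a dict of doc_id -> text."""
    pairs = [kv for doc_id, text in docs.items()
             for kv in map_phase(doc_id, text)]
    return dict(reduce_phase(word, counts)
                for word, counts in shuffle(pairs).items())
```

For example, `word_count({"d1": "big data big ideas", "d2": "data beats ideas"})` counts each word across both documents. In a design question, the part to get right is usually the choice of key emitted by the map step, since that decides what each reduce call sees together.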

11 Information Retrieval

12 Q3: Information Retrieval (Slides)
Slides:
– [08] MDP-08-Search-20140428.pptx
– [09] MDP-09-Ranking-20140505.pptx
e.g., [02, S. 3–7] = Slide deck [02], slides 3 to 7
Names per the homepage: http://aidanhogan.com/teaching/cc5212-1/
Slides indicated are only a guide!

13 Q3: Information Retrieval (Topics)
Possible Topics:
– Crawling (high-level multi-threading, (D)DoS, robots.txt, sitemap, distribution, bow-tie) [08, S. 18–32]
– Inverted indexes (data structure, normalisation, Heaps’ law, Zipf’s law, Elias encoding, etc.) [08, S. 36–51]
– Ranking (relevance vs. importance, TF-IDF, Vector Space Model, etc.) [09, S. 09–31]
– PageRank (concept, random surfer, calculation) [09, S. 35–56]
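Since the PageRank topic includes "calculation", a small iterative sketch may be useful for checking hand-worked answers. This is one common formulation (random surfer with teleportation, ranks summing to 1); the damping factor of 0.85 and the uniform handling of pages without out-links are assumptions, and the course slides may use a different variant. The function name `pagerank` and the input format are hypothetical.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Iteratively compute PageRank.

    `links` maps each page to the list of pages it links to; every
    link target must also appear as a key.
    """
    nodes = sorted(links)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}  # start from a uniform distribution
    for _ in range(iterations):
        # Rank held by pages with no out-links is spread over all pages.
        dangling = sum(rank[v] for v in nodes if not links[v])
        new_rank = {v: (1 - damping) / n + damping * dangling / n
                    for v in nodes}
        # Each page shares its current rank equally among its out-links.
        for v in nodes:
            for target in links[v]:
                new_rank[target] += damping * rank[v] / len(links[v])
        rank = new_rank
    return rank
```

Calling `pagerank({"a": ["b", "c"], "b": ["c"], "c": ["a"]})` ranks `c` highest, because it receives links from both other pages. Doing one or two iterations of the same update by hand is the kind of calculation a calculator helps with.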

14 Bring a Calculator!

15 NoSQL and Querying

16 Q4: NoSQL and Querying (Slides)
Slides:
– [03] MDP-03-Fallacies-CAP-20140324.pptx
– [10] MDP-10-Intro-to-NoSQL-20140512.pptx
– [11] MDP-11-BigTable+Cassandra.pptx
e.g., [02, S. 3–7] = Slide deck [02], slides 3 to 7
Names per the homepage: http://aidanhogan.com/teaching/cc5212-1/
Slides indicated are only a guide!

17 Q4: NoSQL and Querying (Topics)
Possible Topics:
– CAP theorem [03, S. 23–39] (<- note out of order)
– The Database Landscape [10, S. 10]
– Key–Value stores (data model, operations, distribution, consistent hashing, replication, Dynamo, Merkle trees) [10, S. 18–38]
– Document stores (high-level) [10, S. 44–45]
– Tabular/column-families (data model, Bigtable, sorting, tablets, column families, SSTables, writes, reads, compactions, hierarchy, Bloom filters) [11, S. 17–36]
– Graph databases (high-level) [11, S. 45–52]
– Cassandra (high-level) [11, S. 61–69]
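For the key–value store topic, the idea behind consistent hashing can be sketched as a ring of node positions where each key belongs to the first node clockwise from the key's hash. The sketch below is a simplified illustration, not the scheme of any particular system: the `HashRing` class is hypothetical, MD5 is used only as a convenient stable hash, and virtual nodes and replication are left out.

```python
import bisect
import hashlib


def ring_position(name):
    """Map a node name or key to a position on the ring (MD5 as an assumption)."""
    return int(hashlib.md5(name.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Hypothetical consistent-hashing ring with one position per node."""

    def __init__(self, nodes=()):
        self._positions = []  # sorted ring positions
        self._owners = {}     # ring position -> node name
        for node in nodes:
            self.add(node)

    def add(self, node):
        pos = ring_position(node)
        bisect.insort(self._positions, pos)
        self._owners[pos] = node

    def remove(self, node):
        pos = ring_position(node)
        self._positions.remove(pos)
        del self._owners[pos]

    def lookup(self, key):
        """Return the first node clockwise from the key, wrapping around."""
        idx = bisect.bisect_right(self._positions, ring_position(key))
        return self._owners[self._positions[idx % len(self._positions)]]
```

The point to take away is that removing a node only reassigns the keys that node owned; all other keys keep their owner, unlike with `hash(key) % number_of_nodes`.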

18 Final Exam (35%) Recap
Next Tuesday, 9am
– Room to be confirmed
Goal: test your understanding of concepts
– Coding covered by labs/project
– No syntax-writing questions, but there will be design and syntax-reading questions
Max. three hours (it won’t take that long!)
– Not marking you on English
– If really stuck, write in Spanish!
Four questions (marked on best three)
…
