Query Processing and Online Architectures T. Yang 290N 2014 Partially from Croft, Metzler & Strohman's textbook

Content
- Query processing flow and data distribution
- Experience with Ask.com online architecture
- Service programming with Neptune
- ZooKeeper

Query Processing
- Document-at-a-time: calculates complete scores for documents by processing all term lists, one document at a time.
- Term-at-a-time: accumulates scores for documents by processing term lists one at a time.
- Both approaches have optimization techniques that significantly reduce the time required to generate scores.

Document-At-A-Time

Term-At-A-Time
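The two slides above reference the textbook's document-at-a-time and term-at-a-time pseudocode figures, which did not survive transcription. As a rough illustration of the term-at-a-time approach only, here is a minimal Java sketch; the Posting record and the per-term weights are hypothetical stand-ins for real index data and feature functions.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical posting: (document id, term frequency) pair from one inverted list.
record Posting(int docId, int tf) {}

public class TermAtATime {
    // Processes one inverted list at a time, keeping a score accumulator per document.
    static Map<Integer, Double> score(List<List<Posting>> invertedLists,
                                      List<Double> termWeights) {
        Map<Integer, Double> accumulators = new HashMap<>();
        for (int t = 0; t < invertedLists.size(); t++) {
            double w = termWeights.get(t);            // e.g., an idf-style weight
            for (Posting p : invertedLists.get(t)) {  // walk the whole list for this term
                accumulators.merge(p.docId(), w * p.tf(), Double::sum);
            }
        }
        return accumulators;                           // accumulated partial scores are the final scores
    }
}
```

A document-at-a-time scorer would instead advance all term lists in parallel and finish one document's complete score before moving on, which avoids the accumulator table.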

Optimization Techniques
Term-at-a-time uses more memory for accumulators, but its data access is more efficient.
Optimizations (see the sketch after this slide):
- Read less data from inverted lists
  - e.g., skip lists
  - better for simple feature functions
- Calculate scores for fewer documents
  - Threshold-based elimination: avoid selecting documents with a low score when high-score documents are available.
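As a rough sketch of threshold-based elimination, the fragment below keeps a min-heap of the current top-k scores and skips any candidate whose cheap upper-bound estimate cannot beat the heap's minimum. The estimateMaxScore and fullScore helpers are hypothetical placeholders for real scoring code.

```java
import java.util.PriorityQueue;

public class ThresholdElimination {
    // Keep the k best scores seen so far; the heap minimum acts as the pruning threshold.
    static PriorityQueue<Double> scoreWithThreshold(Iterable<Integer> candidateDocs, int k) {
        PriorityQueue<Double> topK = new PriorityQueue<>();
        for (int doc : candidateDocs) {
            double upperBound = estimateMaxScore(doc);         // cheap bound, e.g., sum of per-term max scores
            if (topK.size() == k && upperBound <= topK.peek()) {
                continue;                                       // threshold-based elimination: skip full scoring
            }
            double score = fullScore(doc);                      // expensive full scoring
            topK.offer(score);
            if (topK.size() > k) topK.poll();                   // drop the lowest of the k+1 scores
        }
        return topK;
    }

    // Hypothetical helpers standing in for real feature functions.
    static double estimateMaxScore(int doc) { return 1.0; }
    static double fullScore(int doc) { return 0.5; }
}
```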

Other Approaches
Early termination of query processing:
- ignore high-frequency word lists in term-at-a-time
- ignore documents at the end of lists in document-at-a-time
- an unsafe optimization
List ordering:
- order inverted lists by a quality metric (e.g., PageRank) or by partial score
- makes unsafe (and fast) optimizations more likely to produce good documents

Distributed Evaluation
Basic process:
- All queries are sent to a coordination machine.
- The coordinator then sends messages to many index servers.
- Each index server does some portion of the query processing.
- The coordinator organizes the results and returns them to the user.
Two main approaches:
- Document distribution (by far the most popular)
- Term distribution
(Diagram: coordinator and index servers.)

Distributed Evaluation
Document distribution:
- Each index server acts as a search engine for a small fraction of the total collection.
- A coordinator sends a copy of the query to each of the index servers, each of which returns its top-k results.
- The results are merged into a single ranked list by the coordinator (see the sketch below).
(Diagram: index servers, each holding a subset of the documents.)
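As a rough illustration of the coordinator-side merge, the Java sketch below combines the per-server top-k lists into one globally ranked top-k list using a small min-heap; the Result record is a hypothetical stand-in for whatever each index server actually returns.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical scored result returned by one index server.
record Result(String url, double score) {}

public class ResultMerger {
    // Merge the top-k lists from every index server into a single globally ranked top-k list.
    static List<Result> mergeTopK(List<List<Result>> perServerTopK, int k) {
        PriorityQueue<Result> heap =
            new PriorityQueue<>(Comparator.comparingDouble(Result::score));  // min-heap of size <= k
        for (List<Result> serverResults : perServerTopK) {
            for (Result r : serverResults) {
                heap.offer(r);
                if (heap.size() > k) heap.poll();   // drop the current lowest-scoring result
            }
        }
        List<Result> merged = new ArrayList<>(heap);
        merged.sort(Comparator.comparingDouble(Result::score).reversed());   // best first
        return merged;
    }
}
```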

Distributed Evaluation
Term distribution:
- A single index is built for the whole cluster of machines.
- Each inverted list in that index is then assigned to one index server.
  - In most cases, the data needed to process a query is not stored on a single machine.
- One of the index servers is chosen to process the query, usually the one holding the longest inverted list.
- The other index servers send their information to that server.
- The final results are sent to the coordinator.
(Diagram: index servers, each holding a subset of the terms/postings.)

Query Popularity: Zipf
Query frequency distributions are similar to a Zipf distribution.

Caching for Popular Queries
Exploit the Zipf-like query distribution:
- Over 50% of queries repeat, so a cache can hit often.
- Some hot queries (head queries) are very popular.
Caching can significantly improve response time:
- Cache popular query results.
- Cache common inverted lists.
Inverted list caching can help even with unique queries.
The cache must be refreshed to prevent stale data. (A minimal cache sketch follows.)
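As a minimal sketch of a query-result cache, the class below uses Java's LinkedHashMap in access order to get least-recently-used eviction; the mapping from a normalized query string to a cached XML result page is an assumption for illustration, not the Ask.com implementation.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Maps a normalized query string to its cached result page (e.g., the XML answer).
public class QueryResultCache extends LinkedHashMap<String, String> {
    private final int capacity;

    public QueryResultCache(int capacity) {
        super(16, 0.75f, true);     // accessOrder = true gives LRU iteration order
        this.capacity = capacity;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<String, String> eldest) {
        return size() > capacity;   // evict the least-recently-used entry when full
    }
}
```

On a miss (get returns null), the front-end would compute the result, put it into the cache, and periodically invalidate entries so stale results are not served.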

Open-Source Search Engines
Apache Lucene/Solr:
- Full-text search with highlighting, faceted search, dynamic clustering, database integration, rich document (e.g., Word, PDF) handling, and geospatial search.
- Distributed search and index replication.
- Written in Java; Solr is built on the Apache Lucene search library.
Constellio: open-source enterprise-level search based on Solr.
Zoie (sna-projects.com/zoie/): real-time search indexing built on top of Lucene.

Open-Source Search Engines
Lemur:
- C/C++, running on Linux, Mac, and Windows.
- Includes the Indri search engine from U. Mass/CMU.
- Parses PDF, HTML, XML, and TREC documents; Word and PowerPoint parsing (Windows only).
- UTF-8 support.
Sphinx:
- Cross-platform open-source search server written in C++.
- Searches across various sources, including database servers, NoSQL storage, and flat files.
Xapian (xapian.org/): a search library written in C++.

Fee-Based Search Solutions
Google SiteSearch:
- Aimed primarily at websites, not at an intranet.
- A fully hosted solution.
- Pricing is per query per year, starting at $100 for 20,000 queries a year.
Google Mini:
- A server-based solution. Once deployed, the Mini crawls your web sites, file systems, and internal databases.
- Costs start at $1,995 (direct) plus a $995 yearly fee after the first year for indexing 50,000 documents, and scale upwards.

Ask.com Search Engine with Neptune
(Architecture diagram. Components shown: client queries, traffic load balancer, front-ends with caches, aggregators, retrievers, document abstract and document description servers, ranking, graph server, PageInfo aggregator, suggestion service, and XML caches.)

Frontends and Caches
Front-ends:
- Receive web queries.
- Direct queries through the XML cache, the compressed result cache, database retriever aggregators, and page clustering/ranking.
- Then present results to clients (as XML).
XML cache:
- Saves previously-queried search results (dynamic Web content).
- Uses these results to answer new queries, speeding up result computation by avoiding content regeneration.
Result cache:
- Contains all matched URLs for a query.
- Given a query, finds the desired part of the saved results.
- Front-ends then need to fetch a description for each URL to compose the final XML result.

Index Matching and Ranking
Retriever aggregators (index match coordinators):
- Gather results from online database partitions.
- Select the proper partitions for different customers.
Index database retrievers:
- Locate pages relevant to the query keywords.
- Select popular and relevant pages first.
- The database can be divided into many content units.
Ranking server:
- Classifies pages into topics and ranks pages.
Snippet aggregators:
- Combine descriptions of URLs from different description servers.
Dynamic snippet servers:
- Extract a proper description for a given URL.

Programming Challenges for Online Services
Challenges/requirements for online services:
- Data intensive, requiring large-scale clusters.
- Incremental scalability.
- 7 × 24 availability.
- Resource management and QoS for load spikes.
Fault tolerance:
- Operation errors
- Software bugs
- Hardware failures
There is a lack of programming support for reliable and scalable online network services and applications.

The Neptune Clustering Middleware
Neptune: clustering middleware for aggregating and replicating application modules with persistent data.
A simple and flexible programming model that shields the complexity of service discovery, load scheduling, consistency, and failover management. Code, papers, and documents are available.
- K. Shen et al., USENIX Symposium on Internet Technologies and Systems.
- K. Shen et al., OSDI; PPoPP 2003.

Example: a Neptune Clustered Service
(Diagram: front-end Web servers, each with an HTTP server and a Neptune client module, call Neptune servers over a local-area network; the Neptune servers host the index match, ranking, and snippet generation services.)

Neptune Architecture for Cluster-Based Services
Symmetric and decentralized:
- Each node can host multiple services, acting as a service provider (server).
- Each node can also subscribe to internal services from other nodes, acting as a consumer (client).
  - Advantage: supports multi-tier or nested service architectures.
Neptune components at each node:
- Application service handling subsystem.
- Load balancing subsystem.
- Service availability subsystem.

Inside a Neptune Server Node (Symmetry and Decentralization)
(Diagram: a service access point connects the node to the rest of the cluster. Service providers run in a service runtime with a service handling module. The service availability subsystem publishes availability and maintains a service availability directory. The service load-balancing subsystem uses a polling agent and a load index server. Service consumers reach providers through these subsystems.)

Availability and Load Balancing
Availability subsystem:
- Announcement once per second through IP multicast.
- Availability info kept as soft state, expiring in 5 seconds.
- Service availability directory kept in shared memory for efficient local lookup.
Load-balancing subsystem:
- Challenging for medium/fine-grained requests.
- Random polling with sampling.
- Discarding slow-responding polls (see the sketch below).
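As a rough illustration (not the actual Neptune code), the Java sketch below polls a random sample of providers for their load index, picks the least loaded one among those that answer before a deadline, and simply discards slow-responding polls. The Provider interface and its methods are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class RandomPollingBalancer {
    // Hypothetical probe interface: returns the provider's current load index.
    interface Provider { int pollLoad() throws Exception; String address(); }

    private final ExecutorService pool = Executors.newCachedThreadPool();

    // Poll a random sample of providers; pick the least loaded among those
    // that answer within the deadline, discarding slow-responding polls.
    String pickProvider(List<Provider> providers, int sampleSize, long timeoutMs) {
        List<Provider> sample = new ArrayList<>(providers);
        Collections.shuffle(sample);
        sample = sample.subList(0, Math.min(sampleSize, sample.size()));

        List<Future<Integer>> polls = new ArrayList<>();
        for (Provider p : sample) {
            polls.add(pool.submit(p::pollLoad));        // issue load probes in parallel
        }

        String best = null;
        int bestLoad = Integer.MAX_VALUE;
        long deadline = System.currentTimeMillis() + timeoutMs;
        for (int i = 0; i < polls.size(); i++) {
            long remaining = Math.max(0, deadline - System.currentTimeMillis());
            try {
                int load = polls.get(i).get(remaining, TimeUnit.MILLISECONDS);
                if (load < bestLoad) {
                    bestLoad = load;
                    best = sample.get(i).address();
                }
            } catch (TimeoutException | InterruptedException | ExecutionException e) {
                polls.get(i).cancel(true);               // discard the slow or failed poll
            }
        }
        return best;                                     // null if every polled provider was too slow
    }
}
```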

Programming Model in Neptune
- Request-driven processing model: programmers specify service methods to process each request.
- Application-level concurrency: each service provider uses a thread or a process to handle a new request and respond.
(Diagram: requests flow through the runtime to the service method, which operates on the service data.)

Cluster-Level Parallelism/Redundancy
- Large data sets can be partitioned and replicated.
- SPMD model (single program, multiple data).
- Transparent service access: Neptune provides runtime modules for service location and consistency.
(Diagram: a service request is dispatched by Neptune to provider modules in the service cluster, each running the service method on its data.)

Service Invocation from Consumers to Service Providers
Request/response messages:
- Consumer side: NeptuneCall(service_name, partition_ID, service_method, request_msg, response_msg);
- Provider side: "service_method" is a library function: Service_method(partitionID, request_msg, result_msg);
- Parallel invocation with aggregation.
Stream-based communication: Neptune sets up a bidirectional stream between a consumer and a service provider; the application invocation uses it for socket communication.
(Diagram: consumer, Neptune consumer module, Neptune provider module, service provider.)

Code Example of a Consumer Program
1. Initialize: Hp = NeptuneInitClt(LogFile);
2. Make a connection: NeptuneConnect(Hp, "IndexMatch", 0, Neptune_MODE_READ, "IndexMatchSvc", &fd, NULL);
3. Use fd as a TCP socket to read/write data.
4. Finish: NeptuneFinalClt(Hp);
(Diagram: the consumer connects to partition 0 of the IndexMatch service provider.)

Example of the Server-Side API with Stream-Based Communication
Server-side functions:
- void IndexMatchInit(Handle): initialization routine.
- void IndexMatchFinal(Handle): final processing routine.
- void IndexMatchSvc(Handle, partitionID, ConnSd): processing routine for each IndexMatch request.
(Diagram: an IndexMatch service provider hosting a partition.)

Publishing the Index Search Service
Example of a configuration file:
[IndexMatch]
SVC_DLL = /export/home/neptune/IndexTier2.so
LOCAL_PARTITION = 0,4   # Partitions hosted
INITPROC = IndexMatchInit
FINALPROC = IndexMatchFinal
STREAMPROC = IndexMatchSvc

ZooKeeper
Coordinating distributed systems as "zoo" management:
- Open-source, high-performance coordination service for distributed applications
- Naming
- Configuration management
- Synchronization
- Group services

Data Model
- Hierarchical namespace (like a metadata file system).
- Each znode has data and children.
- Data is read and written in its entirety.
- Persistent vs. ephemeral: an ephemeral znode is deleted when the creating client's session times out or when it is explicitly deleted.

ZooKeeper Operations
Operation          Description
create             Creates a znode (the parent znode must already exist)
delete             Deletes a znode (the znode must not have any children)
exists             Tests whether a znode exists and retrieves its metadata
getACL, setACL     Gets/sets the ACL (access control list) for a znode
getChildren        Gets a list of the children of a znode
getData, setData   Gets/sets the data associated with a znode
sync               Synchronizes a client's view of a znode with ZooKeeper
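As a minimal sketch of these operations using the standard ZooKeeper Java client (the /services paths, the payload, and the connection string are illustrative assumptions, not part of the lecture):

```java
import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.ZooDefs;
import org.apache.zookeeper.ZooKeeper;

public class ZnodeExample {
    public static void main(String[] args) throws Exception {
        // Connect to a ZooKeeper ensemble (assumed to be listening on localhost:2181).
        ZooKeeper zk = new ZooKeeper("localhost:2181", 5000, event -> {});

        // Persistent znode: survives the client session (parent of our ephemeral nodes).
        if (zk.exists("/services", false) == null) {
            zk.create("/services", new byte[0],
                      ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT);
        }

        // Ephemeral znode: deleted automatically when this client's session ends.
        zk.create("/services/index-server-1", "host:port".getBytes(),
                  ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL);

        // Znode data is read back in its entirety.
        byte[] data = zk.getData("/services/index-server-1", false, null);
        System.out.println(new String(data));

        zk.close();
    }
}
```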

ZooKeeper: Distributed Architecture
Start with support for a file API, stripping out what is not needed:
1) Partial writes/reads
2) Rename
Add what we do need:
- Ordered updates and strong persistence guarantees
- Conditional updates
- Watches for data changes
- Ephemeral nodes
- Generated file names