Worldwide Lexicon Brian McConnell May, 2002. WWL – Brian McConnell Worldwide Lexicon Intro Automatic discovery of dictionary, semantic net and translation.


1 Worldwide Lexicon Brian McConnell May, 2002

2 WWL – Brian McConnell Worldwide Lexicon Intro
- Automatic discovery of dictionary, semantic net and translation servers throughout the net
- Creates a standard client/server interface for communicating with servers
- Creates a distributed human computing grid (allows servers to poll idle users to enter data and score recent submissions)
- “GNUtella for language services”

3 What WWL Does
- Creates a SOAP-based interface for locating and communicating with language services
- Creates a mechanism for discovering WWL servers on the fly
- Allows any application to talk to language servers with a few lines of code
- Allows existing dictionaries and MT systems to expose their data via WWL
- Creates something similar to SETI@Home, except it taps idle users to contribute knowledge
- Creates a web services API for language services

4 What WWL Does Not Do
- Does not create a global, centrally managed dictionary (WWL is a P2P network of dictionaries and language servers)
- Does not provide machine translation services (although WWL can be used to talk to existing MT servers)
- Does not compete with existing dictionaries or translation services. It makes existing systems more accessible to applications and their users.
- Does not specify how dictionary and MT servers handle their internal processes

5 Some Example Applications
- Browser and text editor plug-ins
- Extended dictionaries for machine translation systems
- Human-assisted document translation
- Lexicon@Home client (polls users to enter data when they’re not busy)
- Multilingual chat clients (poll WWL data sources as needed to assist with translations)
- Real-time translation (via Jabber or SMS)
- Teaching aids
- User-supported dictionaries and translation memories

6 Worldwide Lexicon Protocol
- Built upon the Simple Object Access Protocol (SOAP)
- Applications communicate via a small set of SOAP methods
- HTTP CGI interface also used for data entry and user peer review
- Goal: allow developers to locate and query any WWL data source with a few lines of code
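
To make the "small set of SOAP methods" concrete, here is a minimal sketch of how a client might build a SOAP 1.1 request for a WWL call. Only the method name WWLFindServers comes from the slides. The `urn:example:wwl` namespace, the parameter names, and the `build_request` helper are assumptions made for illustration, and the actual WWL protocol spec may define them differently.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
WWL_NS = "urn:example:wwl"  # hypothetical namespace, not taken from the WWL spec


def build_request(method, params):
    """Return a SOAP 1.1 envelope (as a string) that calls `method` with `params`."""
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    call = ET.SubElement(body, f"{{{WWL_NS}}}{method}")
    for name, value in params.items():
        # Parameter names are illustrative assumptions.
        ET.SubElement(call, name).text = str(value)
    return ET.tostring(envelope, encoding="unicode")


request_xml = build_request(
    "WWLFindServers",
    {"sourceLanguage": "es", "targetLanguage": "fr", "service": "translation"},
)
print(request_xml)
```

The resulting string would be sent as the body of an HTTP POST to a supernode or server. The transport step is left out here because the endpoint URLs are not given in the slides.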

7 Protocol Overview
- Three types of methods
  - WWL server discovery and network status methods
  - WWL client/server query methods
  - Utility functions
- About a dozen methods overall

8 System Overview
- Four basic types of nodes
  - Supernodes (directory servers)
  - WWL servers (dictionaries, MT servers, semantic nets)
  - Gateways (allow non-WWL servers to present a WWL front end)
  - Client apps (plug-ins, IM clients, Lexicon@Home, etc.)

9 WWL Server Discovery
- Client app contacts a WWL supernode
- Invokes WWLFindServers() to fetch a list of active servers and gateways that can process the client’s request
- Supernode replies with a list of WWL servers, as well as information about each server’s capabilities
- WWL servers and gateways announce themselves to supernodes at startup via the WWLRegister() and WWLServerStatus() methods
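
The discovery flow above can be sketched as an in-memory directory. The three method names mirror WWLRegister(), WWLServerStatus() and WWLFindServers() from the slide, but the `ServerInfo` fields, the argument lists, and the example URLs are assumptions for illustration rather than the signatures in the WWL spec.

```python
from dataclasses import dataclass, field


@dataclass
class ServerInfo:
    url: str
    services: set          # e.g. {"translation", "dictionary"} (assumed vocabulary)
    language_pairs: set    # e.g. {("es", "fr")}
    online: bool = True


@dataclass
class Supernode:
    servers: dict = field(default_factory=dict)

    def register(self, info):
        """Stand-in for WWLRegister(): a server announces itself at startup."""
        self.servers[info.url] = info

    def server_status(self, url, online):
        """Stand-in for WWLServerStatus(): a server reports its current state."""
        if url in self.servers:
            self.servers[url].online = online

    def find_servers(self, service, source, target):
        """Stand-in for WWLFindServers(): list active servers that can handle the request."""
        return [
            s for s in self.servers.values()
            if s.online and service in s.services and (source, target) in s.language_pairs
        ]


node = Supernode()
node.register(ServerInfo("http://mt.example.org/wwl", {"translation"}, {("es", "fr")}))
node.register(ServerInfo("http://dict.example.org/wwl", {"dictionary"}, {("es", "fr")}))
node.server_status("http://dict.example.org/wwl", online=False)

matches = node.find_servers("translation", "es", "fr")
print([s.url for s in matches])
```

In a deployed system these calls would arrive as SOAP requests. Keeping the directory logic separate from the transport makes it easy to test on its own.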

10 WWL Supernodes
- Track current status of WWL servers and their peers (servers send registration and status messages)
- Client apps use supernodes to locate WWL servers and gateways on the fly (e.g. locate a Spanish-French full-text translation server)
- Supernodes also provide quality control (known WWL servers are listed first)
- Anyone can host a supernode (similar to GNUtella directory servers)
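
One simple way to list known servers first is a stable sort on a "known" flag. This is only a sketch: the slides do not say how a supernode decides that a server is known, so the `known` set and the `rank_servers` helper are assumptions.

```python
def rank_servers(urls, known):
    """Return urls with known servers first, keeping the original order within each group."""
    # sorted() is stable, and False sorts before True, so known servers move to the front.
    return sorted(urls, key=lambda url: url not in known)


candidates = [
    "http://new.example.org/wwl",
    "http://mt.example.org/wwl",
    "http://other.example.org/wwl",
]
ranked = rank_servers(candidates, known={"http://mt.example.org/wwl"})
print(ranked)
```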

11 WWL Gateways
- Translate WWL/SOAP method calls into other formats
- Can be used to talk to DICT dictionary servers
- Can be used to talk to proprietary systems
- Can do screen scraping (e.g. send a query to a web-based MT server via CGI, then scrape the results from the HTML response)
- Can even be used to cache and index static wordlists, so that they appear as WWL data sources to any WWL client
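
As an illustration of the last gateway role, the sketch below indexes a static tab-separated wordlist and answers lookups through a single query method. The `WordlistGateway` class, its `query` signature, and the tab-separated input format are hypothetical. A real gateway would wrap this logic behind the SOAP methods defined in the WWL spec.

```python
class WordlistGateway:
    """Presents a static bilingual wordlist through a WWL-style query method."""

    def __init__(self, lines, source, target):
        self.source, self.target = source, target
        self.index = {}
        for line in lines:
            if "\t" not in line:
                continue  # skip malformed lines
            word, translation = line.rstrip("\n").split("\t", 1)
            self.index.setdefault(word.strip().lower(), []).append(translation.strip())

    def query(self, word, source, target):
        """Hypothetical query entry point; returns a list of translations (possibly empty)."""
        if (source, target) != (self.source, self.target):
            return []
        return self.index.get(word.strip().lower(), [])


gateway = WordlistGateway(
    ["gato\tchat\n", "perro\tchien\n", "gato\tmatou\n"], "es", "fr"
)
print(gateway.query("Gato", "es", "fr"))
```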

12 Client/Server Communication
- Three SOAP methods allow clients to submit queries to WWL servers via a standard interface
- WWL servers reply via SOAP; results are returned to the client app in an XML data structure
- WWL interface can co-exist with other interfaces (DICT, web/CGI, WAP, etc.)

13 Typical Client Session
- Contacts WWL supernode(s) to fetch a list of active WWL servers according to language and services required
- Contacts the top-ranked WWL server to perform the query (e.g. translate a phrase from Spanish to French)
- If the query fails, contacts other WWL servers to perform the query
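
The fallback behavior in this session can be sketched as a loop over the ranked server list. The `run_query` helper and the `send` callback are assumptions for illustration. The `fake_send` function below stands in for a real SOAP call so the example runs without a network.

```python
def run_query(servers, send, text, source, target):
    """Try servers in ranked order; return (server, result) from the first that succeeds."""
    errors = []
    for server in servers:
        try:
            return server, send(server, text, source, target)
        except Exception as exc:  # a real client would catch narrower error types
            errors.append((server, exc))
    raise RuntimeError(f"all {len(servers)} servers failed: {errors}")


def fake_send(server, text, source, target):
    """Stand-in for a SOAP query; one server is simulated as unreachable."""
    if server == "http://down.example.org/wwl":
        raise ConnectionError("server unreachable")
    return f"[{source}->{target}] {text}"


ranked_servers = ["http://down.example.org/wwl", "http://mt.example.org/wwl"]
server, result = run_query(ranked_servers, fake_send, "buenos días", "es", "fr")
print(server, result)
```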

14 Application Development
- WWL defines a client/server interface
- Client and server apps can be developed and tested independently
- System is complex, but individual components are simple
- Perfect fit for the open source development model

15 Server Apps & Projects
- Updating existing dictionaries and machine translation servers for WWL and Lexicon@Home
- Building gateway servers that emulate WWL while talking to non-WWL servers (DICT, HTTP, etc.)
- Document translation servers based on the Lexicon@Home concept

16 Client Applications
- Browser/text editor plug-ins
- WWL chat clients
- Lexicon@Home clients
- Teaching aids

17 Updating Existing Servers
- As simple as adding a few scripts to respond to SOAP calls (reply via SOAP instead of HTML)
- SOAP/WWL interface co-exists with other front ends
- WWL server can be read-only, or can allow user data entry through the Lexicon@Home initiative
- Allows hundreds of existing dictionaries, encyclopedias and machine translation servers to participate in WWL with minimal effort

18 Example: WWL Chat Client
- Listens to incoming and outgoing messages
- When the user enables translation, the IM client uses WWL to contact machine translation servers as needed
- When the user enables dictionary features, the IM client assists the user in translating words and phrases when composing messages (ideal for users who know a language but are not fluent)

19 Lexicon@Home
- Distributed human computing
- Users download a small client program that polls WWL server(s) for jobs when the user is not busy
- When a WWL server has a job, it instructs the Lexicon@Home client to open the user's browser on a CGI form (the data entry form is generated by the WWL server)
- User enters the requested information (definition, translation, score for another user’s submission)
- Each user does a small amount of work; with a large population, the system learns at a rapid pace
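
The polling behavior described on this slide can be sketched as a small loop. Everything here is an assumption for illustration: the `poll_for_jobs` helper, the `form_url` job field, and the three callbacks. A real client might pass `webbrowser.open` from the Python standard library as `open_form`, and would need a platform-specific way to detect an idle user.

```python
import time


def poll_for_jobs(fetch_job, user_is_idle, open_form, max_polls, interval=0.0):
    """Poll up to max_polls times; open the server's form when the user is idle and a job exists."""
    opened = []
    for _ in range(max_polls):
        if user_is_idle():
            job = fetch_job()  # returns None when the server has no work
            if job is not None:
                open_form(job["form_url"])  # form is generated by the WWL server
                opened.append(job["form_url"])
        time.sleep(interval)
    return opened


# Simulated server responses: no job, one job, then no job again.
jobs = iter([None, {"form_url": "http://dict.example.org/form?job=1"}, None])
seen = []
opened = poll_for_jobs(lambda: next(jobs, None), lambda: True, seen.append, max_polls=3)
print(opened)
```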

20 Quality Control
- Editorial oversight (WWL servers can require some or all user submissions to be reviewed by editors and trusted users via a private CGI form)
- Randomized peer review (WWL server asks some Lexicon@Home users to score submissions from their peers)
- Hybrid system that combines randomized peer review with editorial oversight (editors focus on submissions with ambiguous scores or from unknown users)
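
The hybrid approach can be sketched as a routing rule: peer scores settle the clear cases, and editors see the rest. The 1 to 5 score scale, the thresholds, the minimum review count, and the `route_submission` helper are all assumptions, since the slides do not specify a scoring scheme.

```python
from statistics import mean


def route_submission(scores, author_known, accept_at=4.0, reject_at=2.0, min_reviews=3):
    """Return 'accept', 'reject', or 'editor' for a submission scored 1-5 by peers."""
    # Unknown authors and thinly reviewed submissions always go to an editor.
    if not author_known or len(scores) < min_reviews:
        return "editor"
    average = mean(scores)
    if average >= accept_at:
        return "accept"
    if average <= reject_at:
        return "reject"
    return "editor"  # ambiguous score: neither clearly good nor clearly bad


print(route_submission([5, 4, 5], author_known=True))
print(route_submission([3, 3, 4], author_known=True))
```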

21 Project Timeline
- WWL protocol spec is available at www.worldwidelexicon.org
- Work to develop first-generation apps (supernodes, retrofitting existing dictionary servers) is underway
- Work to develop the Lexicon@Home client is in progress
- Looking for developers to contribute to the project

22 Development Priorities
- Stable supernode server
- Source libraries for use by existing dictionary and translation servers
- WWL gateway servers (to talk to non-WWL sites)
- Lexicon@Home client
- Simple client apps (browser plug-in, IM client that links to MT servers)
