
1 DiQuest.com Intelligent Dialog Interface Solution for friendly User Interactions in Internet WEB Environment (March, 2001)
Next generation web search and question-answering technology
Oct, 2001
Gary Geunbae Lee, Dept. of CSE, Postech & DiQuest.com

2 Contents
- Commercial e-solutions: search, QA, CRM
- Natural Language Processing Technology
- Information Retrieval Technology
- Intelligent QA solutions
- Conclusions

3 Conventional search engine
Directory based
- Yahoo: everything
- AOL search: web + AOL contents
- Directhit: click monitoring for popular site top ranking
- Looksmart: human compiled web site directory
Search based
- Altavista: you know
- Excite: you know
- Lycos: from search to directory service
- Fastsearch: first to index 0.2 billion web pages
- Inktomi: highly scalable indexing system
- Google: link analysis (high precision)
Current trends: directory + search integration


5 Recent NL search and QA systems
Internet search with natural language and intelligence
- askjeeves: horizontal question-answering
- Northernlight: natural language and phrasal search (clustering)
- Empas: Korean natural language search (?)
- Lexiquest: lexipacks: ontology/dictionary for specific domain (context search)
- Oingo: meaning oriented search (big ontology)
Natural language question answering
- Neuromedia (nativeminds): chatter bot (Eliza technology)
- Easyask: database question answering
- Brightware: web, email question answering (FAQ finding), recommendation
- inquizit technology: natural language semantic analysis (concept engine)
- YY-software: automatic email answering
- Answerlogic: WordNet based question-answering
- Answers.com: FAQ finding

6 Interaction with customers for e-business
- Internet users: over 130 million, up to 350 million by 2003 (eMarketer)
- Internet commerce: $1.3 trillion by 2003 (Forrester Research)
- From e-commerce to e-business
[Figure: e-business sophistication over time: contents, transactions, communications]
Intelligent CRM
- Customer history
- Purchase likelihood
- Staffing requirements
- Prior information history
- Corporate policy about service, etc.

7 Customer interaction channel

8 CRM architecture: 3 different views
Integration of data warehousing & data mining, web call-center, automatic sales and marketing
[Figure labels: Web-enabled; Operational; Analytical; Collaborative; Sales force automation; Marketing Automation; Field Service Automation; Customer Service/Support; Data Warehouse; Data Mart; Marketing Automation; Data Marketing; Voice (IVR, CTI, ACD); e-Mail; Fax/Direct Mail; Web Site]
Source: META Group, June 1999

9 World wide CRM market
Year 2003 CRM market
- Application License: $8.3 billion
- Implementation: $5.2 billion
- SW Maintenance: $3.2 billion

10 Call Center solutions: integration of media
[Figure labels: Call Center; Inbound Calls; Outbound Calls; Contact Center; Fax; WWW / Email; Telephone; Kiosk; Sales Force Automation; Direct Mail]

11 Contents
- Commercial e-solutions: search, QA, CRM
- Natural Language Processing Technology
- Information Retrieval Technology
- Intelligent QA solutions
- Conclusions

12 NLP technology: eliza scripting
"Rule Heading"
a:0.2 (the rule activation level)
p:35 *what*keyword* (the pattern priority and word pattern)
r:robot's reply

a:0.5
p:60 Wh *your*job*
r:I'm a full time Verbot

a:0.4
p:30 What time * your * job over.
r:I don't get any time off, I always have to be here available for you.
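
The rule format on this slide can be sketched as a small matcher. This is a hedged illustration, not the Verbot engine: it assumes that `*` matches any run of characters, that whitespace around a wildcard is insignificant, and that the matching rule with the highest `p:` value wins. The slide does not say how the `a:` activation level is used, so the sketch only stores it. The names `Rule`, `pattern_to_regex` and `respond` are made up for the example.

```python
import re
from dataclasses import dataclass


@dataclass
class Rule:
    activation: float  # "a:" value; stored only, its semantics are not given on the slide
    priority: int      # "p:" value; assumed here: higher priority wins
    pattern: str       # word pattern with * wildcards
    reply: str         # "r:" text


def pattern_to_regex(pattern: str) -> re.Pattern:
    # Literal pieces are matched verbatim; each * becomes ".*".
    # Assumption: spaces next to a wildcard are not significant.
    parts = [re.escape(piece.strip()) for piece in pattern.split("*")]
    return re.compile(".*".join(parts), re.IGNORECASE)


RULES = [
    Rule(0.5, 60, "Wh *your*job*", "I'm a full time Verbot"),
    Rule(0.4, 30, "What time * your * job over.",
         "I don't get any time off, I always have to be here available for you."),
]


def respond(utterance: str, rules=RULES):
    # Collect every rule whose pattern covers the whole utterance.
    matches = [r for r in rules if pattern_to_regex(r.pattern).fullmatch(utterance)]
    if not matches:
        return None
    return max(matches, key=lambda r: r.priority).reply
```

For example, `respond("What is your job?")` selects the first rule, while an utterance that matches no pattern returns `None`, where a real chatter bot would fall back to a default reply.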

13 POS tagging (with morpheme analysis)
포항공대 이근배 교수님께서 신을 신고 신고하러 가신다. (Professor Lee Geunbae of POSTECH puts on his shoes and goes to file a report.)
[ 0,0 ( 0,0 )] 등 1.000000e+00(1.000000e+00) s ([)
[ 1,10( 1,1 )] 미 8.288423e-11(6.102822e-13) MPO ( 포항공대 )
[11,11( 2,2 )] 등 8.736421e-02(2.559207e-20) s (#)
[12,18( 3,3 )] 미 9.236515e-08(7.008548e-24) MPN ( 이근배 )
[19,19( 4,4 )] 등 8.736421e-02(2.939022e-31) s (#)
[20,23( 5,5 )] 등 4.469725e+00(1.564634e-25) MC ( 교수 )
[24,26( 6,6 )] 등 1.373613e+02(1.504397e-25) - ( 님 )
[27,30( 7,7 )] 등 1.307859e+01(1.831031e-25) jC ( 께서 )
[31,31( 8,8 )] 등 8.736421e-02(7.678394e-33) s (#)
[32,34( 9,9 )] 등 3.250709e+00(3.667919e-27) MC ( 신 )
[35,37(10,10)] 등 1.264760e+01(3.865534e-27) jC ( 을 )
[38,38(11,11)] 등 8.736421e-02(1.621005e-34) s (#)
[39,41(12,12)] 등 5.807344e+00(1.021970e-28) DR ( 신 )
[42,43(13,13)] 등 3.936314e+01(1.918250e-28) eCC ( 고 )
[44,44(14,14)] 등 8.736421e-02(8.044147e-36) s (#)
[45,49(15,15)] 등 8.588220e-04(1.297090e-33) MC ( 신고 )
[50,51(16,16)] 등 2.626376e+01(1.404345e-33) y ( 하 )
[52,56(17,19)] 등 1.445488e+03(1.043073e-31) eCC ( 러 )
[52,56(17,19)] 등 1.445488e+03(1.043073e-31) s (#)
[52,56(17,19)] 등 1.445488e+03(1.043073e-31) DI ( 가 )
[57,58(20,20)] 등 4.657808e+01(1.348953e-31) eGS ( 시 )
[59,61(21,21)] 등 1.841659e+01(4.754894e-31) eGE ( ㄴ다 )
[62,64(22,22)] 등 1.250000e-07(1.365400e-38) s. (.)
[65,65(23,23)] 등 2.500000e-05(1.638481e-49) s (])

14 NLP technology: postag example
Features
- Statistics + rule combination
- Tight coupling with morpheme analysis
- Morpheme graph representation
- Pattern dictionary concepts for unknown words
- 100,000 morpheme dic.
- 1,500 morpheme pattern dic.
POSTAG architecture
[Figure labels: Input sentence; Morph. Analyzer; Morpheme dic; Morpheme pattern dic; Morph adjacency table; Morpheme graph; POS tagger; POS Bigram; Syllable Trigram; Error corrector; Error Correction rules; Error corrected Morpheme graph; Parser, application]

15 Unknown word guessing
- Unknown word guessing with the morpheme pattern dic: syllable constraints for each part of speech in Korean
- Lexical probabilities for unknown words: syllable tri-gram equations
Morpheme analysis with unknown word guessing
[Figure labels: Input sentence; Morph. Analyzer; Morpheme dic; Morpheme pattern dic; Pattern dic for unknown words; Morph adjacency table; Filter; Filtering info.; Filtered Morpheme graph; POS tagger]

16 Syntactic parsing
엄마가 아이에게 심부름을 시켰다 (Mom sent the child on an errand)
[Parse tree, flattened in extraction:] /complete/end---s ([) complete \ /np---MCC ( 엄마 ) \ /v/(v\np[j 가 ]) \ / \(v/(v\np[j 가 ]))\np---jC ( 가 ) \ /v[D] \ / \ /np---TCH ( 애 ) \ / \ /v/(v\np[j 에게 ]) \ / \ / \(v/(v\np[j 에게 ]))\np---jC ( 에게 ) \ / \v[D]\{np[j 가 ]} \ / \ /np---MCC ( 심부름 ) \ / \ /v/(v\np[j 를 ]) \ / \ / \(v/(v\np[j 를 ]))\np---jC ( 을 ) \ / \v[D]\{np[j 가 ],np[j 에게 ]} \ / \v[D]\{np[j 가 ],np[j 를 ],np[j 에게 ]}---DR ( 시켜 ) \ /vp[ 었 ] \ / \vp[ 었 ]\v---eGSt ( ㅆ ) \ /s[ 서술 ] \ / \s[ 서술 ]\vp---eGEs ( 다 ) \ /s[ 서술 ] \ / \s\s---s. (.) \end \end\X---s (])

17 Syntactic parsing: pospar example
Korean CCG rules
- Functional application
  X/(Args ∪ {Y}) Y → X/Args
  Y X\(Args ∪ {Y}) → X\Args
- Composition
  X/(X\Args_X) Y/(Y\Args_Y) → X/(X\(Args_X ∪ Args_Y))
  Y\Args_Y X\(Args_X ∪ {Y}) → X/(X\(Args_X ∪ Args_Y))
- Coordination
  X CONJ X → X
- Variable category: $v, $vp
- Featured category
  v : D, H, I, E
  vp : 었, 었었, 고있, 어있, 겠, 더, 시
  s : declarative (평서), interrogative (의문), imperative (명령), propositive (청유), promissive (약속), sentence (문장)
  np : j 이, j 를, j 에게
Syntactic dic. and syntactic pattern dic.
POSPAR architecture
[Figure labels: Morpheme graph; Syntactic Analyzer; Syntax dic; Syntax pattern dic; Syntactic Category Trigram; Korean CCG; Parse tree; Semantic Analyzer]

18 Semantic analysis: example
자연어처리를 전공한 교수가 가르치는 과목은 ? (What is the course name that a professor whose major is NLP teaches?)
--------------------------- Semantic Result --------------------------
Scope: [0, 17]
[ques, [contra, term(,X7, [and, [course,X7], [teach,EV3, term(,X6, [and, [professor,X6], [major,EV1,X6, term(,X1, [NLP,X1])]]),X7,\_:p[j 에 ];0F]])]]

19 Semantic analysis: system overview
[Figure labels: Input Sentence; Morphological Analyzer; POS Tagger; K-CCG Parser; Syntactic Trees; Semantic Analyzer; Semantic Dictionaries (base/dom/pat/user/rel); Thesauri; QLF Structures; Semantic-based Applications; Slot-Filler Generator; Topic/Subject Extractor; ...]

20 NLP technology: Korean WordNet
Map Korean words to other existing thesaurus (WordNet)
- Using bi-lingual dictionary
- Automatic mapping tools using WSD techniques
[Figure: a Korean word (kw) mapped through English words (ew_1 ... ew_m) to WordNet synsets (ws_1 ... ws_n)]

21 NLP technology: Korean WordNet
Multiple heuristics for WSD
- Maximum similarity
- Prior probability
- Sense ordering
- IS-A relation
- Word match
- Cooccurrence
Combining heuristics with machine learning techniques
- Decision tree
- Logistic regression

22 Contents
- Commercial e-solutions: search, QA, CRM
- Natural Language Processing Technology
- Information Retrieval Technology
- Intelligent QA solutions
- Conclusions

23 Several gaps in the search
[Figure labels: Search engine task; Info need; Verbal form; query; results; Query refinement; web]
Gaps
- Mis-conception
- Mis-translation
- Mis-formulation
- Polysemy/synonymy
Approaches
- Interactive QA (askjeeve)
- NLP query (easyask, lexiquest)
- Queries in context (domain) (autonomy, verity)
- Clustering (northernlight)

24 Web search vs classical IR
Classical IR
- Fixed document corpus
- Document relevancy is the goal
- Contexts (domain) and individual users (preferences) ignored
Web search
- Public web: static + dynamic (generated from RDB)
- High quality ranking is the goal (meet the user need given poor query and heterogeneity of the web)
- Various needs such as informational, navigational, transactional

25 Search engine techniques
First generation
- TF/IDF from standard IR
- Use only page data (text data)
- HTML parsing for weighting
Second generation
- Use off-page and web specific data
- Such as link (connectivity) analysis, click-through data (relevance feedback), anchor-text data
Third generation
- Answer the need behind the query
- Semantic analysis, context determination, dynamic corpus from RDB, validity (authority), cross-lingual/cross-media, question-answering, specific enterprise site search, etc.
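
The first-generation technique named above, TF/IDF ranking over page text only, can be sketched in a few lines. This is a minimal illustration under stated assumptions: whitespace tokenization, raw term frequency, and idf = log(N/df). The slide does not specify a weighting formula, and the function name `tf_idf_rank` is made up for the example.

```python
import math
from collections import Counter


def tf_idf_rank(query_terms, docs):
    """Return indices of matching documents, best first, scored by sum of tf * idf."""
    n = len(docs)
    tokenized = [doc.lower().split() for doc in docs]

    # Document frequency: number of documents containing each term.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    scored = []
    for doc_id, tokens in enumerate(tokenized):
        tf = Counter(tokens)
        score = sum(tf[t] * math.log(n / df[t]) for t in query_terms if df[t])
        scored.append((score, doc_id))

    # Highest score first; ties broken by document order; zero-score documents dropped.
    ranked = sorted(scored, key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for score, doc_id in ranked if score > 0]
```

A term that occurs in every document gets idf 0 and contributes nothing, which is one reason a poor query can leave a first-generation engine with little to rank on.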

26 Constructing high-quality corpus: Web ROBOT
- Web-page filtering: URL domain filtering, file type filtering, URL name filtering
- File collection & management: manage each independent site, saved by URL hierarchy; make log files
- Smart updating: save new web pages, overwrite updated pages
- Various user-input options: filtering constraints
[Figure labels: 1st trying; Target Web Document; Filtered Target URL; File info. (Date, Size, Link); Domain Site 1; Domain Site 2; Updating same Web Document; Saved File; Collected File; Result File Pool; WEBtagger; Result File Manager]
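
The three filters named on this slide (URL domain, file type, URL name) could be combined as below. This is a hedged sketch, not the Web ROBOT's actual logic: the function name `accept_url`, its parameters, and the rule that extensionless URLs pass the file-type filter are all assumptions made for illustration.

```python
import posixpath
from urllib.parse import urlparse


def accept_url(url, allowed_domains, allowed_exts, banned_substrings):
    """Return True if a URL passes the domain, file-type and URL-name filters."""
    parsed = urlparse(url)
    host = parsed.hostname or ""

    # URL domain filtering: the host must be an allowed domain or one of its subdomains.
    if not any(host == d or host.endswith("." + d) for d in allowed_domains):
        return False

    # File type filtering: URLs without an extension are assumed to be pages.
    ext = posixpath.splitext(parsed.path)[1].lower()
    if ext and ext not in allowed_exts:
        return False

    # URL name filtering: reject URLs that contain any banned substring.
    return not any(s in url for s in banned_substrings)
```

With `allowed_domains=["postech.ac.kr"]`, `allowed_exts={".html", ".htm"}` and `banned_substrings=["cgi-bin"]`, the URL `http://www.postech.ac.kr/index.html` is accepted, while an image on the same host or a page on another domain is rejected.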

27 High quality corpus: WEB preprocessing
[Figure labels: Web document input; nROBOT (Web Crawler); Result File Manager; Result File Pool; HTML Refiner; Tag Corrector & Parser; Garbage String Filter; Sentence Extractor; Word Spacing Corrector; Entry Dic, Pattern Dic; Postag Noun Trie Dic.; C4.5 Rule; Regexp Patterns Rule; Heuristic Rule; Abbreviation Dic; Symbol-Delimiter DB; POSTAG; TTS System; SAA; POSNIR; XML DOC.; Form A; Form B; Form C; Form D]

28 High quality corpus: Automatic indexing
Indexing architecture
[Figure labels: Documents; Morpheme Analysis; POS Tagging; Term Extraction; Term weighting; index DB]
- Based on general morpheme tagging
- Term extraction
  - nominals: single terms
  - compound noun generation: using rules automatically learned; filtering through precision (preventing over-generation)
  - compound noun segmentation: based on mutual information
- Term weighting
  - for document ranking
  - based on TF, IDF measures

29 Compound nouns in indexing
POSNIR features
- Noun extraction
  ex) 철수는 회의에서 그 사건을 보고할지도 모른다. (Chulsoo may report the accident at the meeting) -> bogo (O)
  ex) 지도를 보고 길을 찾는다. (see a map and find a road) -> bogo (X)
- Compound noun segmentation
  - Compound noun patterns plus statistical collocation (mutual information)
  ex) 대학생선교회 (undergraduate missionary) -> 대학생 / 선교회 (O), 대학 (university) / 생선 (fish) / 교회 (church) (X)
- Compound noun indexing (phrasal indexing)
  - Using automatically acquired extraction rules
  - Broad coverage of compound noun pattern recognition
  ex) 증기로 움직이는 기관차 (locomotive operating by steam) -> 증기 (steam) / 기관차 (locomotive)
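
The segmentation example above (대학생 / 선교회 preferred over 대학 / 생선 / 교회) can be illustrated with a toy mutual-information scorer. This is only a sketch of the general idea, not the POSNIR implementation: the noun list, all counts, the smoothing constant for unseen pairs, and the choice to average pointwise mutual information over adjacent segments are invented for the example.

```python
import math

# Hypothetical noun dictionary and corpus counts (toy numbers, not real statistics).
NOUNS = {"대학생", "선교회", "대학", "생선", "교회"}
UNIGRAM = {"대학생": 50, "선교회": 20, "대학": 80, "생선": 30, "교회": 60}
BIGRAM = {("대학생", "선교회"): 10, ("대학", "생선"): 1, ("생선", "교회"): 1}
TOTAL = 1000


def segmentations(word):
    """Yield every split of `word` into dictionary nouns."""
    if not word:
        yield []
        return
    for i in range(1, len(word) + 1):
        head = word[:i]
        if head in NOUNS:
            for rest in segmentations(word[i:]):
                yield [head] + rest


def pmi(a, b):
    """Pointwise mutual information of adjacent nouns; unseen pairs get a count of 0.5."""
    joint = BIGRAM.get((a, b), 0.5) / TOTAL
    return math.log(joint / ((UNIGRAM[a] / TOTAL) * (UNIGRAM[b] / TOTAL)))


def best_segmentation(word):
    """Pick the split whose adjacent segments have the highest average PMI."""
    candidates = [s for s in segmentations(word) if len(s) > 1]
    if not candidates:
        return [word]

    def average_pmi(segments):
        pairs = list(zip(segments, segments[1:]))
        return sum(pmi(a, b) for a, b in pairs) / len(pairs)

    return max(candidates, key=average_pmi)
```

With these toy counts, `best_segmentation("대학생선교회")` prefers the two-noun split because 대학생 and 선교회 co-occur far more often than chance, while the pairs in the three-noun split do not.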

30 Dealing with user queries: NL query
[Figure labels: NL Query; NLP Engine; Morpheme Analysis; Tagging; Tagged Sequence; Query Term Extraction and Boolean Formulation; Boolean Operation and Ranking; DB Search; DB; Search Result]
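
The flow on this slide (analyze the natural language query, extract terms, formulate a Boolean query, search) might look like the sketch below. It is an assumption-laden stand-in: a small English stopword list replaces the morpheme analysis and tagging stage, the Boolean formulation is a plain AND with an OR fallback, and the names `build_index`, `extract_terms` and `boolean_search` are hypothetical.

```python
from collections import defaultdict

# Stand-in for the NLP engine: a real system would keep terms by POS tag instead.
STOPWORDS = {"what", "is", "the", "of", "a", "who", "where", "in"}


def build_index(docs):
    """Inverted index: term -> set of document ids."""
    index = defaultdict(set)
    for doc_id, text in enumerate(docs):
        for token in text.lower().split():
            index[token].add(doc_id)
    return index


def extract_terms(query):
    """Keep content words of the natural language query."""
    tokens = [t.strip("?.,!").lower() for t in query.split()]
    return [t for t in tokens if t and t not in STOPWORDS]


def boolean_search(query, index):
    """AND all query terms; if nothing matches, fall back to OR."""
    terms = extract_terms(query)
    if not terms:
        return []
    postings = [index.get(t, set()) for t in terms]
    hits = set.intersection(*postings)
    if not hits:
        hits = set.union(*postings)
    return sorted(hits)
```

For instance, over the documents `["price of windows me", "windows xp release", "samsung group chairman"]`, the question "What is the price of Windows ME?" reduces to the terms price, windows, me and returns only the first document.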

31 Humans extract meaning at many linguistic levels, but current web search only counts words. Is it enough?
- Part of words: morpheme
- Word order
- Word lexicals
- Text structure or document structure
- Clue words/cue phrase
- Pronunciation/prosody
- World knowledge

32 NLP helps high-precision web search
Information retrieval dilemma
- Hard to ask right questions
- Too much information
- Irrelevant information
- No information (phrase mismatch)
NLP tools to help avoid the information dilemma
- Context of words: collocations
- Syntax cues: how a word is used
- Concept mapping with clustering
- Interactivity by clarifying dialog

33 Other Related Technology
[Figure labels: IR Application; XML & KM; Machine Learning; Q/A Application; Domain Ontology Mgmt Tool; Contents Auto-Builder; Text Preprocessor; Intelligent Web Robot; Text Summarizer; Text Categorizer; Document Categorization; Similar Text Clustering; Information Extraction; Wrapper Induction; K-Wordnet Auto-Builder; Answer Suggester; Multi-Lingual IR Engine; Fuzzy-SQL Generator; Shopping Aid Agent Solution; NL-Query Analyzer; FAQ Finder Solution; Korean NLP Core Engine; POS-Tagging; Syntactic Analysis; Semantic-Discourse Analysis; DBQ/A Solution; Comp-Noun Analyzer]

34 Contents
- Commercial e-solutions: search, QA, CRM
- Natural Language Processing Technology
- Information Retrieval Technology
- Intelligent QA solutions
- Conclusions

35 The third generation search engine
Natural language question-answering: answer-providing for dialog questions
Answer sentence extraction (DiQuest d-Answer)
- Pre-defined question types
- Semantic-level processing of NL query
Answer finding from FAQ (DiQuest e-Answer)
- Systematic construction of FAQ
- Finding semantically same questions from FAQ list
- Email/Web call center applications
Answer finding from R-Database (DiQuest db-Answer)
- Finding answers from R-DB attributes
- SQL conversion from natural language query
Companies
- Neuromedia, Answerfriend, Answers.com, Brightware, Answerlogic, Easyask, etc.

36 Easy Interface with Natural Dialogues
DiQuest Q/A: total dialog information retrieval solutions
- Easy and accurate information retrieval using natural language dialog
- Retrieval from any information source including internet/intranet web documents, FAQ knowledge, databases
[Figure labels: DiQuest Q/A Solution; DiQuest d-Answer; DiQuest e-Answer; DiQuest db-Answer; Other NLP Applications]

37 Why Dialog Web Interface?
Customer-side
- Efficiency: no need for web surfing
- Accuracy: exact description of search
- Convenience: using everyday dialog sentences
- Customer satisfaction guaranteed!
Company-side
- Easy to catch customers' needs in natural language query (not easy to catch customers' needs using only keyword queries)
- Customer-oriented Web content management
- Customer-oriented FAQ K/B construction and maintenance
- Personal profile management for each customer (CRM)

38 Spectrum of Products
Levels: Service, Package, Component, Library
- CRM/KM/E-commerce: shopping mall retrieval, wireless question answering, email/web call center, intranet question answering
- Information retrieval: FAQ finding, NL-SQL conversion, vertical IR, answer sentence extraction
- Document processing: answer indexing, complex term indexing, question type processing, structure indexing
- Language processing: morphology, syntax, semantics, dialogs

39 Branded Products
Brands: d-Answer, e-Answer, db-Answer
Properties
- d-Answer: vertical retrieval; high speed indexing; answer sentence extraction; optimized retrieval
- e-Answer: answer finding from FAQ knowledge base; real time FAQ construction/indexing; possible fusion with d-Answer/db-Answer
- db-Answer: SQL feature computation; automatic vocabulary construction; optimized for given RDB schema
Performance
- d-Answer: 0.1 million doc. answer sentence extraction (about 1 sec response); 1 million doc vertical IR (about 0.3 sec); platform: Linux, Solaris, HPUX
- e-Answer: over 10,000 FAQ doc. (about 0.3 sec response); more than 1000 simultaneous access; platform: Linux, Solaris
- db-Answer: 100% retrieval accuracy; over 100,000 records (0.3 sec response); platform: Linux, Solaris
Applications
- d-Answer: document search for KM/internet/portals; answer finding for KM/intranet; high precision search for wireless application
- e-Answer: email call center; web call center; automatic FAQ knowledge base construction; CRM analysis
- db-Answer: product search for e-commerce (B2B, B2C, B2G); employ portal/business portal; intranet/KM DB search
Competitors
- d-Answer: Verity, Askjeeves
- e-Answer: Brightware, Egains
- db-Answer: Easyask, ELF/Microsoft

40 DiQuest d-Answer: Vertical IR agent with answer extraction
High precision optimizable IR engine
- Horizontal IR limitations: focusing on high speed indexing, sacrificing high precision
- Why vertical IR?
  - User intention analysis using language processing
  - Optimization possible for specific domain/portal
Intelligent IR engine for answer sentence extraction
- Conventional natural language IR (e.g. askjeeves) limitations
  - Only provides documents which possibly include query terms
  - It is the USER who needs to find exact information in the documents
- Why Q/A system?
  - Provide direct answers (information) rather than thousands of documents
  - Towards the true meaning of information retrieval (next generation IR)

41 DiQuest d-Answer
DiQuest d-Answer merits
- Spectrum of solutions from high precision IR to intelligent question-answering system with natural language dialog query
- Web site question answering engine: extracts sentences that contain possible answers, as well as documents, for users' questions
DiQuest d-Answer: question examples
- "삼성그룹 회장은?" (Who is the chairman of Samsung group?)
- "야후코리아의 홈페이지 주소와 김경희 팀장의 이메일은?" (What are Yahoo Korea's homepage address and team leader Kim Kyung-hee's email?)
- "야후코리아의 사장은 누구인가" (Who is the president of Yahoo Korea?)
- "윈도우 미의 가격?" (What is the price of Windows ME?)
- "물건 반납에 관한 것을 상담하려면 어디에 전화해야 하나요?" (Where should I call to ask about returning an item?)
- "화공과는 어디에 있나요" (Where is the CE dept.?)
[Figure labels: Query Analysis; Result File & Answer: "R관" (R building), "공학관 7층" (Engineering building, 7th floor)]

42 DiQuest d-Answer Preview
[Screenshot: Answer Suggestions]

43 DiQuest SiteQ: Natural Language Answer Extraction
System Architecture

44 DiQuest e-Answer
FAQ finding engine
- FAQ: frequently asked question knowledge-base (question/answer pairs)
- 80% of user questions can be processed using well constructed FAQ lists
- Automatically finding optimized answers from FAQ lists
- Reducing email/phone calls using automatic FAQ finding solutions (customer satisfaction increased)
Finding semantically same questions from FAQ knowledge-base
- Exact pin-pointing of users' question intentions
- Structural analysis of sentences for finding same-meaning questions
- Highly precise retrieval using specialized analysis for question and answer parts in FAQ KB
- Conventional keyword IR techniques cannot retrieve semantically same questions!
Intelligent answering agent with FAQ

45 DiQuest e-Answer Preview (1)
[Screenshot: Answer Suggestion]

46 DiQuest e-Answer Preview (2)
e-Answer combined with d-Answer
[Screenshot: Answer Suggestion]

47 DiQuest FAQ Finder
System Architecture

48 DiQuest DB-Answer
Database search engine
- Information retrieval in the relational database (using SQL computation)
- Automatic term indexing by analyzing the running database
Translates users' natural language questions into standard SQL for relational database computing
- Recursive natural language query (automatic query refinement)
- Fusion solutions with e-Answer and d-Answer
- Integrated search for product description texts with product database
- Integrated search for web documents with highly variable data in structured database
Intelligent RDB search engine
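
The idea of translating a natural language question into SQL, with the vocabulary indexed from the running database, can be illustrated with a deliberately small sketch. It is not the DB-Answer method: the `product` table, its rows (including the price value), the `COLUMN_WORDS` mapping, and the single query template are all invented for the example, and recursive queries are not covered.

```python
import sqlite3

# Hypothetical vocabulary: question words mapped to columns of a toy table.
COLUMN_WORDS = {"price": "price", "maker": "maker", "manufacturer": "maker"}


def build_name_vocabulary(conn):
    """Index entity names by reading them from the running database."""
    return [row[0] for row in conn.execute("SELECT name FROM product")]


def to_sql(question, names):
    """Map a question to (sql, params) using one fixed template, or None."""
    lowered = question.lower()
    words = [w.strip("?.,!") for w in lowered.split()]
    column = next((COLUMN_WORDS[w] for w in words if w in COLUMN_WORDS), None)
    name = next((n for n in names if n.lower() in lowered), None)
    if column is None or name is None:
        return None
    # The column comes from the fixed mapping above; the value is bound as a parameter.
    return f"SELECT {column} FROM product WHERE name = ?", (name,)


def answer(conn, question):
    translated = to_sql(question, build_name_vocabulary(conn))
    if translated is None:
        return None
    sql, params = translated
    row = conn.execute(sql, params).fetchone()
    return row[0] if row else None


# Toy database with made-up data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE product (name TEXT, price INTEGER, maker TEXT)")
conn.execute("INSERT INTO product VALUES ('Windows ME', 100, 'Microsoft')")
```

Here `answer(conn, "What is the price of Windows ME?")` runs `SELECT price FROM product WHERE name = ?` with the parameter `'Windows ME'`. A fuller system would need linguistic analysis rather than word lookup to handle joins, comparisons, and follow-up questions.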

49 DiQuest DB-Answer Preview (1)
[Screenshot: SQL generation after natural language query analysis]

50 DiQuest DB-Answer Preview (2)
[Screenshot: maintaining discourse over previous results]

51 DiQuest DBQ: Natural Language SQL Interface
System Architecture

52 Web Total Q/A
System Architecture

53 E-commerce applications
SAA (Shopping Aid Agent): web mining back-end solution
- Web robots
  - Category specific web crawling (remove duplicates)
- Categorizer
  - Categorize the web documents into the pre-defined domain classes
- Extractor
  - Web information extraction to build R-db
  - Extraction using mDTD (modified Document Type Definition)
  - Sequential mDTD learning to generate new mDTD rules
- Natural language query to automatically constructed RDB
- Comparison-based shopping, automatic job search, continued education
- Whizbanglabs.com (from CMU)

54 E-commerce SAA
Basic idea
[Figure labels: SGML Documents; DTD (DocType Definition); Analysis & Encoding; Training Documents (structured HTML); mDTD; Learning; Extraction; Web Documents (structured and semi-structured documents)]

55 E-commerce SAA Example
[Figure labels: Web Robot; Seed URLs; HTML Gathering; HTML Documents; Categorizer; knn Bi-categorizing; Domain Documents; Sequential mDTD Learner; Seed mDTDs; Sequential Learning; Structured Documents; Structured/Semi-structured Documents; Learned mDTD; mDTD Parsing; Extractor; Extraction; Domain DB for AV Extraction; Slot Filling; DB building]

56 Contents
- Commercial e-solutions: search, QA, CRM
- Natural Language Processing Technology
- Information Retrieval Technology
- Intelligent QA solutions
- Conclusions

57 NLP QA/vertical search applications
- Internet/intranet vertical retrieval
- eCRM/web-based CRM (automated call center)
- Comparison based e-shopping mall/meta mall
- WAP enabled PDA/cell phone retrieval
- KMS embedded solutions
- Voice enabled retrieval/voice portal retrieval

58 Future perspectives
Long term future
- Apple's bow tied man: new millennium dream
- SF films: "angel" in the movie "Disclosure"
- HAL in "2001: A Space Odyssey" (forever dream?)
Short term future
- General Magic's portico system (http://www.genmagic.com/portico/portico_home.shtml)
- Microsoft persona project: peedy (http://msdn.microsoft.com/workshop/c-frame.htm#/workshop/imedia/agent/default.asp)
- Diquest.com: total QA solution (www.diquest.com demo)

