How to Use LucidWorks Search


How to Use LucidWorks Search
Sagnik Ray Choudhury
Sagnik@psu.edu

Installation and Search Components
- Access control.
- Crawling: Aperture crawler (web, filesystem, Amazon S3 bucket).
- Information extraction: Aperture parser.
- Indexing: Lucene.
- Ranking.
- Result interface: standard/Flare interface.
lucidworks IST 441 PSU

Start Page
http://ist441.ist.psu.edu:8988

Access Control: Admin Panel
Admin screen: log in here (username admin, password admin).

Admin Dashboard
- User control
- Collections

Adding Users
- Local installation: you may or may not create users.
- Server installation: create a new user with admin privileges, then delete the admin account. Do not use PSU/IST credentials.
Figures: creating a new user; deleting admin.

Crawling: Step 1
Add a new collection with the default template.

Crawling: Choosing a Data Source
- Click on the new collection.
- Note the index size and number of documents.
- Add a new data source (web site).

Crawling: Parameter Selection
- Name, URL, crawl depth.
- Constrain to: allow crawling within the site or outside the site.
- Include paths: the particular set of pages you wish to crawl.
- Exclude paths: file types or pages you do not want to crawl.
- This is a small-scale, single-threaded crawler; for better performance, Nutch can be integrated.
http://docs.lucidworks.com/display/help/Create+a+New+Web+Site+Data+Source
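To illustrate how the crawl parameters above interact, here is a minimal Python sketch of a URL filter. It is not LucidWorks code: the function name `should_crawl`, the use of regular expressions for include/exclude paths, and the rule that an exclude match overrides an include match are all assumptions made for illustration.

```python
import re
from urllib.parse import urlparse


def should_crawl(url, seed_url, depth, max_depth,
                 include_patterns=(), exclude_patterns=(), stay_on_site=True):
    """Hypothetical filter: decide whether a discovered URL should be fetched."""
    # Crawl depth: stop following links beyond the configured depth.
    if depth > max_depth:
        return False
    # Site constraint: optionally stay on the seed URL's host.
    if stay_on_site and urlparse(url).netloc != urlparse(seed_url).netloc:
        return False
    # Exclude paths: any match rejects the URL (assumed to take priority).
    if any(re.search(p, url) for p in exclude_patterns):
        return False
    # Include paths: if given, at least one pattern must match.
    if include_patterns and not any(re.search(p, url) for p in include_patterns):
        return False
    return True


if __name__ == "__main__":
    seed = "http://example.edu/"
    print(should_crawl("http://example.edu/courses/a.html", seed, 1, 2,
                       include_patterns=[r"/courses/"],
                       exclude_patterns=[r"\.pdf$"]))
```

The example host `example.edu` and the patterns are placeholders; the actual pattern syntax accepted by the data source form is described in the LucidWorks documentation linked above.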

Starting the Crawling Process
- Click "Create" to move to the crawl-job screen.
- Start crawling (you can also add a schedule to crawl periodically).
- You can add another website by going back to the collection page (slide 8).

Information Extraction and Indexing
- Information is extracted from the crawled web pages. Default: Aperture parser. Fallback: Apache Tika.
- Extracted information: author, full text, date, etc. See http://docs.lucidworks.com/display/lweug/Overview+of+Crawling (field mapping section).
- Information extraction and indexing run concurrently with crawling.
- A “hard commit” is needed to ensure that the index is up to date.
- To learn more about the index, go to the Solr page for the collection.
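Because LucidWorks Search is built on Solr, a hard commit can in principle be requested through Solr's update handler, which accepts a `commit=true` parameter. The Python sketch below assumes the collection is reachable as a Solr core under a base URL such as `http://localhost:8888/solr`; that base URL, the collection name `collection1`, and the helper names are placeholders, not values taken from the slides.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def commit_url(solr_base, collection):
    """Build the update-handler URL that asks Solr for a commit."""
    return f"{solr_base.rstrip('/')}/{collection}/update?" + urlencode({"commit": "true"})


def hard_commit(solr_base, collection, timeout=10):
    """Send the commit request and return the HTTP status code."""
    with urlopen(commit_url(solr_base, collection), timeout=timeout) as resp:
        return resp.status


if __name__ == "__main__":
    # Assumed base URL and collection name; adjust to your installation.
    print(commit_url("http://localhost:8888/solr", "collection1"))
```

Calling `hard_commit` needs a running server, so the example only prints the URL. If the installation requires authentication, the request would need credentials as well.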

Searching
Default interface: click on the “tools” link on the top panel.
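Besides the web interface, the underlying Solr index can usually be queried over HTTP. The following Python sketch builds a query URL with the standard Solr parameters `q`, `rows`, and `wt=json`. The `/select` handler path, the base URL, and the collection name are assumptions; a LucidWorks installation may expose a different request handler or port.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def select_url(solr_base, collection, query, rows=10):
    """Build a Solr query URL that asks for JSON results."""
    params = {"q": query, "rows": rows, "wt": "json"}
    return f"{solr_base.rstrip('/')}/{collection}/select?{urlencode(params)}"


def search(solr_base, collection, query, rows=10, timeout=10):
    """Run the query and return the list of matching documents."""
    with urlopen(select_url(solr_base, collection, query, rows), timeout=timeout) as resp:
        data = json.load(resp)
    # Solr's JSON response keeps the hits under response -> docs.
    return data["response"]["docs"]


if __name__ == "__main__":
    # Assumed base URL and collection name; adjust to your installation.
    print(select_url("http://localhost:8888/solr", "collection1", "penn state", rows=5))
```

`search` requires a running server; `select_url` can be inspected offline to check the query string before sending it.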

Searching: Flare interface
- The “Apps” page links to the starting point for the Flare interface.
- For advanced searching and statistics, click on your collection.

Conclusion
- Basic crawling, indexing, and searching using LucidWorks.
- Simple to use, but does not offer much flexibility.
- Things to try: incorporating new crawlers; changing the information extraction process; changing the indexing schema and ranking functions.
Questions?