Solr Integration and Enhancements
Todd Hatcher
Solr has a lot of extensive features.

What is Solr?
Solr offers advanced, optimized, scalable search capabilities.
You communicate with Solr over HTTP using XML or JSON.
It includes an HTML admin interface.
Solr is built on top of Lucene, so the rich features of Lucene can be leveraged when using Solr.
Solr is very configurable.

Integration with ColdFusion
There is very little direct integration with ColdFusion.
ColdFusion communicates with Solr over HTTP.
Solr runs in its own JVM and does not share one with ColdFusion.
With the ColdFusion installation, Solr runs in a Jetty servlet container on port 8983.
Solr is exposed in production by default.
Important files are located in C:\ColdFusion9\solr\multicore.
Solr offers a lot more than what is available through cfindex, cfcollection, and cfsearch.

Solr
What is a core? It is like a Verity collection (a searchable data group).
Single core (one index) vs. multicore (multiple isolated configurations, schemas, and indexes using the same Solr instance).
C:\ColdFusion9\solr\multicore\solr.xml is the central file that points to the locations of the Solr cores' configuration and data. This is what the ColdFusion Administrator reads and writes when creating and using Solr collections.
You can put your Solr cores under your project directory and keep them in source control.
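
As a rough illustration of what solr.xml holds, here is a minimal Python sketch that reads a multicore solr.xml and lists each core with its instance directory. The sample XML is a hypothetical file in the legacy multicore layout, not the one ColdFusion ships, and the core names and paths are invented.

```python
import xml.etree.ElementTree as ET

# Hypothetical solr.xml content in the legacy multicore layout.
# Core names and directories are made up for illustration.
SAMPLE_SOLR_XML = """<solr persistent="true">
  <cores adminPath="/admin/cores">
    <core name="products" instanceDir="products"/>
    <core name="articles" instanceDir="C:/projects/site/solr/articles"/>
  </cores>
</solr>
"""


def list_cores(solr_xml_text):
    """Return a {core name: instanceDir} mapping from solr.xml text."""
    root = ET.fromstring(solr_xml_text)
    return {
        core.get("name"): core.get("instanceDir")
        for core in root.iter("core")
    }


cores = list_cores(SAMPLE_SOLR_XML)
for name, instance_dir in cores.items():
    print(f"{name} -> {instance_dir}")
```

The second core shows the idea of pointing instanceDir at a project directory so that the core's configuration can live in source control.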

[core]/conf/solrconfig.xml
This is the main configuration file for a Solr core.
The query response writer determines the format of the results. ColdFusion uses XSLT by default.
You can return JSON, XML, Python, Ruby, or PHP.
Multiple query response writers can be configured. One can be set as the default, and the others can be selected by passing the wt parameter with the writer's name (e.g. wt=json).
cfsearch-style methods will not work if the response writer is not the one ColdFusion expects.
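
To make the wt parameter concrete, the following Python sketch builds a select URL that requests the JSON response writer and reads document ids out of a canned response. The base URL combines the default port with a hypothetical core name ("products"), and the response body is invented sample data in the shape the JSON writer returns.

```python
import json
from urllib.parse import urlencode

# Assumed base URL: default port plus a hypothetical core name.
SOLR_SELECT_URL = "http://localhost:8983/solr/products/select"


def build_select_url(query, writer="json"):
    """Build a select URL that asks for a specific response writer via wt."""
    return SOLR_SELECT_URL + "?" + urlencode({"q": query, "wt": writer})


def doc_ids(json_body):
    """Extract the id of each document from a JSON-writer response body."""
    payload = json.loads(json_body)
    return [doc["id"] for doc in payload["response"]["docs"]]


# Canned body in the shape of a JSON-writer response; the documents
# themselves are invented.
SAMPLE_BODY = """
{"responseHeader": {"status": 0, "QTime": 1},
 "response": {"numFound": 2, "start": 0,
              "docs": [{"id": "a1"}, {"id": "b2"}]}}
"""

print(build_select_url("title:solr"))
print(doc_ids(SAMPLE_BODY))
```

Code that parses responses itself, as here, is independent of the writer ColdFusion expects, which is why switching the default writer affects cfsearch but not direct HTTP clients.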

[core]/conf/schema.xml
Field types map custom types to the Solr/Lucene type.
The solr.TextField type allows for analyzers.
Analyzers can run at index time or at query time.
They allow the data to be manipulated (typically filtered).
Filters are processed in the order in which they are declared.
StopFilterFactory removes common words that do not help the search results.
WordDelimiterFilterFactory can add words by splitting the original into subwords, e.g. WiFi, Wi, Fi.
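
The point that filters run in declared order can be illustrated with a toy Python pipeline. This is not Solr's implementation: the functions below are simplified stand-ins for a word-delimiter filter, a lowercase filter, and a stop filter, and the stop-word list and splitting rule are assumptions made for the example.

```python
import re

STOP_WORDS = {"a", "an", "the", "of", "and"}  # assumed, tiny stop list


def split_subwords(tokens):
    """Toy word delimiter: keep each token and add its case-change parts."""
    out = []
    for token in tokens:
        out.append(token)
        parts = re.findall(r"[A-Z][a-z]+|[a-z]+|[A-Z]+|\d+", token)
        if len(parts) > 1:
            out.extend(parts)
    return out


def lowercase(tokens):
    """Toy lowercase filter."""
    return [token.lower() for token in tokens]


def remove_stop_words(tokens):
    """Toy stop filter: drop tokens found in STOP_WORDS."""
    return [token for token in tokens if token not in STOP_WORDS]


def analyze(text, filters):
    """Split on whitespace, then apply each filter in the order given."""
    tokens = text.split()
    for apply_filter in filters:
        tokens = apply_filter(tokens)
    return tokens


split_first = analyze("The WiFi router", [split_subwords, lowercase, remove_stop_words])
lower_first = analyze("The WiFi router", [lowercase, split_subwords, remove_stop_words])
print(split_first)
print(lower_first)
```

With splitting declared first the result contains wifi, wi, fi, and router; with lowercasing declared first the case boundary is gone before the splitter sees it, so only wifi and router remain.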

[core]/conf/schema.xml (cont.)
EnglishPorterFilterFactory determines the root word from variations such as -ing and adds it to the index.
SynonymFilterFactory treats words as the same.
DoubleMetaphoneFilterFactory provides phonetic matching (better than Soundex, which Verity uses).
TextSpell/TextSpellPhrase provide “did you mean” feedback.
A dest field type can run different analyzers on the source field and store the result.
See wiki.apache.org/solr/AnalyzersTokenizersTokenFilters.
Adobe adds quite a bit to the file to create field types compatible with what was in Verity.

[core]/conf/schema.xml (cont.)
Defining fields is similar to creating a database table.
Field definitions map field names to types.
They give you the ability to store additional data.
A field can be indexed (searchable).
A field can be stored (referenced and returned with results).
A field can be required.
[field name]

Indexing
Data is sent through the API: an HTTP POST to Solr as XML, JSON, or binary.
Commit is an intensive task. Do bulk adds first, then call commit.
calls commit after each index (confirmed?)
Committing after each add would noticeably increase index time.
Efficient process: add data (queue), commit, optimize.
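
The add, commit, optimize sequence can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the update URL uses the default port with a hypothetical core name, the field names are invented, and the example at the bottom is a dry run that collects the messages rather than contacting a server.

```python
import urllib.request
import xml.etree.ElementTree as ET

# Assumed update URL: default port plus a hypothetical core name.
UPDATE_URL = "http://localhost:8983/solr/products/update"


def add_message(docs):
    """Serialize a batch of dicts as one Solr XML <add> message."""
    add = ET.Element("add")
    for doc in docs:
        doc_el = ET.SubElement(add, "doc")
        for name, value in doc.items():
            field = ET.SubElement(doc_el, "field", name=name)
            field.text = str(value)
    return ET.tostring(add, encoding="unicode")


def post_xml(body):
    """POST one XML message to the update handler (needs a running Solr)."""
    request = urllib.request.Request(
        UPDATE_URL,
        data=body.encode("utf-8"),
        headers={"Content-Type": "text/xml; charset=utf-8"},
    )
    with urllib.request.urlopen(request) as response:
        return response.status


def index_in_bulk(docs, batch_size=100, post=post_xml):
    """Send all adds in batches, then commit once and optimize once."""
    for start in range(0, len(docs), batch_size):
        post(add_message(docs[start:start + batch_size]))
    post("<commit/>")
    post("<optimize/>")


# Dry run: collect the messages instead of sending them.
sent = []
index_in_bulk(
    [{"id": i, "title": f"Document {i}"} for i in range(5)],
    batch_size=2,
    post=sent.append,
)
print(len(sent), sent[-2:])
```

Passing the sender in as the post argument keeps the batching logic testable without a server; calling index_in_bulk with the default post_xml would issue the real HTTP requests.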

Search
Syntax: field:term (*:* returns everything).
A score is generated at query time. The value itself has no meaning; scores are meaningful only relative to each other (a scale).
fq filters the query based on a supplied condition.
wt is the return type of the results (xml, json, etc.).
qt is the request handler used to process the request (the default is “standard”).
fl is the list of fields to return (each field must be stored).
q is the query string.
You can specify the start value and maxrows.
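
The parameters above can be combined into a single request URL, sketched here in Python. The base URL and the field names (title, category, id) are hypothetical. Note that in Solr's own HTTP API the paging parameters are named start and rows, which is what this sketch uses.

```python
from urllib.parse import urlencode

# Assumed base URL: default port plus a hypothetical core name.
SELECT_URL = "http://localhost:8983/solr/products/select"


def search_url(q, fq=None, fl=None, wt="json", start=0, rows=10):
    """Build a select URL from the common query parameters."""
    params = [("q", q), ("wt", wt), ("start", start), ("rows", rows)]
    if fq:
        params.append(("fq", fq))          # filter condition
    if fl:
        params.append(("fl", ",".join(fl)))  # stored fields to return
    return SELECT_URL + "?" + urlencode(params)


url = search_url(
    "title:solr",
    fq="category:books",
    fl=["id", "title", "score"],
    rows=5,
)
print(url)
```

Including score in fl returns each document's relevance score, which is only useful for comparing results within the same query.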

DisMaxRequestHandler
Declared in solrconfig.xml.
Allows simplified searching without strict syntax.
Can be configured with default weighted parameters (which can be overridden).
Causes the q parameter to be parsed differently.
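
A DisMax request with per-field weights might be built like this in Python. This sketch selects the DisMax parser with defType=dismax and passes the weights in qf; a handler declared in solrconfig.xml could instead be selected through qt, with qf given there as a default. The base URL, field names, and weights are assumptions for illustration.

```python
from urllib.parse import urlencode

# Assumed base URL: default port plus a hypothetical core name.
SELECT_URL = "http://localhost:8983/solr/products/select"


def dismax_url(user_text, field_weights):
    """Build a DisMax query URL from free text and {field: weight} boosts."""
    qf = " ".join(
        f"{field}^{weight}" for field, weight in field_weights.items()
    )
    params = {"defType": "dismax", "q": user_text, "qf": qf, "wt": "json"}
    return SELECT_URL + "?" + urlencode(params)


# The user's text is passed as typed, with no field:term syntax.
dismax = dismax_url("wifi router", {"title": 2.0, "body": 0.5})
print(dismax)
```

Supplying qf on the request is one way the configured default weights can be overridden for a single query.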

Resources
Lucene In Action
CF Solr Lib, written by Shannon Hicks: a wrapper for Solr functionality