Full text indexing of multi character PDF documents as ADAM digital objects. V18 RC 2089
This presentation applies to Version 18 and up.
Presenter: Yoel Kortick

2 Introduction
This presentation shows how multi character PDF documents may be indexed with “full text indexing” in ADAM. We show PDF documents here, but the same workflow and functionality also apply to Word documents. The functionality presented here is available from version 18 rep change 2089 and up.

3 Introduction
This is not a general presentation explaining how to use ADAM and full text indexing. Rather, it explains one very specific area and how it relates to rep change 2089. For more about using ADAM and full text indexing, see:
1. The directory “ADAM” on the Doc Portal under: Aleph > Tree Search > How to from support
2. The Aleph user guide

4 rep_change 2089
Description: Full text indexing - when the file to index was of type "pdf" and contained non-ASCII UTF-8 characters, the indexing sometimes failed. This has been corrected.
Module: INDEXING
Change Type (T[ech]/D[ev]/B[ug]): B
SI number: , , , ,
Unix files:
./alephm/source/butil/b_manage_91_a_f.cbl
./alephm/source/check_record/check_z403.cbl
./usm50/tab/pc_tab_exp_field.eng

5 rep_change 2089
Implementation Notes: In order to index a file of type pdf containing characters which are not in any range of ISO8859, it is recommended to define the VIEW file as UTF-8. To do this, add the following line to [adm_library]/tab/pc_tab_exp_field.lng, after the existing lines of type OBJECT-CHAR-SET:
! !!!!!!!!!!!!!!!!!!!!-----!-!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!-!!!!!!!!!!!!>
OBJECT-CHAR-SET L UTF-8 utf8
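The recommendation above rests on the fact that each ISO 8859 part covers only a limited repertoire, so text mixing, say, Hebrew and Arabic fits no single part. The following is a minimal Python sketch of that check and is not part of Aleph. The function names are hypothetical, and it relies only on the ISO 8859 codecs that ship with the Python standard library.

```python
# Illustrative sketch only; not Aleph code. Function names are hypothetical.
# Python ships codecs for ISO 8859 parts 1-16, except part 12 (which does not exist).
ISO_8859_CODECS = [f"iso8859_{n}" for n in range(1, 17) if n != 12]


def iso8859_parts_covering(text):
    """Return the ISO 8859 codecs able to encode every character of text."""
    covering = []
    for codec in ISO_8859_CODECS:
        try:
            text.encode(codec)  # strict mode: raises on any unmappable character
        except UnicodeEncodeError:
            continue
        covering.append(codec)
    return covering


def needs_utf8(text):
    """True when no single ISO 8859 part can represent the whole text."""
    return not iso8859_parts_covering(text)


if __name__ == "__main__":
    print(iso8859_parts_covering("שלום"))   # Hebrew only
    print(needs_utf8("שלום سلام hello"))    # Hebrew, Arabic and Latin together
```

Text for which `needs_utf8` returns true is the case where the implementation notes suggest defining the VIEW file as UTF-8.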

6 The implementation
il-aleph02-a18(8) >>dlib usm50
il-aleph02-18(8) USM50-YOELK>>dt
il-aleph02-18(8) USM50-YOELK>>grep UTF-8 pc_tab_exp_field.eng
OBJECT-CHAR-SET L UTF-8 utf8
Here we have added the necessary line to $data_tab/pc_tab_exp_field.eng in the Administrative library, as instructed in the implementation notes of the rep change.
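The grep check above can also be scripted, for example when verifying several libraries at once. The sketch below is a hedged illustration, not an Aleph utility. It assumes that lines beginning with "!" are comments or column rulers and that columns are separated by whitespace. The function names and the treatment of the third field as the character set code are assumptions made for the example.

```python
# Illustrative sketch only; not an Aleph utility. Names are hypothetical.
# Assumptions: lines starting with "!" are comments or column rulers, and
# the columns of an OBJECT-CHAR-SET line are separated by whitespace.

def object_char_set_lines(table_text):
    """Return the whitespace-split fields of every OBJECT-CHAR-SET line."""
    entries = []
    for line in table_text.splitlines():
        if not line.strip() or line.startswith("!"):
            continue  # skip blank lines and comment/ruler lines
        fields = line.split()
        if fields[0] == "OBJECT-CHAR-SET":
            entries.append(fields)
    return entries


def has_utf8_entry(table_text):
    """True if some OBJECT-CHAR-SET line carries UTF-8 as its third field."""
    return any(
        len(fields) >= 3 and fields[2] == "UTF-8"
        for fields in object_char_set_lines(table_text)
    )


if __name__ == "__main__":
    sample = "! comment line\nOBJECT-CHAR-SET L UTF-8 utf8\n"
    print(has_utf8_entry(sample))
```

In practice the text would be read from the pc_tab_exp_field file of the Administrative library. The sample string here only reuses the line shown in the grep output.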

7 Character Set
After adding the line to pc_tab_exp_field.eng, the UTF-8 character set may be chosen in the “3. Technical Data” tab of the Digital Object.

8 The PDF document
Here is our PDF document with Hebrew, Arabic and English text. (Labels in the screenshot: Hebrew, Arabic, Latin.)

9 Sample
First we add the record as a digital object with type “VIEW” and character set UTF-8. Then, with the VIEW object selected, we click “Indexing” in the objects list.

10 Sample
After clicking “Indexing” in the objects list, we have an object with type “INDEX”, also with character set UTF-8.

11 Viewing the TXT full text index
In the browser tab of the object we see that the characters appear correctly. Before this fix, asterisks would often appear instead of the actual characters.
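If the text of an INDEX object is available as a string, the visual check described above could be approximated in code. The following Python sketch is an assumption-laden illustration and not an Aleph feature. The function names are hypothetical, and any threshold applied to the asterisk share would be a local choice.

```python
# Illustrative sketch only; not an Aleph feature. Names are hypothetical.

def asterisk_ratio(index_text):
    """Share of asterisks among the non-whitespace characters of the text."""
    visible = [ch for ch in index_text if not ch.isspace()]
    if not visible:
        return 0.0
    return visible.count("*") / len(visible)


def missing_words(index_text, expected_words):
    """Return the expected words that do not occur in the index text."""
    return [word for word in expected_words if word not in index_text]


if __name__ == "__main__":
    good = "שלום سلام hello"
    bad = "**** **** hello"
    print(asterisk_ratio(good), missing_words(good, ["שלום", "سلام"]))
    print(asterisk_ratio(bad), missing_words(bad, ["שלום", "سلام"]))
```

A high asterisk share together with missing expected words would suggest that the index text was produced without the UTF-8 setting. This remains a heuristic, since a document may legitimately contain asterisks.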

12 Performing a search
We will now search in the GUI, via the full text index, for two words that are in the PDF document: one in Arabic and one in Hebrew.

13 Performing a search
We will now search in the web, via the full text index, for the same two words: one in Arabic and one in Hebrew.

14 Search results
The correct record is found in the GUI.

15 Search results
The correct record is found in the web.

16 Search results
Here is the PDF from the record in the web.