Scanned Documents LBSC 796/INFM 718R Douglas W. Oard Week 8, October 29, 2007.



1 Scanned Documents LBSC 796/INFM 718R Douglas W. Oard Week 8, October 29, 2007

2

3

4 Expanding the Search Space
Scanned Docs
Identity: Harriet
“… Later, I learned that John had not heard …”

5 High Payoff Investments
[Figure: chart with axes labeled "Searchable Fraction" and "Transducer Capabilities"; items shown: OCR, MT, Handwriting, Speech]

6 The Big Picture
Find the words
Index the words
Do ranked retrieval
Use that system to find what you want
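The index-then-rank steps on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the page texts stand in for OCR output and are made up, the function names are hypothetical, and the tf-idf weighting is one common choice rather than anything the slides prescribe.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Build an inverted index: term -> {doc_id: term frequency}."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for term, tf in Counter(tokenize(text)).items():
            index[term][doc_id] = tf
    return index


def rank(query, index, num_docs):
    """Score documents by summed tf * idf over the query terms, best first."""
    scores = defaultdict(float)
    for term in tokenize(query):
        postings = index.get(term, {})
        if not postings:
            continue  # term never recognized in any page
        idf = math.log(1 + num_docs / len(postings))
        for doc_id, tf in postings.items():
            scores[doc_id] += tf * idf
    # Sort by descending score, breaking ties by document id.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Toy "recognized" pages (assumed data, not from the lecture).
docs = {
    "page1": "later I learned that John had not heard",
    "page2": "John wrote to Harriet",
    "page3": "the weather was fine",
}
index = build_index(docs)
results = rank("John heard", index, len(docs))
```

Here `results` lists `page1` ahead of `page2`, because `page1` matches both query terms and "heard" is the rarer of the two.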

7 Some Issues
Language-based search without language!
–Shape codes
Accuracy-selection effect of ranked retrieval
–Poor recognition scatters in the query-term space
Blind relevance feedback
–Based on clean text
Image-domain summaries
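To make the "shape codes" idea concrete, here is a rough sketch of character shape coding, in which each character is reduced to a coarse shape class so that words can be matched without full character recognition. The class assignments below (ascender, descender, x-height, plus separate codes for "i" and "j") are an assumption chosen for illustration; the slide does not say which coding scheme was meant.

```python
# Assumed shape classes; real schemes may group characters differently.
ASCENDERS = set("bdfhklt")
DESCENDERS = set("gpqy")


def shape_code(word):
    """Map each character of a word to a coarse shape class."""
    codes = []
    for ch in word:
        if ch.isupper() or ch.isdigit() or ch in ASCENDERS:
            codes.append("A")  # tall: rises above the x-height
        elif ch in DESCENDERS:
            codes.append("g")  # drops below the baseline
        elif ch == "i":
            codes.append("i")  # x-height with a dot
        elif ch == "j":
            codes.append("j")  # dotted and descending
        elif ch.islower():
            codes.append("x")  # plain x-height letter
        else:
            codes.append(ch)  # keep punctuation unchanged
    return "".join(codes)


code_john = shape_code("John")
code_heard = shape_code("heard")
code_board = shape_code("board")
```

With this mapping, "heard" and "board" receive the same code, which shows the trade-off: shape codes are cheap to compute from an image, but several words can collapse onto one code.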

8 Some Applications
Case management for litigation
Duplicate detection for declassification productivity and anti-tiling
Knowledge management from everything I have ever xeroxed or faxed
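As a hedged illustration of the duplicate-detection application, the sketch below compares recognized pages by overlapping word shingles and Jaccard similarity. The slide does not name a method, so this technique, the 0.5 threshold, the function names, and the sample texts are all assumptions.

```python
def shingles(text, k=3):
    """Return the set of k-word shingles (overlapping word windows)."""
    words = text.lower().split()
    if len(words) < k:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def near_duplicates(docs, threshold=0.5, k=3):
    """Return pairs of document ids whose shingle sets overlap enough."""
    ids = sorted(docs)
    sets = {doc_id: shingles(docs[doc_id], k) for doc_id in ids}
    pairs = []
    for i, left in enumerate(ids):
        for right in ids[i + 1:]:
            if jaccard(sets[left], sets[right]) >= threshold:
                pairs.append((left, right))
    return pairs


# Toy scans (assumed data); scan_b contains a simulated OCR error.
scans = {
    "scan_a": "the committee approved the budget for the new fiscal year",
    "scan_b": "the committee approved the budget for the new fisca1 year",
    "scan_c": "minutes of the annual meeting were not recorded",
}
pairs = near_duplicates(scans)
```

A single misrecognized word changes only the shingles that contain it, so `scan_a` and `scan_b` still pair up while `scan_c` does not. Comparing every pair is quadratic in the number of pages, so a large collection would need an indexing or hashing step that this sketch omits.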

9 Some Applications
Legacy Tobacco Documents Library
–http://legacy.library.ucsf.edu/
Google Books
–http://books.google.com/
George Washington’s Papers
–http://ciir.cs.umass.edu/irdemo/hw-demo/

