RightFind™ XML for Mining- One Cross-Publisher Initiative to Empower Text Mining Roy S Kaufman, Managing Director, New Ventures, CCC.


1 RightFind™ XML for Mining- One Cross-Publisher Initiative to Empower Text Mining
Roy S Kaufman, Managing Director, New Ventures, CCC

2 Copyright, simplified.
Global content and licensing solutions that make copyright work for everyone
Corporate researchers sharing journal articles to support drug discovery
Publishers seeking permission to use third-party content in new works
Course creators preparing materials for student readings
950+ million rights, 12,000+ rightsholders, 35,000 customers in 140 countries
Based in Danvers, MA, USA, with an international subsidiary, RightsDirect, based in Amsterdam with a presence in Tokyo
One of EContent’s “100 Companies that Matter Most” in digital content for the last 7 years
Named one of Outsell’s “10 to Watch” in the search, aggregation and syndication segment

Copyright Clearance Center, or CCC, is a global rights broker that manages more than 600 million individual rights. CCC was started more than 30 years ago as a not-for-profit organization. CCC has relationships with similar organizations, or RROs, around the world, through which we obtain valuable non-US titles for inclusion in our licenses. CCC is dedicated to the progress of collective licensing efforts around the world and is an active member of the International Federation of Reproduction Rights Organisations (IFRRO).

CCC serves as a thought leader on copyright-related issues, providing licensing solutions that serve both copyright holders and the people who use their content. For 7 straight years, CCC has been named to EContent’s list of the “100 Companies That Matter Most” in the digital content industry. CCC also joined Google, Yahoo, Microsoft and 6 other organizations named by research specialist Outsell as one of the “10 to Watch” in search, aggregation and syndication.

3 Making Copyright Work
Rightsholders: 950+ million rights from publishers, authors, agents and creators
Content users: 35,000 companies, workers worldwide, 1,200 colleges and universities, publishers and authors
Services: Licensing Solutions, Rights Management, Content Delivery, Copyright Education

4 CCC in the World of Text Mining
Our product is like high-octane gasoline for text miners. Companies already have a text mining tool, but it runs poorly without gas. You don’t need to become proficient at selling text mining, but you’ll need to know all about gasoline…

5 Text Mining Today – Example Workflow
Diagram labels: Manual work; Text mining tools
Workflow steps: Search, Get permission, Download PDFs, Convert PDFs, Import into text mining software, Run queries, View results

Search: Perform search
Get permission: Obtain permission from publishers to mine full text for commercial use; requires text mining permission from multiple publishers
Download PDFs: Requires an automated tool or custom software to download in bulk; requires content storage and feed management
Convert PDFs: PDF is converted to a “blob of text”; no tags; loss of metadata; low fidelity of content; references induce noise
Import into text mining software: Requires structuring text into XML; article text does not have “fields”; combining content from multiple sources takes time to normalize the metadata

Here’s an example of a text mining workflow based on the information gathered in our research with text miners in the commercial life sciences. Recapping the challenges to researchers:
Difficult to obtain full-text XML
Difficult to integrate content into text mining platforms
Multiple sets of terms, conditions and file formats
Hard to negotiate and manage multiple publisher feeds

No single solution addressed these issues until now… Our service automates that laborious manual process on the left so that you can get better results faster with your text mining solution.

6 Introducing CCC’s XML for Mining Service
Build a collection of full-text articles in XML format for mining
Diagram labels: CCC’s Text Mining Service; Text Mining Software

CCC’s text mining service expands the capability of companies’ text mining efforts beyond article abstracts and Open Access articles, allowing researchers to search and download the full text from a single source, eliminating the need to manually find, acquire, license and convert articles from disparate publishers and other online sources. It enables researchers to quickly and efficiently create collections of full-text articles, from multiple publishers, in XML format for text mining.

CCC’s text mining service is specifically designed to allow users to access and obtain machine-readable content formatted in XML for loading into text mining systems such as Linguamatics I2E or IBM Watson. CCC uses the JATS format for its XML files, enabling mining tools to easily ingest the content from our system.
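To illustrate why JATS XML is easier to ingest than text converted from PDF, here is a minimal Python sketch that reads a small JATS-style article and pulls out the title and each body section as a separate field. The sample document and the function name `extract_fields` are made up for illustration; they are not CCC’s files or API, and real JATS articles carry far more markup (namespaces, nested sections, inline elements) than this handles.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed JATS-style article used only for this sketch.
SAMPLE_JATS = """<article>
  <front>
    <article-meta>
      <title-group>
        <article-title>Kinase inhibitors in oncology</article-title>
      </title-group>
    </article-meta>
  </front>
  <body>
    <sec>
      <title>Methods</title>
      <p>Cells were treated with imatinib.</p>
    </sec>
    <sec>
      <title>Results</title>
      <p>Treatment reduced proliferation.</p>
      <p>No toxicity was observed.</p>
    </sec>
  </body>
</article>"""


def extract_fields(jats_xml):
    """Return the article title and a mapping of section heading -> text."""
    root = ET.fromstring(jats_xml)

    # The article title lives under front/article-meta/title-group in JATS.
    title_el = root.find("./front/article-meta/title-group/article-title")
    title = "".join(title_el.itertext()).strip() if title_el is not None else ""

    # Only top-level body sections are read; nested <sec> elements are ignored.
    sections = {}
    for sec in root.findall("./body/sec"):
        heading = (sec.findtext("title") or "").strip()
        paragraphs = ["".join(p.itertext()).strip() for p in sec.findall("p")]
        sections[heading] = " ".join(paragraphs)

    return {"title": title, "sections": sections}


fields = extract_fields(SAMPLE_JATS)
print(fields["title"])
for heading, text in fields["sections"].items():
    print(f"[{heading}] {text}")
```

Because the section boundaries are tagged, the text arrives already split into fields, which is the structure the PDF route has to reconstruct by hand.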

7 CCC Integration with I2E
Automatically index in Linguamatics I2E
Diagram labels: CCC’s Text Mining Service; Index Directly; Linguamatics I2E

The integrated CCC and Linguamatics I2E solution enables researchers to spend less time gathering and formatting content into a mineable form so they can spend more time querying and analyzing results.

8 Benefits of CCC’s Text Mining Service
Improves the results of your text mining efforts
Saves time and money for corporations large and small, established and start-up
Ensures copyright compliance

Improves the results of your text mining efforts: Enables text miners to go beyond the abstract level to search, download and mine full-text articles in XML format from both company subscriptions and unsubscribed published material. CCC’s service gives you more accurate and richer results, enabling you to make discoveries that can only be found in the full text.

Ensures copyright compliance: Because all of the content in the service is pre-authorized for commercial text mining, you get the peace of mind that your text mining projects comply with copyright, minimizing your organization’s infringement risk.

Saves time and money: Aggregates full-text article content and normalizes metadata from multiple publishers into a secure cloud for fast and easy access, reducing the time and costs associated with article conversions, content management, and negotiations with publishers. CCC’s service accelerates access to article collections for mining, giving text mining professionals more time to focus on analysis and discovery.

9 Features of CCC’s Text Mining Service
Enables search within sections of the body text
Identifies keyword hits in article excerpts to ensure a good match
Enables the discovery of relevant unsubscribed content
Provides uniform terms and conditions for mining
Integrates with text mining tools
Employs an API for additional workflow integrations
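The first two features, searching within sections and showing keyword hits in excerpts, can be sketched in a few lines of Python once articles are available as section-tagged text. This is only an illustration of the idea under assumed data: the `ARTICLES` dictionary, the article IDs and the function `search_sections` are hypothetical and do not describe how CCC’s service or API is implemented.

```python
def search_sections(articles, keyword, section, width=30):
    """Return (article_id, excerpt) pairs for keyword hits in one named section."""
    hits = []
    needle = keyword.lower()
    for article_id, sections in articles.items():
        text = sections.get(section, "")
        pos = text.lower().find(needle)  # case-insensitive match
        if pos == -1:
            continue
        # Keep up to `width` characters of context on each side of the hit.
        start = max(0, pos - width)
        end = min(len(text), pos + len(keyword) + width)
        hits.append((article_id, text[start:end]))
    return hits


# Hypothetical articles, already split into sections.
ARTICLES = {
    "art-1": {
        "Methods": "Cells were treated with imatinib for 24 hours.",
        "References": "Smith J. Imatinib resistance. 2010.",
    },
    "art-2": {
        "Methods": "Samples were sequenced.",
        "References": "Lee K. Imatinib trials. 2012.",
    },
}

hits = search_sections(ARTICLES, "imatinib", "Methods")
for article_id, excerpt in hits:
    print(article_id, "->", excerpt)
```

Restricting the search to the Methods section returns only the first article; the second mentions the keyword only in its reference list, which is the kind of noise an untagged full-text search would pick up.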

34 Thank you! Roy S Kaufman rkaufman@copyright.com

