Part of the Commerce Business Apps Challenge We're challenging developers to look for innovative ways to utilize.


1 Part of the Commerce Business Apps Challenge: http://docbusinessapps.challenge.gov/
We're challenging developers to look for innovative ways to utilize DOC and other publicly available data to help businesses identify opportunities, grow, enhance productivity and create jobs.
$10,000 USD in prizes (1st - $5,000; 2nd - $3,000; 3rd - $2,000).
Ends: April 30, 2012 @ 11:00 PM EDT

2 Introductions:
Mike Kruger, DOC - Director of Digital Strategy (Host)
Christopher Leithiser (pronounced LightHizer), USPTO - IT Specialist (Presenter), Chris.Leithiser@uspto.gov, (703) 756-1244 (office)
If you have questions regarding the USPTO Patent and Trademark Bulk Data available from Google, Inc. for no charge, send them to: IPD@uspto.gov

3 Agenda:
- Open Government Initiative / Data.gov / Google, Inc.
- Datasets (2)
- Innovative Ideas
- Mash-ups
- Questions

4 Open Government Initiative / Data.gov / Google, Inc.: PAST, PRESENT, FUTURE

5 Datasets: U.S. Patent Grant Bibliographic Text (2001 to Present) – Part (1 of 2):
Contains the bibliographic text (i.e., front page) of each patent grant issued weekly (Tuesdays) from January 2001 to Present (excludes images/drawings).
The file formats are Standard Generalized Markup Language (SGML) in accordance with the U.S. Patent Grant Version 2.4 Document Type Definition (DTD) and eXtensible Markup Language (XML) in accordance with the U.S. Patent Grant Version 2.5; 4.0 International Common Element (ICE); 4.1 ICE; and 4.2 ICE Document Type Definitions (DTDs).
XML Resources at the USPTO: http://www.uspto.gov/products/cis/patents_xml.jsp (These are being updated.)
This product includes a pgbyyyymmdd_wknn.zip or ipgbyyyymmdd_wknn.zip file for each week [where "yyyymmdd" is a Tuesday issue date and "nn" is a two-digit, fixed-length number (with leading zero) representing the sequentially-numbered week of the year].
Within each weekly zip file are three (3) files:
- pgbyyyymmdd.xml or ipgbyyyymmdd.xml (Bibliographic information in XML ICE)
- pgbyyyymmddlst.txt or ipgbyyyymmddlst.txt (List of patent grant numbers in ascending order)
- pgbyyyymmddrpt.txt or ipgbyyyymmddrpt.html (Statistical/summary report)
Approximately 4,000 patent grants per week. Approximately 5 MB per weekly zip file.
Available from Google: http://www.google.com/googlebooks/uspto-patents-grants-biblio.html
Or available directly from the USPTO:
- https://eipweb.uspto.gov/2012/PatentGrantBibICEXML/
- https://eipweb.uspto.gov/2005/PatentGrantBibICEXML/
- https://eipweb.uspto.gov/2004/PatentGrantBibXML/
- https://eipweb.uspto.gov/2003/PatentGrantBibXML/
- https://eipweb.uspto.gov/2002/PatentGrantBibXML/
- https://eipweb.uspto.gov/2001/PatentGrantBibSGML/
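The weekly file-naming rule on this slide can be sketched in Python. This is only an illustration under stated assumptions: the function names are hypothetical, and "nn" is assumed to count the issue days of the year (first Tuesday = 01), which is consistent with the example names pba19960903_wk36.zip and pba19970107_wk01.zip given for the 1976 to 2001 dataset. The slide does not say which years use the "pgb" versus "ipgb" prefix, so the prefix is left as a parameter.

```python
import re
from datetime import date
from typing import Tuple

TUESDAY = 1  # date.weekday() numbers Monday as 0


def weekly_zip_name(issue_date: date, prefix: str = "ipgb") -> str:
    """Build a weekly archive name such as 'ipgb20120103_wk01.zip'."""
    if issue_date.weekday() != TUESDAY:
        raise ValueError(f"{issue_date} is not a Tuesday")
    # Assumption: "nn" is the ordinal of this Tuesday within the year.
    week = (issue_date.timetuple().tm_yday - 1) // 7 + 1
    return f"{prefix}{issue_date:%Y%m%d}_wk{week:02d}.zip"


# Accepts both prefixes named on the slide: pgb and ipgb.
_NAME = re.compile(r"^(i?pgb)(\d{8})_wk(\d{2})\.zip$")


def parse_weekly_zip_name(name: str) -> Tuple[str, date, int]:
    """Split an archive name into (prefix, issue date, week number)."""
    m = _NAME.match(name)
    if m is None:
        raise ValueError(f"unexpected archive name: {name!r}")
    prefix, ymd, week = m.groups()
    issue = date(int(ymd[:4]), int(ymd[4:6]), int(ymd[6:]))
    return prefix, issue, int(week)
```

For example, weekly_zip_name(date(2012, 4, 24)) yields a name ending in _wk17.zip under the assumption above; the real archive listing should be checked before relying on the week numbering.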

6 Datasets: U.S. Patent Grant Bibliographic Text (1976 to 2001) – Part (2 of 2):
Contains the bibliographic text (i.e., front page) of each patent grant issued weekly (Tuesdays) from January 1976 to December 2001 (excludes images/drawings).
The file format is a subset of the Green Book, ASCII text: https://eipweb.uspto.gov/1976/PatentGrantFullTextAPS/PatentFullTextAPSDoc_GreenBook.pdf
It includes patent number, series code and application number, type of patent, filing date, title, issue date, inventor information, assignee name at time of issue, foreign priority information, related U.S. patent documents, classification information, U.S. and foreign references, attorney, agent or firm/legal representative, Patent Cooperation Treaty (PCT) information, abstract, and, if present, Statement of U.S. Government Interest.
This product includes a yyyy.zip file for each year (1976 to 2001). All of the weekly files were concatenated into an annual file.
Within each annual zip file is one (1) file:
- yyyy.dat (Bibliographic information in ASCII)
EXCEPTION 1: Beginning 09/03/1996, weekly zip files are also provided (e.g., pba19960903_wk36.zip, which contains pba19960903.txt).
EXCEPTION 2: Beginning 01/07/1997, the weekly files appear as pba19970107_wk01.zip, which contains:
- pbayyyymmdd.txt (Bibliographic information in ASCII)
- pbayyyymmddlst.txt (List of patent grant numbers in ascending order)
- pbayyyymmddrpt.txt (Statistical/summary report)
Approximately 4,000 patent grants per week. Approximately 1.6 GB total.
Available from Google: http://www.google.com/googlebooks/uspto-patents-grants-biblio.html
Or available directly from the USPTO:
- https://eipweb.uspto.gov/2001/PatentGrantBibAPS/
- https://eipweb.uspto.gov/1977/PatentGrantBibAPS/
- https://eipweb.uspto.gov/1976/PatentGrantBibAPS/
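The annual-versus-weekly packaging rules on this slide, including the two exceptions, can be written down as a small lookup. The following Python sketch uses hypothetical function names and assumes the weekly files run through the end of 2001, as the slide title suggests.

```python
from datetime import date
from typing import List, Tuple

WEEKLY_START = date(1996, 9, 3)      # EXCEPTION 1: weekly zips begin
THREE_FILE_START = date(1997, 1, 7)  # EXCEPTION 2: weekly zips hold three files


def annual_archive(year: int) -> Tuple[str, List[str]]:
    """Return the annual zip name and its single member for 1976 to 2001."""
    if not 1976 <= year <= 2001:
        raise ValueError(f"no annual file described for {year}")
    return f"{year}.zip", [f"{year}.dat"]


def weekly_members(issue_date: date) -> List[str]:
    """Return the member files expected inside a weekly pba zip."""
    if issue_date < WEEKLY_START or issue_date.year > 2001:
        raise ValueError(f"no weekly pba file described for {issue_date}")
    stem = f"pba{issue_date:%Y%m%d}"
    if issue_date < THREE_FILE_START:
        return [f"{stem}.txt"]  # bibliographic text only
    return [f"{stem}.txt", f"{stem}lst.txt", f"{stem}rpt.txt"]
```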

7 Datasets: U.S. Patent Application Publication Bibliographic Text (March 15, 2001 to Present):
Contains the bibliographic text (i.e., front page) of each patent application publication (non-provisional utility and plant) published weekly (Thursdays) from March 15, 2001 to Present (excludes images/drawings).
The file formats are eXtensible Markup Language (XML) in accordance with the U.S. Patent Application Version 1.5; 1.6; 4.0 International Common Element (ICE); 4.1 ICE; and 4.2 ICE Document Type Definitions (DTDs).
XML Resources at the USPTO: http://www.uspto.gov/products/cis/patents_xml.jsp (These are being updated.)
This product includes a pabyyyymmdd_wknn.zip or ipabyyyymmdd_wknn.zip file for each week [where "yyyymmdd" is a Thursday publication date and "nn" is a two-digit, fixed-length number (with leading zero) representing the sequentially-numbered week of the year].
Within each weekly zip file are three (3) files:
- pabyyyymmdd.xml or ipabyyyymmdd.xml (Bibliographic information in XML ICE)
- pabyyyymmddlst.txt or ipabyyyymmddlst.txt (List of published patent application numbers in ascending order)
- pabyyyymmddrpt.txt or ipabyyyymmddrpt.html (Statistical/summary report)
Approximately 5,000 patent application publications per week. Approximately 2.7 MB per weekly zip file.
Available from Google: http://www.google.com/googlebooks/uspto-patents-applications-biblio.html
Or available directly from the USPTO:
- https://eipweb.uspto.gov/2012/PatentApplBibICEXML/
- https://eipweb.uspto.gov/2005/PatentApplBibICEXML/
- https://eipweb.uspto.gov/2004/PatentApplBibXML/
- https://eipweb.uspto.gov/2003/PatentApplBibXML/
- https://eipweb.uspto.gov/2002/PatentApplBibXML/
- https://eipweb.uspto.gov/2001/PatentApplBibXML/
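As a rough illustration of working with the three-file weekly zip, the Python sketch below pulls the number list out of an archive held in memory. It assumes the "lst.txt" member holds one publication number per line, which the slide does not specify. The helper names are hypothetical, and the sample archive contents are placeholders, not real USPTO data.

```python
import io
import zipfile
from typing import List


def read_number_list(zip_bytes: bytes) -> List[str]:
    """Return the entries of the single '...lst.txt' member of a weekly zip."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
        lst_names = [n for n in zf.namelist() if n.endswith("lst.txt")]
        if len(lst_names) != 1:
            raise ValueError(f"expected one list file, found {lst_names}")
        text = zf.read(lst_names[0]).decode("ascii", errors="replace")
    # Assumption: one number per line; blank lines are skipped.
    return [line.strip() for line in text.splitlines() if line.strip()]


def make_sample_zip() -> bytes:
    """Build a tiny stand-in archive with placeholder contents."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("ipab20120105.xml", "<placeholder/>")
        zf.writestr("ipab20120105lst.txt", "20120000001\n20120000002\n")
        zf.writestr("ipab20120105rpt.html", "<html></html>")
    return buf.getvalue()
```

Calling read_number_list(make_sample_zip()) returns the two placeholder numbers in file order; with a real download, the bytes would come from the weekly zip file instead.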

8 Innovative Ideas:
- Homogenize the patent grant bibliographic text data (i.e., convert it all to the same format). Do the same for the patent application publication bibliographic data.
- Capture patent grant bibliographic text data from 1790 to 1975 using the image data.
- Build a text-searchable database (updated weekly) that includes both of the datasets discussed today. Search queries can be saved. Result sets can be saved, extracted, and tailored.
- Build a text-searchable database (updated weekly) that includes subsets of both of the datasets discussed today (e.g., Green Technology related).
- Apply the same ideas as above, but use full text (75 MB/104 MB per week) or full text with embedded images (1.4 GB/1.5 GB per week): http://www.google.com/googlebooks/uspto-patents.html
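One possible starting point for the searchable-database idea with saved queries is a small in-memory inverted index. The Python sketch below is a toy under heavy assumptions: BibRecord is a hypothetical homogenized record with only three fields, and the sample titles are invented for illustration. A real entry would need persistent storage and weekly loading.

```python
import re
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass(frozen=True)
class BibRecord:
    """Hypothetical common format for grants and application publications."""
    doc_number: str
    title: str
    abstract: str


def _tokens(text: str) -> List[str]:
    return re.findall(r"[a-z0-9]+", text.lower())


class BibIndex:
    def __init__(self) -> None:
        self._postings: Dict[str, Set[str]] = defaultdict(set)
        self.saved_queries: Dict[str, str] = {}

    def add(self, rec: BibRecord) -> None:
        # Index every word of the title and abstract under the document number.
        for tok in _tokens(rec.title + " " + rec.abstract):
            self._postings[tok].add(rec.doc_number)

    def search(self, query: str) -> List[str]:
        """Return document numbers containing every query word, sorted."""
        toks = _tokens(query)
        if not toks:
            return []
        hits = set.intersection(*(self._postings.get(t, set()) for t in toks))
        return sorted(hits)

    def save_query(self, name: str, query: str) -> None:
        self.saved_queries[name] = query

    def run_saved(self, name: str) -> List[str]:
        return self.search(self.saved_queries[name])
```

A saved query is re-run against the current index each time, so results pick up records added in later weekly loads.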

9 Mash-ups:
- Combine USPTO applicant/inventor information with other USPTO datasets (e.g., with USPTO assignments (ownership) data): http://www.google.com/googlebooks/uspto-patents-assignments.html or https://eipweb.uspto.gov/2012/PatentAsgnDailyXML/ https://eipweb.uspto.gov/2011/PatentAsgnAnnlRetroXML/
- Combine USPTO patent grants and patent application publications with other DOC data (e.g., Census or Economic data).

10 Questions: If you have questions regarding the USPTO Patent and Trademark Bulk Data available from Google, Inc. for no charge, send them to: IPD@uspto.gov

