
Use of Python for VIVO Application Programming
Mike Conlon, Nicholas Rejack and Laura Guazzelli
UF Clinical and Translational Science Institute, University of Florida

Why Python?

Python [1,2] is a popular, easy-to-learn language that is very well suited for use with VIVO and the semantic web. Python is available for Mac, Windows and Linux, and has simple procedural syntax and clear syntax for object-oriented development. Python is trivial to install, and standard installations include an integrated development environment, IDLE. Python is open source and has a strong development community. Many libraries are included in the standard distribution, and many more are available through standard Python archives. Installing an additional library typically requires a single command.

Python has a very short learning curve. Experienced programmers can install Python and write programs on their first day. Python has outstanding support for data structures, the Internet, exception handling, XML, string manipulation, CSV files, and interaction with other systems. Python is very fast to compile and execute. A 200,000-line Excel CSV file can be read and summarized in a few seconds.

Python and VIVO

Simple Python functions can make SPARQL queries, and template libraries can be used to make RDF. Python associative arrays (dictionaries) can store data from VIVO and provide extremely efficient look-up. A single query can return all people in VIVO, who can then be placed in a dictionary. Subsequent code can refer to the dictionary without having to make additional queries. Python code can quickly retrieve RDF, parse it, find additional URIs and retrieve additional RDF, thereby following semantic web graphs and identifying data properties and values.

At the University of Florida, Python is used to report from VIVO logs, generate web pages, and create RDF for ingest of people, papers, and positions held.
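The dictionary look-up described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the UF code: it assumes the query results arrive in the W3C SPARQL JSON results format, and the variable names (uri, ufid, name), the example.org URIs, the sample data and the function people_by_ufid are all hypothetical. It uses Python 3 syntax, whereas the poster's examples target Python 2.7.3.

```python
import json

# Hypothetical response text in the W3C SPARQL JSON results format.
# The variables (uri, ufid, name) and all values are made up for illustration.
SAMPLE_RESPONSE = """
{
  "head": {"vars": ["uri", "ufid", "name"]},
  "results": {"bindings": [
    {"uri":  {"type": "uri", "value": "http://example.org/individual/n1001"},
     "ufid": {"type": "literal", "value": "00000001"},
     "name": {"type": "literal", "value": "Doe, Jane"}},
    {"uri":  {"type": "uri", "value": "http://example.org/individual/n1002"},
     "ufid": {"type": "literal", "value": "00000002"},
     "name": {"type": "literal", "value": "Roe, Richard"}},
    {"uri":  {"type": "uri", "value": "http://example.org/individual/n1003"},
     "name": {"type": "literal", "value": "No Identifier"}}
  ]}
}
"""


def people_by_ufid(response_text):
    """Build a dictionary of people keyed by UFID from one query result."""
    data = json.loads(response_text)
    people = {}
    for binding in data["results"]["bindings"]:
        # Unbound variables are simply absent from a binding, so skip
        # people who have no UFID rather than failing on them.
        if "ufid" not in binding:
            continue
        people[binding["ufid"]["value"]] = {
            "uri": binding["uri"]["value"],
            "name": binding.get("name", {}).get("value", ""),
        }
    return people


# One query result, many look-ups: later code consults the dictionary
# instead of issuing a new query per person.
people = people_by_ufid(SAMPLE_RESPONSE)
print(people["00000001"]["uri"])
print("00000009" in people)
```

Keying the dictionary on the identifier that arrives with incoming data (here the UFID) is what lets later steps match a spreadsheet row to a VIVO URI without another query.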
The techniques demonstrated here can be used to ingest and report on any data in VIVO. Code from these examples will be available at VIVO repositories.

Useful Libraries

Some useful Python libraries for use with VIVO:

Pybtex – read BibTeX files into Python structures
Tempita – simple, flexible templates
csv – read and write CSV files
minidom – read, manage, write XML data
re – regular expressions in Python
datetime – ISO standard datetime processing
pyRTF – make RTF documents
urllib – create URLs, fetch web content
Entrez – query, read, process PubMed files
rdflib – tools for working with RDF
Vivotools – UF tools for SPARQL queries and for generating VIVO URIs

References

1. Python Home Page. www.python.org. Accessed 8/16/2012.
2. Ceder, VL. The Quick Python Book, 2nd ed. Greenwich, CT: Manning, 2010, 336 pgs. ISBN 9781935182207.
3. Conlon, M, Barnes, CP, Sposato, V, Rejack, N, Schmidt, E, Collante, W, Guazzelli, L, Williams, S, Raum, N. Implementation of VIVO at the University of Florida. Conference Poster, VIVO 2012, Miami, FL.
4. Google App Engine Home Page. https://appengine.google.com. Accessed 8/16/2012.

Getting Started

Visit www.python.org, download Python and click to install. Get a good, quick-read Python book. Spend a day writing code. Spend a day studying code examples. Write something simple and make it work. Write something more sophisticated. Ask questions. Use Google to find Python examples and additional libraries. Use libraries to build on existing functions.

The UF code examples use Python 2.7.3 because it is supported by Google App Engine [4]. Using Google App Engine, you can create online Python web sites and applications on Google infrastructure at no cost.

Adding People to VIVO

From a spreadsheet, RDF can be generated by Python to add people to VIVO, linking them to their home department.
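A minimal sketch of the spreadsheet-to-RDF step might look like the following. It is an illustration under assumptions, not the UF process. The column names (id, name, dept_id), the example.org base URI and the homeDepartment predicate are hypothetical. The standard library's string.Template stands in for the Tempita templates named in the library list. URIs are built directly from spreadsheet ids rather than being looked up in VIVO. The output is N-Triples, written in Python 3 syntax.

```python
import csv
import io
from string import Template

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
FOAF_PERSON = "http://xmlns.com/foaf/0.1/Person"
# Hypothetical predicate: a real ingest would use the ontology term that
# VIVO expects for linking a person to a department.
HOME_DEPT = "http://example.org/ontology/homeDepartment"

# One person becomes three N-Triples statements.
PERSON_TEMPLATE = Template(
    '<$uri> <' + RDF_TYPE + '> <' + FOAF_PERSON + '> .\n'
    '<$uri> <' + RDFS_LABEL + '> "$label" .\n'
    '<$uri> <' + HOME_DEPT + '> <$dept_uri> .\n'
)

# Hypothetical spreadsheet contents, as exported to CSV.
SAMPLE_CSV = '''id,name,dept_id
n1001,"Doe, Jane",d20
n1002,"Roe, Richard",d31
'''


def escape_literal(text):
    """Minimal escaping for an N-Triples string (backslash and quote only)."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def people_to_ntriples(csv_text, base_uri):
    """Turn each spreadsheet row into RDF statements about one person."""
    chunks = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        chunks.append(PERSON_TEMPLATE.substitute(
            uri=base_uri + row["id"],
            label=escape_literal(row["name"]),
            dept_uri=base_uri + row["dept_id"],
        ))
    return "".join(chunks)


print(people_to_ntriples(SAMPLE_CSV, "http://example.org/individual/"))
```

Keeping the RDF shape in a template separates what a person looks like in RDF from the loop that reads rows, so the same loop can serve other templates.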
Once people are in VIVO and identified via UFID, subsequent scripts can attach grants, papers, photos, courses taught, and positions held. See the UF Implementation Poster [3] for additional information on the processes used at UF to generate VIVO data representing scholarship at UF.

Making Reports

    Logs for week of 2012-08-09 have 342789 entries

    Top five users
    dsr 236086
    aa238@ufl.edu 49853
    people 20139
    ankitbaderiya@ufl.edu 14403
    nettiepa@ufl.edu 14385

    Counts of Process
    HARVEST 258296
    MANUAL 84442

    Counts of ADD/SUB
    ADD 219256
    SUB 123482

    Top five subjects
    338
    320
    315
    276
    269

    Top five predicates
    99709
    72817
    35856
    22458
    22456

    Top five Objects
    2012-08-10-04:00 45157
    2012-07-13-04:00 44927
    22740
    22454
    Python Pubs version 1.0 9573

Figure 1. Log reports from Python script

    # Python 2.7 syntax (print statement), as used in the UF examples.
    def counts(s, log, trim=100000000):
        # Column names of a log row; s selects the column to count.
        names = ["Date", "Process", "User", "ADD/SUB",
                 "Subject", "Predicate", "Object"]
        ix = names.index(s)
        print "Counts of " + s
        # Tally how often each distinct value appears in that column.
        things = {}
        for row in log:
            try:
                thing = row[ix]
                things[thing] = things.get(thing, 0) + 1
            except IndexError:
                continue  # skip rows too short to have this column
        # Print values from most to least frequent, at most trim lines.
        i = 0
        for thing in sorted(things, key=things.get, reverse=True):
            i = i + 1
            if i > trim:
                break
            print thing, things[thing]

Figure 2. Counting function from logging script

Making Web Pages

Figure 3. Web page from VIVO data

Figure 4. Complete code for web page

Adding Papers to VIVO

Mo, J, Ding, M, Maizels, M, Ahn, A H, "A Brain Representation of Persistent Throbbing in a Patient With Chronic Migraine: Evidence for the Modulation of Attention and Sensory Processing", Headache, 52, 2012, pp 901
VIVO uri http://vivo.ufl.edu/individual/n7302907706

Gu, Qun Jane, Tang, Adrian, Xu, Zhiwei, Chang, Mau-Chung Frank, "A D-Band Passive Imager in 65 Nm Cmos", IEEE Microwave and Wireless Components Letters, 22, 2012, pp 263-265.
doi: 10.1109/LMWC.2012.2192720
VIVO uri http://vivo.ufl.edu/individual/n7458921860

Gurbuz, Feyza, Pardalos, Panos M, "A Decision Making Process Application for the Slurry Production in Ceramics Via Fuzzy Cluster and Data Mining", Journal of Industrial and Management Optimization, 8, 2012, pp 285-297.
doi: 10.3934/jimo.2012.8.285
VIVO uri http://vivo.ufl.edu/individual/n9478748156

Zager, Jonathan S, Chai, Christy Y, Beasley, Georgia M, Deneve, Jeremiah L, Chen, Y Ann, Marzban, Suroosh S, Grobmyer, Stephen R, Rawal, Bhupendra, Tyler, Douglas S, Hochwald, Steven N, "A Multi-Institutional Experience of Repeat Regional Chemotherapy for Recurrent Melanoma of Extremities", Annals of Surgical Oncology, 19, 2012, pp 1637-1643.
doi: 10.1245/s10434-011-2151-z
VIVO uri http://vivo.ufl.edu/individual/n1706363420

Figure 5. Papers being added to VIVO

Python scripts read BibTeX, use SPARQL calls to find available VIVO URIs, and use templates to generate RDF. UF authors are identified and papers are linked to their profiles. Journals, authors and publishers are created if needed. Python string functions improve and standardize text. Reports summarize actions taken.

Figure 6. A profile created by Python software
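One way the "find an available VIVO URI" step could work is sketched below. The poster does not describe how the UF tools do this, so the approach is an assumption: propose a random "n" plus digits identifier, in the style of the URIs shown in Figure 5, and retry until a check reports that it is unused. The function find_available_uri, its parameters and the example.org prefix are hypothetical. The in-use check is passed in as a function so that a SPARQL-backed check could replace the in-memory set used here. Python 3 syntax.

```python
import random


def find_available_uri(is_in_use, prefix, rng=random, max_tries=1000):
    """Return a URI of the form prefix + 'n' + digits that is not yet in use.

    is_in_use: function taking a URI and returning True if it already exists
               (in practice this might wrap a SPARQL query; assumed here).
    rng:       anything with a randint(a, b) method, so tests can pass a
               seeded random.Random instance.
    """
    for _ in range(max_tries):
        candidate = prefix + "n" + str(rng.randint(1, 9999999999))
        if not is_in_use(candidate):
            return candidate
    raise RuntimeError("no available URI found in %d tries" % max_tries)


# Illustration with an in-memory set standing in for the VIVO store.
PREFIX = "http://example.org/individual/"
used = {PREFIX + "n7302907706", PREFIX + "n7458921860"}

new_uri = find_available_uri(lambda uri: uri in used, PREFIX)
used.add(new_uri)  # reserve it so later calls in the same run avoid it
print(new_uri)
```

Reserving each new URI in the local set matters when one run creates many papers, journals and authors before any of the generated RDF is loaded into VIVO.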

