Project Goal (from the proposal) The overall goal of this two-year project is to establish a comprehensive, easily accessible public resource database.


2 Project Goal (from the proposal) The overall goal of this two-year project is to establish a comprehensive, easily accessible public resource database of images, videos, and animations of cells from a variety of organisms, covering both cell architecture and intracellular functionalities, and to stimulate the economy through the creation and retention of 18 positions (7 full-time equivalents) and immediate deployment.

3 Team
– Caroline Kane, Principal Investigator, University of California, Berkeley
– John Murray, Co-Principal Investigator, University of Pennsylvania
– Janet Iwasa, Co-Principal Investigator, Harvard Medical School
– Joan Goldberg, Executive Director, American Society for Cell Biology
– David Orloff, Manager, Image Library, American Society for Cell Biology
– John Hufnagle, Scientific Informatics Developer, MBL

4 Expert Annotation: The Value Add
– 11 annotators
– They often solicit and upload images
– They are often in contact with the scientists who produced the images
Annotators:
– Gregory Antipa, San Francisco State University
– Carrie Baker Brachmann
– Margaret I. Davis, National Institutes of Health, National Institute on Alcohol Abuse and Alcoholism
– Keigi Fujiwara, University of Rochester
– Catherine Galbraith, National Institutes of Health
– Yu-Chen Hwang, University of California, Santa Cruz
– Wallace Ip, University of Cincinnati College of Medicine
– Caroline McKeown, The Scripps Research Institute
– Linda Parysek, University of Cincinnati College of Medicine
– Ginger Withers, Whitman College
– Chris Woodcock, University of Massachusetts Amherst

5 Annotation Information
– Image description
– Ontology terms
– Attribution: 1. Names 2. PubMed IDs 3. Citations 4. Links 5. Dates
– Dimensional

6 Multiple Categories of Ontologies
Categories include:
– Biological Sources: NCBI, cell type, cellular component
– Biological Context: biological process, molecular function
– Imaging Methods
– Sample Preparation
Ontologies provide a controlled vocabulary, useful for searching and for browse categorization

7 Ontologies
– NCBI Organism Classification (NCBITaxon)
– Gene Ontology (GO): biological_process, molecular_function, cellular_component
– Cell Type (CL)
– Cell Line (MCC)
– Human Development (EHDA)
– Mouse Gross Anatomy (EMAP)
– Plant Growth (PO)
– Teleost Anatomy (TAO)
– Xenopus Anatomy (XAO)
– Zebrafish Anatomy (ZFA/ZFS)
– Human Disease (DOID)
– Mouse Pathology (MPATH)
– Biological Imaging Methods (BIM): the project now controls this ontology

8 Image Lifecycle (diagram labels): Image Data Upload, Annotation, Publish & Index, Library, Edit/Save, Retract

9 System Components (diagram labels): OMERO Image Repository Server; DB (PostgreSQL); Disk (Index, Image Data); Web Application; Annotation Web Application; Server (Harvard); Image Upload; Library Browser Requests; Annotation Browser Requests

10 Image Upload Submission (lifecycle diagram labels): Image Data Upload, Annotation, Publish & Index, Library, Edit/Save, Retract

11 Image Data Upload
– Submitter downloads the Upload Java application
– Raw image data files are selected (105 image file formats supported)
– Submitter contact information is supplied
– Submitter supplies an image description (not visible in the Library) containing technical image details for the annotators
– Submitter chooses a license type

12 Upload Process & Components (diagram labels): Submitter Machine; Java Upload App; HTTP; Importer Worker Process; OMERO Image Repository; Production Server (Harvard); DB (PostgreSQL); Disk (Index, Image Data)

13 Image Lifecycle (diagram labels): Image Data Upload, Annotation, Publish & Index, Library, Edit/Save, Retract

14 Annotation Process & Components (diagram labels): OMERO Image Repository Server; DB (PostgreSQL); Disk (Index, Image Data); Annotation Web Application (Django); Server (Harvard); Apache Server

15 Image Lifecycle (diagram labels): Image Data Upload, Annotation, Publish & Index, Library, Edit/Save, Retract

16 Publish (diagram labels): OMERO Image Repository Server; DB (PostgreSQL); Disk (Index, Image Data); Annotation Web Application; Server (Harvard); Publish; Library; Custom Indexing Plug-in; Lucene Indexer; Browser

17 Indexing The OMERO repository lets developers add a custom indexing step that generates custom search index fields and values. The Cell Library uses a custom indexing plug-in, written in Java and configured into the OMERO system. Each image is presented to the plug-in whenever it is modified.

18 Cell Library Custom Indexing: Generating Index Values
Custom Lucene document index fields:
– id
– ontology information for each term in each ontology category: term id, parent id, ancestor ids, term description, synonym description
– attribution (names, pubmed, citations, urls)
– is_recommended (for the front-page/browse "poster child" image)
– is_video
– description
– license type
– publish date (useful for Recent browsing)
– dimensions
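The index fields listed on this slide can be pictured as one flat document per image. The following is a minimal Python sketch of that idea, not the plug-in itself (which the slides describe as Java). The `ncbi` prefix is taken from the mapping file shown in the talk; the `OntologyTerm` class, the exact field names, and the sample values are assumptions made purely for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class OntologyTerm:
    """One annotated ontology term (hypothetical structure)."""
    term_id: str
    description: str
    parent_id: Optional[str] = None
    ancestor_ids: List[str] = field(default_factory=list)
    synonyms: List[str] = field(default_factory=list)


def build_index_document(image_id: int,
                         terms_by_prefix: Dict[str, List[OntologyTerm]],
                         **plain_fields) -> dict:
    """Flatten an image's annotations into index field -> value(s) pairs."""
    doc = {"id": image_id}
    for prefix, terms in terms_by_prefix.items():
        # One multi-valued field per ontology attribute, per category prefix.
        doc[f"{prefix}_term_id"] = [t.term_id for t in terms]
        doc[f"{prefix}_parent_id"] = [t.parent_id for t in terms if t.parent_id]
        doc[f"{prefix}_ancestor_ids"] = sorted(
            {a for t in terms for a in t.ancestor_ids})
        doc[f"{prefix}_term_description"] = [t.description for t in terms]
        doc[f"{prefix}_synonym_description"] = [
            s for t in terms for s in t.synonyms]
    # Non-ontology fields (description, license type, is_video, ...) pass through.
    doc.update(plain_fields)
    return doc


# Illustrative values only; the ancestor list is an abbreviated lineage.
rat = OntologyTerm(
    term_id="NCBITaxon:10116",
    description="Rattus norvegicus",
    parent_id="NCBITaxon:10114",
    ancestor_ids=["NCBITaxon:10114", "NCBITaxon:9989"],
    synonyms=["Norway rat"],
)
doc = build_index_document(
    42, {"ncbi": [rat]},
    description="Sample image", license_type="example-license",
    is_video=False, is_recommended=False,
)
print(doc["ncbi_ancestor_ids"])
```

Storing ancestor ids on every document is what would let a search for a higher-level term find images annotated with any of its descendants without walking the ontology at query time.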

19 Ontology Data Scripting (pipeline diagram labels): BioPortal Ontology REST services; Download latest ontology .obo file (Ruby); Parse .obo file (custom BioJava); JSON data; Populate PostgreSQL ontology tables (Ruby); DB (PostgreSQL)
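The parse-to-JSON step of this pipeline can be sketched in a few lines. The project's own tooling is Ruby and custom BioJava; the Python below is only an illustration of the same idea, assuming a simplified subset of the OBO format (`id`, `name`, `is_a`, `synonym` tags in `[Term]` stanzas). The `EX:` identifiers and the row layout are made up for the example.

```python
import json
import re

# A tiny made-up ontology in OBO syntax (identifiers are not real).
SAMPLE_OBO = """\
format-version: 1.2

[Term]
id: EX:0001
name: organism

[Term]
id: EX:0002
name: rodent
synonym: "rodents" EXACT []
is_a: EX:0001 ! organism

[Term]
id: EX:0003
name: rat
is_a: EX:0002 ! rodent

[Typedef]
id: part_of
name: part of
"""


def parse_obo(text):
    """Return {term_id: {"name", "parents", "synonyms"}} for [Term] stanzas."""
    terms = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("["):
            # A new stanza begins; only [Term] stanzas are collected.
            current = ({"name": "", "parents": [], "synonyms": []}
                       if line == "[Term]" else None)
            continue
        if current is None or ":" not in line:
            continue
        tag, _, value = line.partition(":")
        value = value.split(" ! ")[0].strip()  # drop trailing "! comment"
        if tag == "id":
            terms[value] = current
        elif tag == "name":
            current["name"] = value
        elif tag == "is_a":
            current["parents"].append(value)
        elif tag == "synonym":
            quoted = re.match(r'"([^"]*)"', value)
            if quoted:
                current["synonyms"].append(quoted.group(1))
    return terms


def ancestors(terms, term_id):
    """Collect all ancestors of term_id by following is_a links upward."""
    seen = []
    stack = list(terms[term_id]["parents"])
    while stack:
        parent = stack.pop()
        if parent in seen:
            continue
        seen.append(parent)
        stack.extend(terms.get(parent, {}).get("parents", []))
    return seen


terms = parse_obo(SAMPLE_OBO)
# Rows shaped for loading into an ontology table (column names are assumptions).
rows = [
    {"term_id": tid, "name": t["name"],
     "parent_id": t["parents"][0] if t["parents"] else None,
     "ancestor_ids": ancestors(terms, tid), "synonyms": t["synonyms"]}
    for tid, t in terms.items()
]
print(json.dumps(rows, indent=2))
```

Precomputing `ancestor_ids` at load time keeps the later indexing and search steps to simple lookups.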

20 Indexing Ontology Terms
Mapping file:
…
"field_mappings" : [
  { "module" : "web_annotation_module",
    "namespace" : "com.glencoesoftware.ilib.ann:ncbi",
    "name" : "NCBIORGANISMALCLASSIFICATION",
    "index_field_name_prefix" : "ncbi",
    "ontologies" : [
      { "db_table_name" : "ncbis",
        "model_klass" : "Ncbi",
        "onto_term_regex_pattern" : "NCBITaxon:[0-9]*",
        "ontology_id" : "1023" }
    ]
  },
…
Annotation XML fragment (markup lost in extraction; surviving values only):
com.glencoesoftware.ilib.ann:celltype CELLTYPE Ciliated Protist
com.glencoesoftware.ilib.ann:ncbi NCBIORGANISMALCLASSIFICATION NCBITaxon:
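One plausible reading of this slide is that each annotation's namespace and name select a mapping entry, and the entry's regex pulls the ontology term id out of the annotation value. The Python sketch below illustrates that reading only; the real plug-in is Java and its lookup logic is not shown in the slides. The mapping content is copied from the slide, while `resolve_annotation`, the use of `re.search`, and the sample term id are assumptions.

```python
import json
import re

# Mapping entry as shown on the slide, wrapped in braces to form valid JSON.
MAPPING_JSON = """
{"field_mappings": [
  {"module": "web_annotation_module",
   "namespace": "com.glencoesoftware.ilib.ann:ncbi",
   "name": "NCBIORGANISMALCLASSIFICATION",
   "index_field_name_prefix": "ncbi",
   "ontologies": [
     {"db_table_name": "ncbis", "model_klass": "Ncbi",
      "onto_term_regex_pattern": "NCBITaxon:[0-9]*", "ontology_id": "1023"}
   ]}
]}
"""


def resolve_annotation(mappings, namespace, name, value):
    """Find the mapping for an annotation and extract its ontology term id.

    Returns None when no mapping entry or regex pattern applies.
    """
    for entry in mappings["field_mappings"]:
        if entry["namespace"] != namespace or entry["name"] != name:
            continue
        for onto in entry["ontologies"]:
            match = re.search(onto["onto_term_regex_pattern"], value)
            if match:
                return {
                    "prefix": entry["index_field_name_prefix"],
                    "table": onto["db_table_name"],
                    "term_id": match.group(0),
                }
    return None


mappings = json.loads(MAPPING_JSON)
# The term id below is illustrative; the slide's own value is truncated.
hit = resolve_annotation(
    mappings,
    "com.glencoesoftware.ilib.ann:ncbi",
    "NCBIORGANISMALCLASSIFICATION",
    "NCBITaxon:10116",
)
print(hit)
```

Note that the pattern `NCBITaxon:[0-9]*` also matches a bare `NCBITaxon:` with no digits, so a stricter pattern such as `[0-9]+` may be preferable if empty ids should be rejected.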

21 Additional Indexing Artifacts Indexing also generates database data to support efficient Library browsing: – An entry is made for each ontology term in use

22 Image Lifecycle (diagram labels): Image Data Upload, Annotation, Publish & Index, Library, Edit/Save, Retract

23 System Components (diagram labels): OMERO Server; DB (PostgreSQL); Disk (Index, Image Data); Annotation Web Application; Server (Harvard); Apache; Passenger Container; Jetty Servlet Container; Library Web Service

24 Connecting to the OMERO Server (diagram labels): OMERO Server (Java); Annotation Web Application (Django/Python); Server (Harvard); Apache; Passenger Container; Jetty Servlet Container (8081,2,3,4,5); Library Web Service (Java), REST-like, with operations: search, get image annotation data, convert video-to-flash, get raw image bytes, get OME-TIFF image bytes; OMERO Ice Middleware (Java); OMERO Ice Middleware (Python)

25 Library Basic Search (screenshot labels): Primary Weighting; Secondary Weighting

26 Library Advanced Search

27 Advanced Search
– If the ontology search value exactly matches an existing term, the search returns matches against that term and its descendant terms, e.g. “rodentia” will match rat, mouse, etc.
– If the ontology search value does not match an existing ontology term, a simple text-match search is run against that ontology category
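The two-branch rule on this slide can be sketched as follows. This is a toy in-memory version in Python, assuming each indexed image carries its term id, that term's ancestor ids, and some searchable text; the production search runs against a Lucene index, and the data layout, function name, and abbreviated ancestor lists here are illustrative assumptions.

```python
# term_id -> name: a tiny stand-in for one ontology category.
TERMS = {
    "NCBITaxon:9989": "Rodentia",
    "NCBITaxon:10116": "Rattus norvegicus",
    "NCBITaxon:10090": "Mus musculus",
    "NCBITaxon:9606": "Homo sapiens",
}

# Indexed images; ancestor lists are abbreviated for the example.
DOCS = [
    {"id": 1, "term_id": "NCBITaxon:10116",
     "ancestor_ids": ["NCBITaxon:9989"], "text": "Rattus norvegicus Norway rat"},
    {"id": 2, "term_id": "NCBITaxon:10090",
     "ancestor_ids": ["NCBITaxon:9989"], "text": "Mus musculus house mouse"},
    {"id": 3, "term_id": "NCBITaxon:9606",
     "ancestor_ids": [], "text": "Homo sapiens human"},
]


def advanced_search(query, terms=TERMS, docs=DOCS):
    """Return ids of images matching an ontology search value."""
    q = query.strip().lower()
    exact_ids = {tid for tid, name in terms.items() if name.lower() == q}
    if exact_ids:
        # Exact term match: include images annotated with the term itself
        # or with any descendant (i.e. the term is among their ancestors).
        return [d["id"] for d in docs
                if d["term_id"] in exact_ids
                or exact_ids & set(d["ancestor_ids"])]
    # No such term: fall back to a plain text match within the category.
    return [d["id"] for d in docs if q in d["text"].lower()]


print(advanced_search("rodentia"))
print(advanced_search("mouse"))
```

In this sketch "rodentia" takes the exact-term branch and finds the rat and mouse images through their ancestor lists, while "mouse" is not a term name and falls through to the text match.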

28 Library Browse
Categories:
– Cell Process (GO biological_process)
– Cellular Component (GO cellular_component)
– Cell Type (cell type CL)
– Organism (NCBITaxon)
Sub-categories consist of all ontology terms currently annotated to images, captured during the Indexing phase
Efficiency (NCBI 500K+)
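The browse approach described here, recording terms in use during indexing so the browse pages never scan a whole ontology, might look like the sketch below. It is a Python illustration using an in-memory counter in place of the database entries the slides mention; the category keys, function names, and sample term ids are assumptions.

```python
from collections import Counter, defaultdict

# category -> Counter of term ids currently annotated to published images.
terms_in_use = defaultdict(Counter)


def record_image(table, annotations):
    """Called once per indexed image; annotations maps category -> term ids."""
    for category, term_ids in annotations.items():
        table[category].update(term_ids)


def browse_subcategories(table, category):
    """Sub-categories for a browse page: only the terms actually in use."""
    return sorted(table.get(category, Counter()).items())


# Two images indexed; term ids are illustrative.
record_image(terms_in_use, {"organism": ["NCBITaxon:10116"],
                            "cell_type": ["CL:0000540"]})
record_image(terms_in_use, {"organism": ["NCBITaxon:10116", "NCBITaxon:10090"]})

print(browse_subcategories(terms_in_use, "organism"))
```

The browse page then reads a table whose size tracks the number of annotated terms rather than the size of the source ontology.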

29 Some Image Sources Journals – Journal of Cell Biology – Molecular Biology of the Cell – The Plant Cell – Plant Physiology

30 Some Sources and Contributors – Don W. Fawcett’s The Cell – Some images from researchers with MBL ties: Clara Franzini-Armstrong, Rudolf Oldenbourg

31 Programmatic Access The Jetty web service interface is externally available: – Search – Image metadata – Raw & OME-TIFF download formats
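The slides do not give the service's URL scheme, so the following Python sketch only shows how a client might organize the three kinds of request (search, metadata, download). The base URL, paths, and query parameter names are placeholders invented for the example and must be replaced with the service's documented endpoints.

```python
from urllib.parse import urlencode, urljoin

# Placeholder base URL; NOT the real service endpoint.
BASE_URL = "https://example.org/library-service/"

# Download formats named on the slide; the string tokens are assumptions.
DOWNLOAD_FORMATS = ("raw", "ome-tiff")


def search_url(text, base=BASE_URL):
    """Hypothetical search request for free-text queries."""
    return urljoin(base, "search") + "?" + urlencode({"q": text})


def metadata_url(image_id, base=BASE_URL):
    """Hypothetical request for one image's annotation metadata."""
    return urljoin(base, f"images/{image_id}/metadata")


def download_url(image_id, fmt, base=BASE_URL):
    """Hypothetical request for image bytes in one of the offered formats."""
    if fmt not in DOWNLOAD_FORMATS:
        raise ValueError(f"unsupported format: {fmt!r}")
    return (urljoin(base, f"images/{image_id}/download")
            + "?" + urlencode({"format": fmt}))


print(search_url("ciliated protist"))
print(download_url(42, "ome-tiff"))
```

Keeping URL construction in small functions like these makes it easy to swap in the real paths once they are known, without touching calling code.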

32 Statistics February stats – 6,635 Visits – 5,093 Absolute Unique Visitors – 31,609 Pageviews

33 Future Enhancements Themed collections with descriptive content Image tagging Faceted searching (SOLR)

34 Summary Research tool with raw image data available for future image processing Image Submissions always accepted…contact David Orloff

