You say potato, I say potahto: Ontological engineering applied within the Biomedical Informatics Research Network


J. A. Turner (1), C. Fennema-Notestine (2), M. E. Martone (3), A. R. Laird (4), J. S. Grethe (3), W. Bug (5), A. Gupta (6), C. Bean (7)

(1) Psychiatry and Human Behavior, Univ California, Irvine, Irvine, CA; (2) Psychiatry, UCSD, La Jolla, CA; (3) Neuroscience, UCSD, La Jolla, CA; (4) Health Sciences Center, Univ Texas, San Antonio, TX; (5) Anatomy, Drexel Univ. Coll of Med., Philadelphia, PA; (6) Super Comp. Ctr., UCSD, La Jolla, CA; (7) NCRR, NIH, Bethesda, MD.

100.6

The Biomedical Informatics Research Network (BIRN) is developing a federated infrastructure linking human and mouse imaging experiment databases. BrainMap is a repository of published human neuroimaging studies. Both efforts aim to enable the research community to formulate questions (e.g., “What are the cortical and subcortical dysfunctions related to negative symptom severity in schizophrenia?”) and retrieve relevant results from independent laboratories and networked data resources for further analysis.

Figure (left): The distributed nature of BIRN sites developing the data sharing and analysis infrastructure.

Standardized terminologies and a shared semantic framework, which together define an ontology, are required to integrate and retrieve data from such distinct resources. Within BIRN we have identified several domains requiring standardized ontologies:

1) Assigning standardized geometric locations and relationships to neuroanatomical structures across both temporal and spatial scales (e.g., dendrite to whole brain images across disease progression and development) drives the need for a normalized neuroanatomical nomenclature and a mereotopological ontology integrated into a brain atlasing environment.
2) Translation of findings in animal disease models into a human clinical context requires capturing detailed imaging and experimental parameters; particularly in human studies, formal annotations of cognitive function measured in the clinic (e.g., measures of memory loss) and experimental measures of cognitive processes (e.g., working memory fMRI paradigms) must link to a common ontology.

This presentation looks at #2 in more detail.

Figure panels: Distributed data in the Biomedical Informatics Research Network; Integration of different methods and measures.

Data Integration

A user’s query to a federated database infrastructure requires integrating different kinds of data (left). The software architecture of such integration systems consists of a two-part middleware, the wrapper and the mediator, that sits between the information sources and the user. The wrapper converts the data from its information source into a form that the mediator can accept and manipulate. The mediator converts a user’s query into smaller sub-queries that are sent to each source, and integrates the results returned from each source.

Figure (above): Example query of a federated database: “Are chronic, but not first-onset patients, associated with superior temporal gyrus dysfunction?” Diagram labels: Integrated View; Mediator; Wrapper (several); PET & fMRI; Receptor Density; Structure; Clinical; BrainMap; Web; PubMed, Expasy, fMRIDC.

Understanding both the form and content of the different databases motivates the development of the ontologies. Examples and their implications are presented here.

Neuroimaging Results: BrainMap

BrainMap (www.brainmap.org) is an online database of published functional neuroimaging experiments, with results stored as stereotactic (x, y, z) coordinates. It is a tool to rapidly retrieve studies in specific research domains, such as language, memory, and attention. BrainMap also archives each paper’s associated meta-data: information on subjects, conditions, experimental paradigms, etc.
BrainMap can be used to rapidly retrieve published studies in specific behavioral domains, such as language, memory, and attention (see Figure below). BrainMap has been in step-wise development at the Research Imaging Center in San Antonio since 1988 and in use since 1992.

The structure of BrainMap data entry involves three levels of information: paper level (authorship, etc.), experiment level, and locations (coordinate) level. For the purposes of the BrainMap database, an experiment refers only to the comparison of two (or more) imaged conditions that results in a statistical parametric image (SPI). The locations level consists of the Talairach (x, y, z) coordinates (e.g., centers of mass of sites of activation) extracted from the SPIs, which are entered explicitly into the database.

Figure (left): The current behavioral domains for BrainMap. Experiments are described by behavioral domain, context, experimental paradigm, subject group, instructions, stimulus, etc.

Figure (right): Example results of a BrainMap query, showing the location of significant activation loci from many published experiments.

Neuroimaging datasets: FBIRN Human Imaging Database

The BIRN XCEDE schema was developed to share imaging data of diverse formats from different scanning sites. It provides an extensive metadata hierarchy for describing and documenting the technical details of human imaging studies. The XML schema was organized in correspondence with the BIRN Human Imaging Database schema (HID). This allows for an interchangeable source-sink relationship between the database and the XML files, which live with the actual data files on the Storage Resource Broker (SRB). Meta-data about the images and experimental paradigms MUST be included and must be transparent to the general research community.

The FBIRN informatics infrastructure is shown below: in 1), imaging data is collected, described and uploaded. In 2), clinical information regarding the subject is included.
In 3), users query the system to find datasets around the country; and in 4), standardized analyses are run on the selected data and results are stored in the SRB/HID system.

The FBIRN Human Imaging Database is built to store information regarding each subject’s data from many different studies. It includes demographic information (age, gender, handedness), clinical measures, and visit information, as well as links to the functional and structural imaging data on a separate system (Storage Resource Broker, SRB). This has led to the identification of the following concept domains which require standardized terminologies and must be represented in an ontology to include relationships between concepts:

1. Clinical measures
2. Cognitive taxonomies
3. Cognitive task descriptions
4. Scanning parameters
5. Neuroanatomical nomenclature

Figure: The FBIRN informatics infrastructure, with steps 1 to 4 marked. Diagram labels: fMRI Scanner; Scan Metadata/descriptions; XML Wrapped Images; Automated Image Upload to SRB/HID for sharing; QC/QA; Clinical/Other Data entry; HID; SRB; Query; Other users querying the federated database system; Automatic Analysis; FSL Automated Analysis Pipeline (FIPS); FIPS Results; Results Images in SRB; Results with standard descriptions in HID.

A potential user query for the federated HID and BrainMap system is: “Find all the right handed, ‘positive’ symptom schizophrenic subjects with fMRI data from a working memory task.” This involves determining:

Clinical aspects: What clinical assessments measure positive schizophrenia symptoms?

Cognitive taxonomies: Which tasks are ‘working memory tasks’?

Demographics: We need to be able to find the tables in the database which contain:
- Age
- Gender
- Handedness
- Diagnosis, etc.

Scanning parameters, for analysis and understanding the results:
- Type of scan: structural, functional
- Other imaging parameters, e.g.: TR, TE, number of slices (whole brain or single-slab?), slice thickness/gap thickness, slice acquisition order (interleaved or serial)
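The wrapper and mediator roles described under Data Integration can be sketched for this example query as follows. This is a minimal illustration under stated assumptions, not the BIRN implementation: the class names, the record fields, the toy rows, and the small concept table that says which tasks count as a "working memory task" are all invented for the example. Only demographics, diagnosis, and task are handled; scanning parameters are left out.

```python
from dataclasses import dataclass

# Hypothetical concept table standing in for the ontology: it records which
# concrete task names count as a "working memory task".
TASK_CONCEPTS = {"working memory task": {"n-back", "serial item paradigm"}}


@dataclass(frozen=True)
class Query:
    diagnosis: str
    symptom_type: str
    handedness: str
    task_concept: str


class HIDWrapper:
    """Toy wrapper over subject-level rows; all field names are invented."""

    def __init__(self, rows):
        self.rows = rows

    def search(self, query, tasks):
        # Translate matching source rows into the common result form.
        return [
            {"source": "HID", "kind": "subject dataset", "id": row["subject_id"]}
            for row in self.rows
            if row["diagnosis"] == query.diagnosis
            and row["symptoms"] == query.symptom_type
            and row["handedness"] == query.handedness
            and row["task"] in tasks
        ]


class BrainMapWrapper:
    """Toy wrapper over study-level rows; all field names are invented."""

    def __init__(self, rows):
        self.rows = rows

    def search(self, query, tasks):
        return [
            {"source": "BrainMap", "kind": "published experiment", "id": row["paper"]}
            for row in self.rows
            if row["group_diagnosis"] == query.diagnosis
            and row["group_symptoms"] == query.symptom_type
            and row["group_handedness"] == query.handedness
            and row["paradigm"] in tasks
        ]


class Mediator:
    """Expands the query concept, sends a sub-query to each wrapper, merges results."""

    def __init__(self, wrappers):
        self.wrappers = wrappers

    def run(self, query):
        tasks = TASK_CONCEPTS[query.task_concept]
        results = []
        for wrapper in self.wrappers:
            results.extend(wrapper.search(query, tasks))
        return results


# Toy data, made up for the example.
hid = HIDWrapper([
    {"subject_id": "S001", "diagnosis": "schizophrenia", "symptoms": "positive",
     "handedness": "right", "task": "n-back"},
    {"subject_id": "S002", "diagnosis": "schizophrenia", "symptoms": "positive",
     "handedness": "left", "task": "n-back"},
    {"subject_id": "S003", "diagnosis": "schizophrenia", "symptoms": "positive",
     "handedness": "right", "task": "oddball"},
])
brainmap = BrainMapWrapper([
    {"paper": "paper-A", "group_diagnosis": "schizophrenia", "group_symptoms": "positive",
     "group_handedness": "right", "paradigm": "serial item paradigm"},
    {"paper": "paper-B", "group_diagnosis": "schizophrenia", "group_symptoms": "positive",
     "group_handedness": "mixed", "paradigm": "n-back"},
])

query = Query("schizophrenia", "positive", "right", "working memory task")
results = Mediator([hid, brainmap]).run(query)
for result in results:
    print(result["source"], result["kind"], result["id"])
```

In this sketch the user never names a specific task: the mediator expands "working memory task" through the concept table before dispatching, so changing the table changes what every source is asked for. The merged list also mixes two kinds of result, individual subject datasets from the HID and published experiments from BrainMap, which the integrated view has to keep distinct.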
Concept Domains involved in a Use Case

Figure (left): An example of a potential clinical assessment concept hierarchy with a link to the relevant neuroanatomical concept hierarchy. Diagram labels: Brain; Cerebrum; Temporal; Mesial temporal; Hippocampus; Cerebral cortex; CVLT; Task and score description; Frontal; Cognitive impairment; Cognition; Assessment; Neuropsychology; Amnesia; Memory; Learning. The California Verbal Learning Test (CVLT) is an assessment of cognitive impairment in learning and memory, which is associated with hippocampal function.

Standardization Challenges

Find all the right-handed, ‘positive’ symptom schizophrenic subjects with 3T fMRI data from a working memory task.

Challenge 0: Different information in different databases. This query, when sent to the HID, will return individual subjects’ datasets, with sufficient meta-data to be able to combine the datasets into an analysis. When sent to BrainMap, it will return results of studies that fit these criteria.

Figure panels: BrainMap results; HID results.

Challenge 1: Identifying subject characteristics in a standard way. Diagnosis may be the result of a SCID measurement or stored in a “diagnosis” field. Each database has to identify where this information is. This query could just as well be “negative symptom” or “depressed subjects” or any subset of the clinical measures. Which clinical scales measure which aspects of the clinical symptoms must be explicitly identified. This also requires updating, as new concepts regarding the disease are developed.

A specific example: “Handedness” in BrainMap refers to the group of subjects in a study: “right handed” means only right handed subjects were included, and “mixed” means the subject group included both left and right handed subjects. In the HID datasets, “handedness” refers to the individual subject’s characteristic: “mixed” means the subject showed neither left nor right-handed dominance.

Challenge 2: Identifying experimental equipment and collection parameters. A key distinction is between the abstract description of the data collection method and the instantiation of it. The HID is built to have “lab book” characteristics, including timing mistakes and information regarding missing data, etc.

Challenge 3: Standardized experimental paradigm descriptions. Categorizations such as “working memory paradigm” or “attention paradigm”, while likely to be the first thing users want in their queries, are likely to be neither consistent nor constant over time. As theories of cognitive processes evolve, what was considered a simple attention paradigm two years ago may now be understood to be far more complex. Therefore, a much less hierarchical approach is being taken, in which paradigms are defined by their physical attributes. Links from paradigms to behavioral domains will need to be more flexible, so that “working memory paradigms” can be redefined as needed.

Figure (right above): An example working memory task, the Serial Item Paradigm, is a behavioral paradigm with a particular definition that distinguishes it from the N-back or other paradigms.

Data Integration Progress

Integrated View Designer

Integrating views of results requires addressing the challenges noted above across different database schemas.

Figure (left): Successful query for data from several projects regarding the neostriatum, from the Cell Centered Database (ccdb.ucsd.edu).
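The approach described under Challenge 3, where paradigms are defined by their physical attributes and linked to behavioral domains only loosely, can be sketched as follows. This is an illustration under assumptions, not the BIRN or BrainMap schema: the attribute names, their values, the paradigm called "two-item delayed match", and both domain rules are hypothetical, and neither rule is a claim about how working memory should be defined.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Paradigm:
    """A paradigm recorded only by physical attributes (all names hypothetical)."""
    name: str
    stimulus_modality: str   # e.g. "visual" or "auditory"
    items_per_trial: int     # number of items presented before the probe
    response: str            # e.g. "button press"
    delay_seconds: float     # interval between presentation and probe


# Links from paradigms to behavioral domains are kept outside the paradigm
# records, as rules over attributes, so a domain can be redefined later.
domain_rules = {
    "working memory": lambda p: p.items_per_trial > 1 and p.delay_seconds > 0,
}


def paradigms_in_domain(domain, paradigms, rules):
    """Return the names of paradigms whose attributes satisfy the domain rule."""
    rule = rules[domain]
    return [p.name for p in paradigms if rule(p)]


paradigms = [
    Paradigm("serial item paradigm", "visual", 5, "button press", 4.0),
    Paradigm("two-item delayed match", "visual", 2, "button press", 2.0),
    Paradigm("simple detection", "auditory", 1, "button press", 0.0),
]

before = paradigms_in_domain("working memory", paradigms, domain_rules)

# Redefine the domain without touching any stored paradigm description.
domain_rules["working memory"] = (
    lambda p: p.items_per_trial >= 3 and p.delay_seconds > 0
)
after = paradigms_in_domain("working memory", paradigms, domain_rules)

print(before)
print(after)
```

Because the paradigm records hold only attributes, redefining "working memory" changes which paradigms a query returns while the stored descriptions stay as they were entered.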

