CCEGA Informatics Working Group
Bradley Hemminger, School of Information and Library Science
Supported in part by NIH Grant 5P20RR020751-02.


1 CCEGA Informatics Working Group
Bradley Hemminger, School of Information and Library Science
Supported in part by NIH Grant 5P20RR020751-02

2 Participants
Roger Akers, Shepp Center
Peter DeSaix, Epidemiology
Xiaojun Guan, RENCI
Kevin Gamiel, RENCI
Barrie Hayes, Health Sciences Library
Brad Hemminger (chair), School of Information & Library Science
Clark Jeffries, RENCI
Joel Kingsolver, Biology
Lavanya Ramakrishnan, RENCI
David Threadgill, Genetics
Kirk Wilhelmsen, Genetics
Dong Xiang, Lineberger Cancer Center

3 Aims
Universal data model sharable by everyone.
Standardized, independent methods, so location can be anywhere.
Practical. Adoptable by many disparate groups for both new and legacy systems.
Utilize existing domain standards, controlled vocabularies and ontologies (e.g. GO, MIAME, caBIG, …)
Data repository should be safe and secure, with only controlled and accountable access by appropriately qualified entities.

4 Areas of Focus
Development of a common data model.
Determine ways the common data model can be implemented as a common shared digital repository that allows for the ingest of digital content from many varied sources (both existing projects and new projects), and controlled access by appropriate people and automated agents.

5 Areas of Focus cont’d
Address practical issues of how such a repository could be utilized by different groups with different needs in different contexts.
Demonstrate the advantages of using the repository to groups, to help encourage them to adopt it.
Define security and privacy issues for the repository, and propose and implement methods to address them.
Preservation and curation.

6 Overview
Status quo (difficulties summarized in Kirk’s talk).
Diagram and brief explanation of planned architecture.
How labs, clinics, and analysis would interact with the repository.

7 Issues: Lab and Clinic to Analysis
Independent data management
  – Data security
  – Version control
  – Redundancy
  – Controlled access
[Diagram labels: Clinical, Laboratory, Analysis, ELSI]

8 CCEGA Model
[Diagram labels: Analysis, Lab, ELSI, Integration & Informatics, Clinic]
We want the integration of the data operations across the labs, clinics, and analysis.

9 [Architecture diagram; labels: Ingest mapping, Output mapping, Lab, Repository Data Store, Analysis Methods, Association Table, Permissions]

10 Timeline
First intramural workshop (spring 2005)
Weekly meetings (beginning spring 2005)
  – Development of draft common model based on wealth of experience in local labs, and existing standards
  – Analysis of data requirements, and existing infrastructure at UNC. Internal interviews with labs
Second intramural workshop (summer/fall 2005)
  – Present draft common model for review and feedback by UNC community

11 Timeline continued
Extramural workshop (winter 2005)
  – Bring community of experts to UNC for discussions.
  – Learn in more detail about related work outside of UNC
  – Present our draft model to get feedback and criticism.
Refine model
Implement and test model using data from the three main projects identified in this grant.
Think about and plan for how this model spreads. How to promote its use by groups with existing infrastructure as well as by new groups.

12 Common Data Model
Survey schema/models in use by labs
Develop set of general requirements
Get ELSI and HIPAA requirements
Develop generalized model capable of meeting needs
Test model with data collection and analysis programs for alcoholism and addiction, breast cancer, and epidemiology studies that are part of the grant.

13 Initial Examples
Epidemiology Specimen Collection and Tracking System (Roger)
Alcoholism and Addiction Study (Kirk)
Proteomics Core Facility General Model (Brad)

17 Security
Security will be designed into the CCEGA model and implemented in the repository to protect information while still allowing researchers timely access to data.
Data will be protected via a trusted broker methodology.
Information is made anonymous by the use of randomly chosen keys assigned by the trusted broker.
The assignment is made at the clinic-database interface.
The coded key will be used to identify experimental data, while providing linkage to the source organism's private information in a secure association table.
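The trusted broker idea on this slide can be illustrated with a minimal sketch. It is not the working group's implementation: the class `TrustedBroker`, the function `ingest_clinic_record`, and the field names (`subject_id`, `name`, `date_of_birth`) are all hypothetical, and an in-memory dictionary stands in for the secure association table.

```python
import secrets


class TrustedBroker:
    """Hypothetical broker that replaces identities with random coded keys."""

    def __init__(self):
        # Stand-in for the secure association table: coded key -> private info.
        self._association = {}
        # Reverse index so a returning subject keeps the same coded key.
        self._key_by_subject = {}

    def assign_key(self, subject_id, private_info):
        """Return the coded key for a subject, creating one on first sight."""
        if subject_id in self._key_by_subject:
            return self._key_by_subject[subject_id]
        key = secrets.token_hex(16)  # random, not derived from the identity
        while key in self._association:  # regenerate on the unlikely collision
            key = secrets.token_hex(16)
        self._key_by_subject[subject_id] = key
        self._association[key] = dict(private_info, subject_id=subject_id)
        return key

    def resolve(self, key, requester_is_authorized):
        """Link a coded key back to private info, for authorized callers only."""
        if not requester_is_authorized:
            raise PermissionError("not allowed to read the association table")
        return dict(self._association[key])


def ingest_clinic_record(broker, record):
    """Clinic-database interface: strip identifiers, keep experimental data."""
    private_fields = {"name", "date_of_birth"}  # assumed identifying fields
    private_info = {f: record[f] for f in private_fields if f in record}
    key = broker.assign_key(record["subject_id"], private_info)
    data = {
        f: v
        for f, v in record.items()
        if f not in private_fields and f != "subject_id"
    }
    return {"coded_key": key, **data}
```

In this sketch, only the coded key and the experimental fields leave the clinic interface; anything that needs the original identity has to go back through `resolve`, which is where an authorization check would sit.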

18 Accountability
Access permissions will determine which entities are allowed access to which data.
All access to data is tracked via logs.
“Audit-readiness” will be maintained to respond quickly to an outside investigation and challenge with the goal of quick clearance.
Regular or random internal security audits will be included in a management strategy.
Documents used in audits include 24/7 logs, flowcharts of procedures, training documents, incident reports, etc.
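As a rough illustration of permission checks combined with logged access, the sketch below records every request, allowed or denied. The class `AccessController`, its methods, and the entity and dataset names are assumptions made for the example; the slides do not describe a concrete design.

```python
from datetime import datetime, timezone


class AccessController:
    """Hypothetical permission check that logs every access attempt."""

    def __init__(self, permissions):
        # permissions: entity name -> iterable of dataset names it may read
        self._permissions = {e: set(d) for e, d in permissions.items()}
        self.audit_log = []

    def request(self, entity, dataset):
        """Return True if access is allowed; log the attempt either way."""
        allowed = dataset in self._permissions.get(entity, set())
        self.audit_log.append({
            "time": datetime.now(timezone.utc).isoformat(),
            "entity": entity,
            "dataset": dataset,
            "allowed": allowed,
        })
        return allowed

    def denied_attempts(self):
        """Entries an internal audit might review first."""
        return [entry for entry in self.audit_log if not entry["allowed"]]
```

Logging denied requests as well as granted ones is a deliberate choice here: it keeps the log complete enough to answer an outside inquiry without reconstructing events after the fact.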

19 Future (P50) Goals
Comprehensive survey and publication of different schemas, architectures, controlled vocabularies/ontologies used by different groups. Comparison of similarities and differences.
Digital content preservation planning.
Study of what factors determine how well such models are adopted in this environment.
Make publicly available the developed resources (data model, digital repository content, database structure/schema).

20 End

