CCEGA Informatics Working Group Bradley Hemminger School of Information and Library Science.

1 CCEGA Informatics Working Group Bradley Hemminger School of Information and Library Science

2 Participants Roger Akers, Shepp Center; Peter DeSaix, Epidemiology; Xiaojun Guan, RENCI; Kevin Gamiel, RENCI; Barrie Hayes, Health Sciences Library; Brad Hemminger (chair), School of Information & Library Science; Clark Jeffries, RENCI; Joel Kingsolver, Biology; Lavanya Ramakrishnan, RENCI; David Threadgill, Genetics; Kirk Wilhelmsen, Genetics; Dong Xiang, Lineberger Cancer Center

3 Aims Universal data model sharable by everyone. Standardized, independent methods, so location can be anywhere. Practical. Adoptable by many disparate groups on campus for new and legacy systems. Utilize existing domain standards and controlled vocabularies (GO, MIAME, caBIG, …). Data repository should be safe and secure, with only controlled and accountable access by appropriately qualified entities.

4 Areas of Focus Development of a common data model. Determine how the common data model can be implemented as a shared digital repository that allows ingest of digital content from many varied sources (both existing and new projects) and controlled access by appropriate people and automated agents. Address practical issues of how such a repository could be used by different groups with different needs in different contexts. Demonstrate the advantages of using the repository, to encourage groups to adopt it. Define security and privacy issues for the repository, and propose and implement methods to address them.

5 Overview Difference between status quo and what is proposed. How labs, clinics, and analysis would interact with repository. Tie-in to Kirk’s examples.

6 Timeline –First intramural workshop (spring 2005). –Weekly meetings (beginning spring 2005): analysis of data requirements and existing infrastructure at UNC; internal interviews with labs; development of a draft common model based on the wealth of experience in local labs and on existing standards. –Second intramural workshop (summer/fall 2005): present draft common model for review and feedback by the UNC community.

7 Timeline continued –Extramural workshop (winter 2005): bring a community of experts to UNC for discussions; learn in more detail about related work outside UNC; present our draft model for feedback and criticism. –Refine model. –Implement and test model using data from the three main projects identified in this grant. –Plan how this model spreads: how to promote its use by groups with existing infrastructure as well as by new groups.

8 Common Data Model Survey schema/models in use by labs. Develop set of general requirements. Get ELSI and HIPAA requirements. Develop generalized model capable of meeting needs. Test against X,Y,Z (from grant).

9 Initial Examples Kirk’s (from Kirk) Roger’s (from Roger) Brad’s (proteomics) New model….(from Access database)

11 Security Security will be designed into the CCEGA model and repository implementation to protect information while still allowing researchers timely access to data. Data will be protected via a trusted-broker methodology. Information is made anonymous by use of randomly chosen keys assigned by the trusted broker. The assignment is made at the clinic-database interface. The coded key will be used to identify experimental data, while providing linkage to the source organism's private information in a secure association table.
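The trusted-broker scheme on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the CCEGA implementation: the names `TrustedBroker`, `assign_key`, `resolve`, and `ingest_from_clinic`, the 128-bit key length, and the boolean `authorized` flag are all hypothetical.

```python
import secrets


class TrustedBroker:
    """Hypothetical trusted broker: issues random coded keys and alone
    holds the association table linking keys to private identities."""

    def __init__(self):
        self._key_by_source = {}  # private source identity -> coded key
        self._source_by_key = {}  # coded key -> private source identity

    def assign_key(self, source_id: str) -> str:
        """Return the coded key for a source, creating one on first use."""
        if source_id in self._key_by_source:
            return self._key_by_source[source_id]
        key = secrets.token_hex(16)  # random key, not derived from the identity
        while key in self._source_by_key:  # guard against an unlikely collision
            key = secrets.token_hex(16)
        self._key_by_source[source_id] = key
        self._source_by_key[key] = source_id
        return key

    def resolve(self, key: str, authorized: bool) -> str:
        """Re-identify a coded key; only for authorized callers (assumed flag)."""
        if not authorized:
            raise PermissionError("re-identification requires authorization")
        return self._source_by_key[key]


def ingest_from_clinic(broker: TrustedBroker, source_id: str, data: dict) -> dict:
    """Clinic-database interface: swap the private identity for a coded key."""
    return {"subject_key": broker.assign_key(source_id), "data": data}
```

In this sketch the repository stores only records returned by `ingest_from_clinic`, so experimental data carries the coded key, while the mapping back to the source stays inside the broker.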

12 Accountability Access permissions will determine which entities are allowed access to which data. All access to data is tracked via logs. “Audit-readiness” will be maintained so that an outside investigation or challenge can be answered quickly, with the goal of quick clearance. Regular or random internal security audits will be included in the management strategy. Documents used in audits include 24/7 logs, flowcharts of procedures, training documents, incident reports, etc.
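The combination of access permissions and logged access can be illustrated with a short Python sketch. It assumes a simple in-memory permission table and log; `AuditedRepository`, its methods, and the log fields are hypothetical names chosen for illustration, not part of any CCEGA design.

```python
from datetime import datetime, timezone


class AuditedRepository:
    """Hypothetical repository that checks permissions and logs every read."""

    def __init__(self, permissions: dict):
        self._permissions = permissions  # entity -> set of dataset names it may read
        self._data = {}
        self.audit_log = []  # one entry per access attempt, allowed or denied

    def put(self, dataset: str, value) -> None:
        self._data[dataset] = value

    def read(self, entity: str, dataset: str):
        allowed = dataset in self._permissions.get(entity, set())
        # Log before acting, so denied attempts are recorded as well.
        self.audit_log.append({
            "time": datetime.now(timezone.utc).isoformat(),
            "entity": entity,
            "dataset": dataset,
            "allowed": allowed,
        })
        if not allowed:
            raise PermissionError(f"{entity} may not read {dataset}")
        return self._data[dataset]
```

Recording denied attempts alongside successful reads is a design choice of this sketch; it is one way the logs could support the audit-readiness goal described on the slide.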

13 Future Comprehensive survey and publication of the different schemas, architectures, and controlled vocabularies/ontologies used by different groups. Comparison of similarities and differences. Digital content preservation planning. Study of what determines how well such models are adopted in this environment. Make publicly available the resources to run the digital repositories locally or to use them remotely.

14 End

15 (Figure slide; labels: ELSI, Lab, Clinical, Analysis)

17 (Figure slide; labels: mapping, Ingest)

