Introduction to network of networks King’s College, Cambridge Oct 6-7, 2005
Major postulated problems of human genome epidemiology Small sample sizes Small effect sizes Large number of biological factors Old-epidemiology problems: confounding, misclassification Questionable replication validity
Background issues Assay development Standardization Independence Diagnostic and predictive performance Validation Clinical use Integration in clinical care Cost-effectiveness
Small sample sizes
Small effect sizes
Non-replicated diminishing effects
The other side: don’t give up early
H: heterogeneity R/F: difference in first vs. subsequent D1-D3: publication bias diagnostics RS/FS: significant findings (with/without first studies)
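The "R/F" item in the legend above (first vs. subsequent studies) can be illustrated with a short sketch. The slide does not say which test was used, so the z-test below is an assumption: it compares the log odds ratio of the first published study with the fixed-effect (inverse-variance) pooled estimate of the later studies. The function names and the numbers in the usage note are hypothetical.

```python
import math


def pool_fixed(log_ors, std_errs):
    """Inverse-variance fixed-effect pooled log odds ratio and its standard error."""
    weights = [1.0 / se ** 2 for se in std_errs]
    pooled = sum(w * y for w, y in zip(weights, log_ors)) / sum(weights)
    return pooled, math.sqrt(1.0 / sum(weights))


def first_vs_subsequent(log_ors, std_errs):
    """z-score for first study minus pooled subsequent studies.

    Studies must be ordered by publication date, first study at index 0.
    A large positive z suggests the first study overestimated the effect.
    """
    if len(log_ors) < 2:
        raise ValueError("need the first study plus at least one later study")
    later, se_later = pool_fixed(log_ors[1:], std_errs[1:])
    diff = log_ors[0] - later
    return diff / math.sqrt(std_errs[0] ** 2 + se_later ** 2)
```

For example, with made-up inputs `first_vs_subsequent([1.0, 0.0, 0.0], [0.3, 0.2, 0.2])` gives a z-score of about 3, the pattern of a strong first report followed by null replications.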
Phenotypes: the Lernean Hydra? Definition of endpoint: spirometry (various measures), clinical outcome, clinical score, use of rescue medication, medication dose Definition of genetic contrast Definition of intervention: short-acting, long-acting, combinations, selected groups, comparison of various groups Definition of timing: after the first dose, after the last dose, between first and last, various differences Use or not of baseline contrasts
Racial (or other subgroup) differences? Empirical evidence suggests that while allele frequencies differ a lot (I-squared ≥75%) in 58% of postulated gene-disease associations, differences in the effect sizes (odds ratios) occur in only 14%. No differences in race-specific odds ratios have been recorded once the total sample size exceeds N=10,000
Control rates: I² ≥75% in 58% Odds ratios: I² ≥75% in 14%
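For readers unfamiliar with the I² figures quoted above, here is a minimal sketch of how the statistic is usually computed: Cochran's Q from inverse-variance weights, then I² = max(0, (Q − df)/Q) × 100. The function name and the inputs in the usage note are illustrative assumptions, not data from the presentation.

```python
def i_squared(estimates, std_errs):
    """Return (Q, I2 in percent) for study-level estimates and standard errors.

    `estimates` can be log odds ratios, or transformed control allele
    frequencies, as long as each one comes with a standard error.
    """
    weights = [1.0 / se ** 2 for se in std_errs]
    pooled = sum(w * y for w, y in zip(weights, estimates)) / sum(weights)
    # Cochran's Q: weighted squared deviations from the pooled estimate
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, estimates))
    df = len(estimates) - 1
    # I2 is truncated at zero when Q is smaller than its degrees of freedom
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return q, i2
```

With two hypothetical studies, `i_squared([0.0, 1.0], [0.1, 0.1])` yields Q of about 50 and I² of about 98%, well above the 75% threshold used on the slide.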
Readily available, available, hidden, and very well hidden data
A solution (?): investigator or data specimen registration Upfront study registration has been adopted for randomized clinical trials, as a means for minimizing publication and reporting biases and maximizing transparency For molecular research, upfront registration in public of all ideas is counter-intuitive and goes against the individualistic spirit of discovery in basic research Instead one could aim for registries of investigators and data specimen collections
Registries of data/sample collections Inclusive networks of investigators working on the same disease, set of genes or field Promotion of better methods and standardization Research freedom for individual participating teams Thorough and unbiased testing of proposed hypotheses with promising preliminary data on large-scale comprehensive databases Due credit to investigators for both “positive” and “negative” findings
Registries of teams The core registry should comprise information on the teams that already participate in a network A wider registry should also record all other teams that work on the same field. Depending on the structure and funding opportunities of the existing networks, additional teams may be allowed to join formally and fully in the original network; even if structure or funding considerations do not allow this, additional teams should be simply recorded, so that a picture of the field-at-large is available Networks may have qualitative or other prerequisites for allowing teams to join. These should be developed by the scientists involved, but some central guidance and sharing of experiences would also be useful
What would a network of networks do Communication and sharing of expertise in statistical analytical methods, laboratory techniques, practical procedures, logistics of creating and maintaining a network Co-ordination of registries, facilitation and avoidance of overlap Maximization of efficiency and standardization of methods and procedures Electronic list of all registries containing minimal information on all participating teams as well as on non-participating teams Eventually keeping updated an “Encyclopedia” of validated molecular information that may be compiled by investigators of each network for the disease/field at hand
Types of networks Disease-based Gene-based Exposure-based Based on combination of the above
Questionnaires First planning questionnaire: probing the possibility of building registries of teams Second questionnaire: description of experiences and practices in building and maintaining consortia, including setting up, scientific approach, standardization within the consortium, and other organizational issues
Initial targets To be able to identify 4 teams that could create registries To share as much information as possible on experiences in building and maintaining consortia To translate this information to added value for the participating consortia
Responses Practically all consortia have some registry of their teams; one has already contributed this registry in detail, and several others refer to their web lists. Will need to discuss further how many can also create the field-wide registry Twenty consortia replied to the second questionnaire
1. Creation of a core registry of investigators/teams and data collections that participate in each network already – information should include some minimal information that should be agreeable to all networks.
Please let me know if you can provide information for each participating team on:
   1. contact investigator name YES, immediately available
   2. address/ YES, immediately available
   3. type of data collection (case/control & cohort) YES, See table attached (as yet incomplete, however under the process of circulation around the forming consortium)
   4. total sample size of database (cases/controls split, if pertinent) YES, immediately available
   5. location YES, immediately available
   6. representative publication(s) (if any) YES, immediately available
   7. other readily available information (please specify) IF APPLICABLE
For each item above, if the answer is YES, please specify how long it would take you to compile the pertinent information, or whether it is already compiled and, if so, how.
3 months
2. Creation of a wider registry of investigators/teams and data collections working in the same field, but who are not participating in the network already
Please let me know if you can provide information for each non-participating team on:
   1. contact investigator name YES, some personal contacts immediate. Others, some delay. Advice would be of value in defining criteria for the inclusion of other groups for this category. This is a general comment for this section of the questionnaire. Is there a requirement to mention all other groups other than those directly included in the core network?
   2. address/ YES, some personal contacts immediate. Others, some delay
   3. type of data collection (cohort) YES, some personal contacts immediate. Others, some delay
   4. total sample size of database (cases/controls split, if pertinent) YES, some personal contacts immediate. Others, some delay
   5. location YES, some personal contacts immediate. Others, some delay
   6. representative publication(s) (if any) YES, immediate
   7. other readily available information (please specify) IF APPLICABLE
The list of additional investigators should initially be compiled through searches in PubMed (and/or the HuGE Net database which is largely derived from PubMed). Personal contacts and other indirect info is also welcome, but this will continue to be added along with updates from electronic databases.
For each item above, if the answer is YES, please specify how long it would take you to compile the pertinent information, or whether it is already compiled. Moreover, please specify if you already have the people to do this, or you would need help (and if so, what help exactly would best suit your needs).
6 months; personnel contacts will allow the availability of information to be more rapid.
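The minimal fields requested in this questionnaire could be held in a simple record structure. The sketch below is only one possible layout: the class name `TeamRecord`, the field names, and the `in_network` flag (core vs. wider registry) are assumptions made for illustration and are not taken from any consortium's actual registry.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class TeamRecord:
    """One registry entry, mirroring questionnaire items 1 to 7."""
    contact_investigator: str
    address: str
    collection_type: str              # e.g. "case-control" or "cohort"
    location: str
    n_cases: Optional[int] = None     # cases/controls split, if pertinent
    n_controls: Optional[int] = None
    n_total: Optional[int] = None     # used when no split applies (cohorts)
    publications: List[str] = field(default_factory=list)
    other_info: str = ""
    in_network: bool = True           # True: core registry; False: wider registry

    def total_sample_size(self) -> Optional[int]:
        """Total subjects, or None when the team has not reported a size."""
        if self.n_total is not None:
            return self.n_total
        if self.n_cases is None or self.n_controls is None:
            return None
        return self.n_cases + self.n_controls


def registry_total(records, core_only=False):
    """Sum the known sample sizes, optionally restricted to the core registry."""
    sizes = [
        r.total_sample_size()
        for r in records
        if r.in_network or not core_only
    ]
    return sum(s for s in sizes if s is not None)
```

Keeping non-participating teams in the same structure, flagged rather than stored separately, is one way to get the "picture of the field-at-large" that the wider registry is meant to provide.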
Questionnaire B This is an open-ended questionnaire and we ask each network leader/representative to share this information in anticipation of the Cambridge meeting. We want to collect information on experiences in building and maintaining consortia. We are interested to hear about practices that have worked as well as practices that have not worked. We are particularly interested in hearing about problems that have ensued along with their solutions or lack thereof. There is no limit on the length of the response to each question/section.
Setting up Getting started and launching a network: please describe how and when the network was started and what the problems were, if any, at that phase. Please discuss also if there have been any expansions or shrinkage or whether the same teams have participated all along. Organization and steering, coordinating centers: please describe how the network is organized and steered and what are the responsibilities and powers of the steering organs. Please mention if there are specified statistical, genetics, clinical, or other coordinating centers and what their responsibilities and power are. Funding: please describe what funding the network has received to-date. Please mention any problems encountered with funding.
Scientific approach Selection of project targets: please describe how you decide on selecting your project targets. What are the criteria that count heavily in selecting your priorities among potentially hundreds (and now thousands) of gene or other targets? Prospective and retrospective components: please describe whether the network conducts work on existing data retrospectively or collects new information on genotypes, phenotypes and/or other measurements or whether both approaches and/or mixtures thereof have been followed. Handling of published or other information from teams not participating in the network: please describe whether your analyses include information from teams that do not participate in the network but are nevertheless working on the same field. Please describe if there is any effort to retrieve unpublished/missing information.
Standardization within the consortium Data flow: please describe the data flow in your network. Any details on quality assurance practices and checks for logical errors would be very useful. Standardization of phenotypes and genotypes: please describe what measures are taken, if any, to ensure standardization of genotypes, phenotypes, and other measurements across participating teams. Please mention specific problems that may have arisen.
Other organizational issues Communication of results in the team, review processes, publication policies: please describe how results of analyses are communicated to the team, what the review processes are and whether you have an explicit publication policy. If so, please describe. Please describe how authorship is determined. Web site development and implementation: please describe if the network has a web site and, if so, how it has been developed and implemented, what it contains (briefly), and whether there are plans for major expansions. Other: Please mention any other practices that have worked very well in your network and that you would recommend for other networks as well as major problems that have appeared that are not covered in the sections above.
How many teams Existing networks include anywhere between 5 and 521 teams In terms of sample size they include anywhere between 3 and 200 thousand subjects
Stage of development Setting up stage Established to run specific project(s) Established, extended to more projects, on a per project basis Established as consortia with funding for the consortium regardless of specific projects
Organizational issues Expansion common; no mention of shrinkage Several models of steering and co-ordination, tailored to dynamics of network; working groups within the largest network Various funding sources; perceived to be a challenge for some consortia
Scientific approach Biological plausibility and other lines of evidence Phenotypes typically already available, but genotyping either prospective or retrospective Usually published data outside the consortium are not considered (as of yet)
Standardization and other issues Local vs. centralized data flow, occasional web facilitation Various QC practices Precise vs. flexible (“vague”) publication policy, being inclusive
Few problems acknowledged Funding, especially regular funding for the infrastructure Timing for making datasets available, especially for second-generation projects Infrastructure development at early stages Heavy, time-consuming administration Keeping track of what is going on in different countries and teams Public policy and funding bodies modifying the goals
Summary points with interesting prospects IRB – handling samples in different areas/countries Opportunities for young researchers Who reviews grants and manuscripts Incorporating newer molecular technologies to the group Large-scale, whole-genome approaches Integration of informatics, web-based, interactive approaches Role of the industry Can NoN push for more funding? Funding for infrastructure vs. funding for research NoN to promote visibility for “success stories” derived from collaborative work in the field
… and more… Is there a gene variant that is so highly credible that it is not worth making an effort to validate it at a consortium level? Threshold for inclusion of smaller and questionable quality studies Standardization of phenotypes and exposures, how much variability allowed Quality check for genotypes Identifying coding errors based on haplotype construction Meta-analysis of microsatellite data and multiallelic data, including haplotypes: published, published-re-examined, de novo re-genotyping, prospective Pooling retrospective data, as is, with quality checks based on published information, based on re-analysis, based on (partial) re-genotyping, central QC
… and more… Consortia as instruments for improving the quality of primary studies Consortia as instruments for starting high quality studies also in countries where it is more difficult to do research (developing countries) Create an acceptable grading of evidence scale Selecting targets from published data, metabolic pathways, confirmed biomarkers, linkage-identified monogenetic findings, replicated whole-genome association studies Monographs and user’s guides on how to design, conduct and report genetic epidemiology studies and meta-analyses thereof (both retrospective and prospective)
… and more… Publication of null studies that are well designed – single studies (any visibility), negative results of consortia (high visibility) Inclusiveness in registers and definition of unit/team/investigators to be registered Filtering of eligible teams per project according to various criteria Overlap between consortia Maintaining plurality, several consortia operating in the same field Evolving nature of consortia precluding full standardization of registration information across teams
How about Putting slide presentations on website Split to teams working on specific themes (e.g. standardization, registries, etc) Linking all consortia websites with 1-page info to HuGENet Refer to registries of consortia in publicly available sites Create and web-link wider registries of all teams working in a field Write manuscript(s) based on what was discussed these 2 days Agree on grading levels of evidence Pilot themes for an evidence-based encyclopedia of human genome epidemiology