Objective: Provide a framework that can be used to advance standardized data presentation.
Experimental Design → Instrumentation → Analysis → Presentation
- Experimental Design: sample procurement; sample preparation; fix/perm; which fluorophore; controls (isotype? single color, FMO)
- Instrumentation: appropriate lasers; appropriate filters; instrument settings; lin vs. log; time; A, W, H
- Analysis: interpretation; mean, median; % positive and CV; SD; signal/noise; gating
- Presentation: histogram; dot plot; density plot; overlay; bar graph; data guidelines
First, let's address the problem
- Data analysis incorporates many disciplines, including instrumentation, statistics, biology, and photonics; often, knowledge of one of these is missing.
- Many different instruments and software packages are available.
- Historical precedent: unfortunately, there is a large body of published work with poor data and no clear guidelines.
Some Examples of Poor Data Presentation
2011 Nature article:
- Arbitrary gate that is difficult to replicate
- On-axis data that is difficult to visualize, interpret, and review
eBioscience product literature: normal peripheral blood stained with the listed reagents.
- That's some bright CD19 and dim CD3
- The ratio between B and T cells seems off for normal blood
From Nature Medicine, 1998: human stem cells were injected into NOD/SCID mice and were reported to reconstitute multiple lineages (myeloid, B cells, T cells, CD4 & CD8).
Medium-to-high forward scatter? Did they backgate to ensure this was the correct gate?
An isotype control for two channels? Which one? (CD45 was on yet a third channel; no control for that?) How was the gate actually defined on this control? It is impossible to estimate the amount of background staining in this histogram: a gate is needed to express it. The other graphs are shown as bivariate displays, making them difficult to translate. % positive?
Why are cells expressing both markers? If these are of myeloid origin, why is a lymphocyte gate (R1) applied? The cells on the diagonal look like nonspecific staining, and in fact were probably present in the isotype control.
Nearly 100% of cells are expressing CD19. If so, then there is no room left for other lineages… The data appear self-contradictory, but without percentages we cannot tell.
Same problem as for the myeloid cells: the CD2+ CD3+ cells appear to be non-specifically stained. The CD4 and CD8 distributions don't look like typical mature T cells… and what about the CD4+ CD8+ cells?
Why do graphs e and h have so many events compared to graphs d, f, and g? R1 + R2 (2.5%) represents very few events…
FITC and PE appear to be over-compensated.
An Example of Poor Data Presentation: Summary
Critical analysis of this figure shows that it does not support the authors' contentions. This does not mean the authors were wrong. Reviewers should have demanded a more rigorous example dataset… but perhaps the reviewers were not FACS experts. Guidelines can educate. Unfortunately, this example is neither unique… nor even uncommon.
Research Misconduct Inquiries
The Division of Investigative Oversight, Office of Research Integrity is currently swamped with requests for flow cytometry-related research misconduct inquiries. A majority of these cases display blatant, intentional fraud. However, there is a significant trend pushing for flow-related guidelines, and the onus on investigators for proper representation of data is growing.
Research Misconduct
Research misconduct is defined by law: 42 CFR Parts 50 and 93 (§§ 93.103 & 93.104). Research misconduct is defined as "fabrication, falsification, or plagiarism … in reporting research results." Falsification is "manipulating … changing or omitting data or results such that the research is not accurately represented in the research record." Misconduct can be committed intentionally, knowingly, or recklessly.
There is no wrong way to analyze your data
Meaning: investigators are free to choose:
- Which plot types to display
- Placement of gates for analysis
- Which statistics
- Number of events to display or collect
- Which software package to use
- How many times to reanalyze
There is definitely a wrong way to analyze your data
Meaning: investigators' decisions can lead to incorrect data generation or interpretation:
- Inappropriate gates for analysis (e.g., a lymphocyte gate for CD15 staining, or inconsistent gates)
- Misleading or inconsistent plots for display
- Inappropriate controls (e.g., using an isotype control for gating)
- An inappropriate number of events collected (too few events for meaningful and accurate statistical comparison)
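The last point can be made concrete with simple binomial counting statistics. This is an illustrative sketch (the helper name and numbers are hypothetical, not part of any guideline):

```python
import math

def subset_cv(p, n_events):
    """Approximate coefficient of variation (relative error) of a
    measured subset frequency p, estimated from n_events total
    events, using binomial counting statistics:
    SD = sqrt(p*(1-p)/N), so CV = SD/p = sqrt((1-p)/(p*N))."""
    return math.sqrt((1.0 - p) / (p * n_events))

# A 1% subset measured from 1,000 vs. 100,000 total events:
cv_small = subset_cv(0.01, 1_000)    # ~0.31 (31% relative error)
cv_large = subset_cv(0.01, 100_000)  # ~0.03 (3% relative error)
```

Under these assumptions, a 1% subset quoted from only 1,000 events carries roughly ±31% relative uncertainty, which is why too few events undermine any meaningful statistical comparison.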
Implementation of Guidelines by J. Exp. Med.
A set of guidelines for publication of flow cytometry data has been implemented by the Journal of Experimental Medicine. All papers submitted for review will be required to comply with the guidelines, with submission of supplementary information, in order to be reviewed. Papers with sophisticated flow cytometric analysis may undergo an independent review to ensure the appropriateness of the analysis and presentation.
MIFlowCyt: Minimum Information about a Flow Cytometry Experiment (ISAC Recommendation)
The fundamental tenet of scientific research is that the published results of any study must be open to independent validation or refutation. MIFlowCyt establishes criteria for recording and reporting information about the flow cytometry experiment overview, samples, instrumentation, and data analysis. It promotes consistent annotation of the clinical, biological, and technical issues surrounding a flow cytometry experiment by specifying requirements for data content and by providing a structured framework for capturing information.
Guidelines: Why do we need them?
- A consistent presentation style ensures better communication of data to readers and listeners: speaking a common language means faster interpretation and understanding of nuances
- Provides a level of confidence that the data have been appropriately generated and analyzed
- Allows reviewers and readers to focus on the point of the presentation, avoiding distractions from inappropriate or inconsistent presentations
Guidelines: What they are NOT
- They will not define how to do science or how to analyze and interpret the data.
- In most cases, they are not requirements; they simply codify the "between the lines" information.
- They will not prevent or reduce purposeful fraud, but they can reduce reckless science.
- They can reduce confusion and ambiguity within published data.
Principles and Guidelines: Introduction
A few examples of the principles and guidelines for data presentation follow.
Principles and Guidelines: Hardware/Software
Information about the instrument configuration should be provided.
Why: different configurations (lasers, filters, etc.) can result in very different sensitivities, compensation requirements, etc. Some experiments (for example, fluorescence intensity comparisons across different days) require that the instrument be carefully calibrated, and interpretation of the significance of the results may require knowledge of these procedures.
Instrument Manufacturer
Identify the FACS instrument and software used to collect, compensate, and analyze the data. Include model and version where more than one exists.
- Light source: type, wavelength, power
- Optics: band pass, long pass (e.g., 530/30)
Instrument Configuration
Providing instrument configuration is a delicate balance between providing enough information to be useful and providing so much that it is not helpful. Instrument configuration can be summarized in three sections: optical, QA/QC, and compensation. There is no single right procedure (but there are wrong procedures for some kinds of experiments). Knowing the instrument configuration is necessary to fully interpret the data.
Instrument Configuration: Optical
The optical configuration determines what fluorescence measurements were made by the instrument. There are two tables: one for lasers, the other for detectors. FACS core facilities can create these tables and supply them to users.
Instrument Configuration: QA/QC
Knowledge of the QA/QC procedures is necessary to understand how data analysis was performed. Do the gates move from experiment to experiment? Are MFI calculations compared between experiments? Is sensitivity equivalent across experiments? Relevant QA/QC procedures can likely be summarized by a limited set of options that authors select from:
- No daily QC (i.e., fire up the instrument and hope that yesterday's settings are close enough)
- Alignment using beads: set the instrument so that the same output fluorescence is observed on each channel every day
- Set the instrument to the same voltages and settings each day (record beads for QA)
- Set the instrument up so that unstained cells are in the first decade of fluorescence
Instrument Configuration: Compensation
A very brief description of how compensation was accomplished is all that is needed:
- What were the controls? (Beads, cells, combinations)
- Was compensation manual or automatic?
- What software was used to compensate?
- Was manual adjustment of compensation necessary?
This helps reviewers interpret distributions that might otherwise look improperly compensated.
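For readers unfamiliar with what compensation actually does, here is a minimal two-colour sketch. The function name and spillover values are hypothetical; real acquisition software solves the full N×N spillover matrix:

```python
def compensate_2color(fitc_raw, pe_raw, s_fitc_into_pe, s_pe_into_fitc):
    """Two-colour spillover correction. With spillover matrix
    S = [[1, a], [b, 1]] (a = FITC-into-PE, b = PE-into-FITC),
    observed = true @ S, so true = observed @ inv(S)."""
    det = 1.0 - s_fitc_into_pe * s_pe_into_fitc
    fitc = (fitc_raw - pe_raw * s_pe_into_fitc) / det
    pe = (pe_raw - fitc_raw * s_fitc_into_pe) / det
    return fitc, pe

# Hypothetical example: 15% FITC-into-PE spill, 2% PE-into-FITC spill.
# Observed (1004, 350) recovers the true signals (1000, 200):
fitc_true, pe_true = compensate_2color(1004.0, 350.0, 0.15, 0.02)
```

Manual "adjustment" of compensation amounts to tweaking the spillover coefficients until single-stained controls look correct; that is exactly the kind of step a brief methods note should disclose.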
Principles and Guidelines: Graphs (General)
Graph axis labels should include (at a minimum) the reagent being measured. In the case of fluorescent antibodies, both the specificity and the fluorochrome should be indicated. Do not use "FL1" or "P1" as a label.
Why: interpretation of the graph is much faster; the reader does not have to translate each label.
Fluorescent Reagent Description
- Binding target
- Reporter (fluorochrome)
- Clone name or number
- Reagent manufacturer
- Reagent catalogue number
Principles and Guidelines: Graphs (General)
The number of events displayed in any graph should be indicated.
Why: the number of events making up a display can impact its visualization, and should be considered when interpreting the precision of the analysis.
Annotating Graphs
Indicate the event count with a simple number within or near each graphic, or list it in the figure legend. Consistent use of color helps minimize extraneous text. (Figure 001.01: axis labels show both the measurement and the fluorochrome.)
Scaling and Axis Labels
Show all parts of the plot axis that indicate the scaling that was used (lin, log, bi-exponential). Numerical values for axis ticks can be eliminated except when necessary to clarify the scaling.
Principles and Guidelines: Graphs (General)
To convey a quantitative representation of subsets from graphical displays, a calculated frequency of gated events must be displayed.
Why: the graph itself cannot convey such information. Depending on how many events are displayed, the appearance of a subset may be quite different; the only way to assess the frequency with accuracy is to provide a numerical value. Histograms can provide notoriously misleading information about frequencies.
Graphs Cannot Convey Frequencies
(Figure 001.04: two datasets shown as forward-scatter histograms, once as "% of Max" and once as "# Cells", with a gate on large cells.) What is the representation of large (high forward-scatter) cells? Does the red distribution have more?
Which distribution has more cells, red or blue? Only the annotated event counts (e.g., "Events: 4,922") can tell. (Figure 001.04.)
Intensity Measurement
Explicitly define the statistic applied (mean, median, geometric mean).
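Why the statistic must be named can be shown with a small sketch: on skewed fluorescence data the three common statistics diverge sharply (illustrative numbers only, not from any real dataset):

```python
import statistics

def intensity_stats(values):
    """Compute the three statistics commonly quoted for
    fluorescence intensity. On skewed (log-normal-like) data
    they can differ widely, so the one used must be stated."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "geo_mean": statistics.geometric_mean(values),
    }

# A mostly-dim population with one bright outlier:
stats = intensity_stats([100, 110, 120, 130, 5000])
# mean ≈ 1092 is pulled up by the outlier;
# median = 120 and geometric mean ≈ 244 are far less affected
```

Reporting "MFI" without saying which of these it means leaves the reader guessing by up to an order of magnitude.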
Principles and Guidelines: Graphs (General)
The choice of smoothing and specific display type is up to the author. Choose whichever graph and display options most readily convey the information needed to interpret the experiments, but be consistent across all graphs within an analysis.
Why: there is no single best way to display data; each display type has advantages and disadvantages. However, using different displays in different graphs may mislead readers because of the nuances of emphasis of each graph type.
Principles and Guidelines: Gating
Whenever gated analyses are performed, an illustration of the gating process should be shown.
Why: the way in which cells are gated can dramatically impact the analysis and interpretation, particularly when rare populations are involved. Backgating demonstrates how each gate has impacted the analysis, and can demonstrate that the gating process has not artefactually selected for the subsets being analyzed. The gating tree also teaches readers how to analyze data when they do similar experiments.
Principles and Guidelines: Gating
Unless otherwise explicitly stated, gating is assumed to have been performed subjectively.
Why: by convention.
Principles and Guidelines: Gating
The use of control samples to set gates should be shown; the algorithm used to place gates should be explicitly defined if placement was not subjective.
Why: in many cases, subjective placement of gates is a reasonable way to analyze the data, and interpretation will not be affected by minor relocations of the gate. However, some types of analysis require rigorous placement of gates to provide the most significant data. If gate placement was algorithmic, it must be described and shown.
Gate Placement Algorithms
- Purely subjective: illustration is always useful. Unlikely to be acceptable for quantitative fluorescence measurements, identification of dimly-expressing subsets, or discrimination between overlapping subsets.
- Based on control stains (unstained, FMO, etc.): the control sample must be shown, along with a description of how it was used to place the gate. If the gates move for different types of samples (e.g., treated vs. untreated), then at least one example of each should be given.
- Objective algorithm: detail the algorithm (e.g., "top 2% of events"; "autogate defined by software").
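A control-based or "top 2%"-style gate can be made fully reproducible in a few lines. This is a hypothetical sketch (function names and thresholds are illustrative), shown only to make "detail the algorithm" concrete:

```python
def percentile_gate(control_intensities, pct=98.0):
    """Objective gate placement: set the positivity threshold at
    the given percentile of a control sample (e.g. an FMO or
    unstained control), so roughly (100 - pct)% of control
    events fall above the gate ('top 2%' when pct=98)."""
    data = sorted(control_intensities)
    k = round((pct / 100.0) * (len(data) - 1))
    return data[k]

def percent_positive(sample_intensities, threshold):
    """Frequency of events above the gate, as a percentage --
    the numerical value that a graph alone cannot convey."""
    n_pos = sum(1 for v in sample_intensities if v > threshold)
    return 100.0 * n_pos / len(sample_intensities)

# Hypothetical control of 100 events with intensities 0..99:
gate = percentile_gate(list(range(100)))        # 98th percentile
pct_pos = percent_positive(list(range(100)), gate)
```

Stating an algorithm at this level of detail lets any reader place the same gate on the same control and reproduce the reported frequencies.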
Experimental and Sample Information
How were cell suspensions prepared?
- Specific proteases
- Filtration
- Lysing agents
- Fix/perm reagents
Implementation of Guidelines by J. Exp. Med.
In addition to ensuring that primary data presentation conforms to the guidelines, authors will also be expected to submit a single additional supplementary section devoted to the flow cytometry. This section will include:
- A table of instrument information (template provided online)
- Gating tree example(s)
- Gating control(s)
- Additional analyses pertinent to the interpretation of the flow cytometric data
References
- Perfetto et al., J. Immunol. Methods, 2006
- Keeney et al., Cytometry, 1998
- Cytometry 30(5), 1997
- MIFlowCyt 1.0
- http://ucflow.blogspot.com/2011/04/display-transformation-and-flowjo.html (bi-exponential display)
- Cytometry A 78A:384-385
- Perfetto SP, Chattopadhyay PK, Roederer M. Seventeen-colour flow cytometry: unravelling the immune system. Nature Reviews Immunology 4, 648-655 (August 2004)