Feature Extraction Software Training Insert AE Name and Date.

1 Feature Extraction Software Training Insert AE Name and Date

2 Page 2 Feature Extraction 7.5 Customer Training Updated 08/18/04 Agenda: Product structure and pricing; downloading and shipping of software and manual; installing Feature Extraction; license types and redemption; activating FE software; data workflow; features and benefits of Feature Extraction software; user interface; Grid Mode for Agilent and 3rd-party microarrays; Feature Extraction algorithms; Feature Extraction results; known issues fixed in FE 7.5.1.

3 Product Structure

4 Product Structure

5 Downloading FE Software and Manual: FE software and manual are available for download from the Agilent website; the eRoom (Gene Expression Informatics Software > Fe_cd7.5.1, https://teamspace.agilent.com/eRoom/CAG2/ScannerSupport/0_52b2b); EPI Warehouse; LSM website.

6 Shipping of FE Software and Manual: FE software and a hard-copy manual will be shipped for new scanner orders and upgrade service contracts originating on or after June 16. Existing FE customers will NOT receive an FE installation CD or hard-copy manual in the mail; they must download both from the Agilent website.

7 Installing Feature Extraction: Before installing FE 7.5.1, make sure that the previous version of FE is uninstalled completely and delete all associated .dll files. Uninstalling FE will NOT delete the existing license file; after FE is installed, the software will recognize and use the FE license. Compatibility and known issues with FE: Internal tool version is compatible with FE. Concurrent (multiple) sessions of FE are NOT supported on the same PC.

8 License Types: Two types of licenses are available. Node-locked licenses: the user must install Feature Extraction software on a specific PC and must provide the host ID of the PC that the FE software is installed on. 30-day demo licenses: the software is fully functional but expires 30 days after the date of issue, and it runs on any PC. FE uses the SAME license file that FE used, which means a free upgrade for existing FE customers. Customers using beta versions of FE 7.4.x will need to upgrade to FE by July 15, when the beta software self-inactivates.

9 License Redemption: The license key is redeemed at the Agilent website, https://software.business.agilent.com, or from the FE 7.5 menu by clicking Help > Agilent License. The user must provide Agilent with the following: Order Number (available on the Software Entitlement Certificate); Certificate Number (available on the Software Entitlement Certificate); Host ID (available in FE 7.5, Help > About Analysis); and the address where the license will be sent.

10 Access Host ID in Feature Extraction Software: Click Help > About Analysis. The About Analysis dialog displays the software version and the Host ID (MAC/Ethernet address).

11 Where to Save the License Key: The license file name ends with ".lic". The license file should be saved in this directory: Program Files\Agilent\MicroArray

12 Activating Feature Extraction Software: The software needs to be activated after it is installed and the license file is saved in the directory Program Files\Agilent\MicroArray. To activate Feature Extraction software, open an image file. The FLEXlm License Finder dialog pops up asking for the license file. Select “Specify the License File” and browse to the directory where you saved the license file (Program Files\Agilent\MicroArray).

13 Data Flow (diagram). Labels: Scanner software; TIFF; Pattern File; QC File (print file); Image Analysis; Feature Extraction; Feature Extraction Result Files: JPEG, Text, GEML, MAGE-ML, Shape; Rosetta Resolver™ or Luminator™.

14 Features and Benefits of Feature Extraction Software

15 What You Can Do with Image Analysis Tool: Visualize spots on the microarray; change the color and scale of the image; flip and rotate the image from landscape to portrait mode and vice versa; interactively position grids to find spots on microarrays; compare the nominal spot centroid laid down by the grid with the centroid position for the spot; move the centroid position to where you want it on the spot; select spots to ignore (these won’t be used in Feature Extraction); create histogram and line plots; view visual results and outlier flags for features and backgrounds.

16 What You Can Do with Feature Extraction Algorithms: Find Spots positions a grid and finds centroid positions of spots. Spot Analyzer removes outlier pixels and defines pixels for features and local backgrounds. Poly Outlier Flagger flags features and backgrounds that are non-uniformity outliers and population outliers. Background Subtraction corrects for the background and determines if the background-adjusted signal is positive and significant compared with the background. Deletion Control (25mer in-situ) corrects for cross-hybridization. Dye Normalization selects features for dye bias evaluation and corrects for dye bias. Ratio calculates log (rProcessedSignal/gProcessedSignal), log ratio error, and p-value of the log ratio for each feature.

17 Feature Extraction User Interface

18 User Interface, Display panels: Image Info (available when an image is loaded): single channel display. Grid Definition (available when grid mode is on): Maximum Fit Movements for subgrids and spots. Grid Adjustment (available when grid mode is on): spot location (col, row), spot center, and Ignore spot. The user can select spots that are to be ignored in the analysis; for these spots, no data (including feature number, row, and column information) will be displayed in the Feature Extraction output file.

19 Toolbar Buttons for Grid Mode: Crop Mode On/Off (On = Crop, Off = Zoom); Grid Mode On/Off; Adjust Main Grid; Adjust Subgrid; Skew Subgrid; Preview Spot Centroids; Undo and Redo. The Edit menu also has these options: Grid Mode, Adjust Main Grid, Adjust Subgrid, Adjust Spot, Skew Subgrid, Preview Spot Centroids.

20 Zooming In and Out: The Cropping button in the OFF mode is used for zooming in on any boxed area. Toolbar buttons: Zoom in, Zoom out, 100 percent. Menu: click View > Zoom, then select a magnification. Mouse shortcuts: to zoom in, Ctrl + left double-click on the image; to zoom out, Ctrl + right double-click on the image.

21 Tools Menu: Tools > Flip Upper Left to Lower Right (Landscape/Portrait). Tools > Preferences (to set default options) has the following tabs. Image View tab: set initial window size, option to start with crop mode, image color, data range of image display, and more. Grid Mode tab: start grid mode with a gene list type, default view zoom setting, and maximum fit movements for grid adjustment. Feature Extraction tab: search for grid file or design file first when analyzing an Agilent microarray, default save directory for result files, option to save log file, and FTP settings to send result files to Resolver and Luminator. Graph View tab: histogram bin size and bin number. General tab: hyperlink to the Agilent web site.

22 Demo: User Interface

23 Using Grid Mode to Analyze Agilent and Non-Agilent Microarrays

24 Grid Mode Analysis: Ability to grid and feature extract Agilent and non-Agilent microarrays scanned on an Agilent scanner. A new spot finding tool allows the user to interactively position and find spots on the microarray. Accepts annotation and array layout information via the following gene lists: Agilent grid files, Agilent design files, GAL files (GenePix Array Layout), and tab-delimited text files.

25 Setting Up an Initial Grid: A grid file is required to feature extract non-Agilent microarrays; Agilent microarrays can be feature extracted with a grid file, if desired. To create a grid, click the Grid Mode On/Off icon and select a gene list type to grid: _grid.csv, gal, xml, tab text, or no gene list. Grids are saved as two files (_grid.csv, _feat.csv) and can be used to grid and analyze other arrays of the same layout. Saving the grid file in the same directory as the image is recommended, because this is where the software looks when it needs a grid file.

26 Benefits to Using Grid Mode: If no gene list is available, a grid can be created de novo. If a gene list is selected, Feature Extraction uses the layout information and annotation from the gene list to grid the microarray. Users can interactively adjust the main grid, subgrids, and spots, and preview spot centroids. Users can select spots to ignore in the analysis. Grid files can feature extract images that are too rotated to be used with Agilent design files (*.xml). (Menu shown on slide: Grid Mode, Adjust Main Grid, Adjust Subgrid, Adjust Spot, Skew Subgrid, Preview Spot Centroids.)

27 Demo: How to Grid a Microarray

28 Set Design File Search Path (Agilent Microarrays Only): Feature Extraction checks the design file search path to find the microarray design file. Click Tools > Preferences > Feature Extraction > Design File Search Path. Click Browse in the “Configure Directory Path for finding Design Files” dialog box. Locate the directory containing the design file in the “Browse for Folder” dialog box. Click Add in the “Configure Directory Path for finding Design Files” dialog box.

29 How to Manually Specify a Grid/Design File: If Feature Extraction cannot find a grid file or design file, the “Load Grid/Design File” dialog box appears. Browse for the grid or design file and then click the “Load” button. To avoid having to manually specify a grid or design file, do the following: select the preferred search order for the Agilent grid or design file; if a design file is to be used, make sure the design file search path has been properly set; if a grid file is to be used, make sure the grid file is saved in the same path as the TIFF image file.

30 Feature Extraction Input Files: Required input files: a TIFF image of an Agilent or non-Agilent microarray scanned on an Agilent scanner, and an array design file that describes the layout of probes and probe annotation. The array design file is either a design file (.xml), for Agilent microarrays, or a grid file (_grid.csv, _feat.csv), for Agilent and non-Agilent microarrays. Optional input file: a printing file (cDNA microarrays only), which contains cDNA clones that failed printing QC and are to be ignored in the analysis. The location of the printing file needs to be set in the Design File Search Path.

31 Feature Extraction Output Files: GEML: expression data in the GEML 1.0 format; can be exported to Rosetta Resolver and Luminator software. MAGE: expression data in the MAGE-ML format; can be exported to Rosetta Resolver 4.0 and a future version of Luminator. JPEG: compressed version of the image file in JPEG format. Tab-delimited text: expression data in tab-delimited text format. Visual Result: “shapes” annotation generated by and viewed in Feature Extraction; shows the feature size, local background region, raw signals, log ratio, gene name, and non-uniformity and population outlier flags; allows subsequent viewing of the “shapes” annotation without having to re-extract the scan image.

32 MAGE-ML Result Files: Feature Extraction result files can be saved in MAGE-ML format. Microarray Gene Expression Markup Language (MAGE-ML) is a language designed to describe and communicate information about microarray-based experiments. MAGE-ML is based on XML and can describe microarray designs, microarray manufacturing information, microarray experiment setup and execution information, gene expression data, and data analysis results. It is a format accepted by major public microarray databases such as ArrayExpress (EBI) and GEO (NIH).

33 Exporting Files to Resolver/Luminator (Intranet): FTP transfer of files: GEML or MAGE-ML results; TIFF or JPEG images. FTP settings: Destination: enter the name of the host where Resolver or Luminator resides. FTP port: enter the FTP port number. User name: enter the user’s name. Password: enter the password.

34 Demo: Feature Extraction and the Algorithm Modules

35 Running Feature Extraction (screenshot callouts): Barcode, design ID, filename; array dimensions, array pattern, feature size, feature layout, probe names, etc.; QC information, flagged features.

36 Default Parameters in Feature Extraction Modules: Default parameters are loaded based on the type of microarray, the design or grid file, and settings changes saved during a run. Default check boxes are marked (on) or (off); default radio buttons are marked (*); default numbers are displayed in parentheses. Default parameters are only recommended when Agilent’s complete system is used (i.e., Agilent labeling and hybridization protocols, Agilent microarrays, and an Agilent scanner). If there is any deviation from Agilent’s complete system, users need to carry out experiments to fine-tune the parameters. If parameter numbers appear in red, they differ from the values optimized for the Agilent microarray system, and users need to carry out experiments to optimize these values for their microarrays and protocols.

37 Where can you find the default parameters

38 Feature Extraction Algorithms

39 FindSpots Algorithm – Grid Initialization: Locates all spots on the microarray. Finds corner spots for grid placement. Places the initial (nominal) grid based on the location of corner spots, spot size, and inter-spot distances obtained from the design file. Finds bright spots (based on high intensity). Adjusts the grid according to the location of bright spots. Finds dim spots by interpolating their location from the adjusted grid.

40 FindSpots – Deviation Limit: The deviation limit (Dev Limit) restricts how far a spot can deviate from the nominal grid position and still be called “found”. The default deviation limit is loaded automatically. The user can change the deviation limit to a value between 0 and 70 microns (Agilent arrays). Setting the deviation limit too low can cause spots to be missed. Setting it too high can cause spots in adjacent rows and columns to be swapped.

41 SpotAnalyzer Algorithm: Determines which pixels represent the spot and the local background. Spot size is determined by the CookieCutter or WholeSpot method (optional: calculate spot size). The local background area is determined by the radius distance. Rejects outlier pixels in the spot and local background based on the Standard Deviation method or the Inter Quartile Range method (default; more robust). Flags a feature as saturated if > 50% of the pixels remaining after outlier rejection have intensities above 65502.
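The saturation rule on this slide can be sketched in a few lines. This is a minimal illustration, not Agilent's implementation: the function name `is_saturated` and its parameters are hypothetical, and only the 50% fraction and the 65502 threshold come from the slide.

```python
SATURATION_LEVEL = 65502  # intensity threshold quoted on the slide


def is_saturated(inlier_pixels, level=SATURATION_LEVEL, fraction=0.5):
    """Return True if more than `fraction` of the inlier pixels exceed `level`.

    `inlier_pixels` is assumed to be the list of pixel intensities that
    remain after outlier rejection (hypothetical input format).
    """
    if not inlier_pixels:
        return False  # assumption: an empty feature is not flagged
    n_above = sum(1 for p in inlier_pixels if p > level)
    return n_above / len(inlier_pixels) > fraction
```

Note that the comparison is strict: a feature with exactly half of its pixels above the threshold is not flagged under this reading of "> 50%".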

42 SpotAnalyzer – Spot Size and Spot Analysis Methods: Spot size is calculated when enabled in the UI. Spot size determines the number of pixels that are chosen to represent a feature. The spot size is reported with the final results as "SpotRadiusX" and "SpotRadiusY". Spot analysis methods use the spot size to define features. For the CookieCutter method, spot size is obtained from the XML design file or from the calculation that the user selects on the SpotAnalyzer tab. For the WholeSpot method, spot size is obtained from the spot size calculation that the user selects on the SpotAnalyzer tab.

43 SpotAnalyzer – What Defines Spot Size and Local Background (figure panels: CookieCutter; WholeSpot)

44 SpotAnalyzer – Determination of Local Background Radius: Minimum local background radius (default). Adjusted local background radius (maximum of n = 4), where n ranges from a minimum of 1 to a maximum of 4 sets of closest neighbors: n = 1 has at least 8 nearest neighbors; n = 2 has at least 24; n = 3 has at least 48; n = 4 has at least 80. (Figure: a feature labeled Self with its 24 nearest neighbors, n = 2, and the maximum radius. Example: CEILING [3.2] = 4.)
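The neighbor counts listed for n = 1 to 4 (8, 24, 48, 80) match the number of cells in a square block of (2n + 1) features per side, minus the feature itself. The sketch below only reproduces that arithmetic; the function name is hypothetical, and the square-block reading is an observation about the numbers, not a statement of how FE computes the radius.

```python
def nearest_neighbor_count(n):
    """Number of neighbors in n rings around a feature on a square lattice.

    Assumes a (2n + 1) x (2n + 1) block centered on the feature, which
    reproduces the counts quoted on the slide for n = 1..4.
    """
    if not 1 <= n <= 4:
        raise ValueError("the slide limits n to the range 1..4")
    side = 2 * n + 1
    return side * side - 1  # exclude the feature itself ("Self")


if __name__ == "__main__":
    for n in range(1, 5):
        print(n, nearest_neighbor_count(n))
```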

45 SpotAnalyzer – Pixel Rejection Based on Standard Deviation: Pixel outlier rejection is applied to features and backgrounds in both colors. The range of +/- 2 SD encompasses ~95% of the distribution. Feature intensity is the mean signal of the inlier pixels.
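A minimal sketch of the standard-deviation rule, assuming a single pass in which pixels farther than 2 SD from the mean are dropped and the feature intensity is the mean of what remains. The function names are hypothetical, and whether FE iterates the rejection or uses the sample or population SD is not stated on the slide.

```python
from statistics import mean, stdev


def reject_outliers_sd(pixels, n_sd=2.0):
    """Keep pixels within mean +/- n_sd * SD (single pass, sample SD assumed)."""
    center = mean(pixels)
    spread = stdev(pixels)
    low, high = center - n_sd * spread, center + n_sd * spread
    return [p for p in pixels if low <= p <= high]


def feature_intensity(pixels, n_sd=2.0):
    """Mean signal of the inlier pixels."""
    return mean(reject_outliers_sd(pixels, n_sd))
```

For example, a feature whose pixels sit near 100 with one bright artifact at 1000 ends up with an intensity close to 100 once the artifact is rejected.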

46 SpotAnalyzer – Pixel Rejection Based on Inter Quartile Range: The Interquartile Range (IQR) is the range of intensities, under a Gaussian distribution, between the 25th and 75th percentiles. Pixels of the feature and background are rejected if they fall outside the lower and upper rejection boundaries (the slide's equation is not reproduced in this transcript). When using 1.42*IQR, ~99% of the distribution is encompassed between the lower and upper rejection boundaries.
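The slide's boundary equation is missing from the transcript. The sketch below assumes the usual form, Q1 - 1.42*IQR and Q3 + 1.42*IQR; for a Gaussian this places the boundaries at roughly 2.59 SD from the mean, which agrees with the ~99% coverage quoted. The function name and the quartile interpolation method are assumptions.

```python
from statistics import quantiles


def reject_outliers_iqr(pixels, factor=1.42):
    """Keep pixels between Q1 - factor*IQR and Q3 + factor*IQR.

    The boundary form is an assumption; only the 1.42 multiplier and the
    ~99% coverage come from the slide.
    """
    q1, _median, q3 = quantiles(pixels, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - factor * iqr, q3 + factor * iqr
    return [p for p in pixels if low <= p <= high]
```

Because quartiles are barely affected by a few extreme pixels, the boundaries stay stable when an artifact is present, which is consistent with the earlier slide calling this method more robust than the SD method.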

47 PolyOutlierFlagger Algorithm: Flags features and backgrounds as non-uniformity outliers based on statistical deviations from the Agilent noise model (Polynomial Variance Model: expected variances from array manufacturing, wet lab chemistry, and scanner noise). Flags features and backgrounds as population outliers using the IQR method, applied to intra-array replicate features and the associated background areas.

48 PolyOutlierFlagger – NonUniformity Outlier: Expected variance (equation not reproduced in this transcript): x is the mean signal of the feature or background minus the minimum signal of a feature or background on the array; A (Gaussian) is the variance estimated from labeling and feature synthesis; B (Poisson) is the variance estimated from scanning measurement or counting error; C (Constant) is the variance expected from electronic scanner noise and glass background noise. Measured variance (equation not reproduced in this transcript): n = number of inlier pixels in the feature or background; X = raw pixel intensity in the feature or background; X bar = raw mean signal of the feature or background. A feature or background is flagged as a non-uniformity outlier based on a comparison of the two variances (condition not reproduced in this transcript), where CI is the confidence interval calculated from the chi-square distribution.
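Because the equations on this slide did not survive, the sketch below is a reconstruction under stated assumptions rather than the documented formulas. It assumes the expected variance is the polynomial A*x^2 + B*x + C suggested by the term names (Gaussian, Poisson, Constant), that the measured variance is the sample variance of the inlier pixels, and that a flag is raised when the measured variance exceeds the expected variance scaled by a CI-derived multiplier. All function and parameter names are hypothetical.

```python
def expected_variance(x, a, b, c):
    """Assumed polynomial noise model: A*x^2 (Gaussian) + B*x (Poisson) + C."""
    return a * x * x + b * x + c


def measured_variance(pixels):
    """Sample variance of the inlier pixels (n - 1 denominator assumed)."""
    n = len(pixels)
    x_bar = sum(pixels) / n
    return sum((p - x_bar) ** 2 for p in pixels) / (n - 1)


def is_nonuniformity_outlier(pixels, x, a, b, c, ci_multiplier):
    """Flag when measured variance exceeds the scaled expected variance.

    `ci_multiplier` stands in for the chi-square confidence-interval term
    mentioned on the slide; how FE derives it is not shown here.
    """
    return measured_variance(pixels) > ci_multiplier * expected_variance(x, a, b, c)
```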

49 PolyOutlierFlagger – Population Outlier: Performs population statistics on features and background areas if the microarray has the minimum number of replicate features. A feature or background is flagged as a population outlier if it falls outside the lower and upper rejection boundaries (the slide's equation is not reproduced in this transcript). When using 1.42*IQR, ~99% of the distribution is encompassed between the lower and upper rejection boundaries.

50 PolyOutlierFlagger (figure): Pink triangles are features flagged as NonUnifOL

51 PolyOutlierFlagger: Non-uniformity outliers are indicated by a colored inner ring (Feature) or a colored outer ring (Local_BG). If a flagged feature appears “uniform”, try changing the color scales or looking at the single-channel window.

52 BGSub Algorithm: Estimates and corrects for systematic biases in data arising from substrate fluorescence, non-specific binding to the substrate, possible biases introduced during scanning, and artifacts from hyb and wash. Determines if the feature signal is significant compared to the background. Applies spatial detrend to correct for systematic signal gradients on the array. Adjusts the background globally (to a user-defined value) to correct for under- or over-estimation of the background.

53 BGSub – Background Subtraction Methods: No background subtraction: this method does NOT subtract the background signal from the feature signal; the feature raw signal (MeanSignal) is passed on to spatial detrend (if turned on); if the “no background subtraction” method is selected, then by default the background is not adjusted globally. Local method: local background (Radius method). Global methods: average of all background areas; average of negative control features; minimum signal (feature or background); minimum signal (feature) on the array, which approximates a simulated negative control.

54 BGSub – Order of Background Correction: The analysis flow for background correction is in this order: (1) background subtraction method; (2) spatial detrend, if it is turned on; (3) feature significance test; (4) adjust background globally, if it is turned on. The feature signal is passed on to and processed by the next method that is available.

55 BGSub – Spatial Detrend Algorithm: Decreases the contribution of any systematic signal gradient on the array to the “foreground” signal. Estimates the surface of the “foreground” signal by picking the dimmest 1-2% of the array feature intensities. The “foreground” signal is the portion of the feature signal that is not related to the intended signal from dye-labeled target complementary to the probes on the feature. SpatialDetrendSurfaceValue is determined for each feature per channel.

56 The Problem – Differential Expression Gradient (figure: FE, default parameters; up-regulated and down-regulated features)

57 The Cause – Differential Expression Gradient: Regional variations in the “foreground” are present on the microarrays. In the previous version of FE (v.7.1.1), the background subtraction method did not adequately measure these variations. Background was underestimated in some regions of the microarray. Consequently, log ratios and differential expression calls were inaccurate.

58 The Approach – New Background Estimation Method: The previous FE default is local background subtraction, which estimates the non-specific signal on a feature based on the intensity of the area between features. The new FE default is no background subtraction with spatial detrend, which estimates the non-specific signal based on the dimmest 1% of feature intensities. This baseline is estimated regionally to account for variation.

59 Spatial Detrend – Estimating Foreground Intensity: A “FilteredSet” of features is identified in a process known as the Low Pass Filter: features with the dimmest 1% of feature intensity per window are selected. If the “no background subtraction” option is selected, feature intensity is the raw mean signal; if a background subtraction option is selected, feature intensity is the background-subtracted signal. The window size is 10 columns x 10 rows of features, and the window moves horizontally and vertically across the array in increments of 5. The foreground surface is estimated from the “FilteredSet”: a 2-D Loess algorithm fits a smooth surface through the “FilteredSet” feature intensities using the 20% nearest neighborhood of filtered points. For features NOT in the “FilteredSet”, a 2-D Loess algorithm with a similar neighborhood size of filtered points is used to predict the surface value for each feature. Lastly, SpatialDetrendSurfaceValue is subtracted from MeanSignal (or BGSubSignal, if BG subtraction is selected) for each feature.
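The Low Pass Filter step (moving window, dimmest 1% per window) can be sketched as follows. This covers only the selection of the FilteredSet, not the 2-D Loess fit. The function name, the row-major list-of-lists input, the truncation of windows at the array edge, and the rule of keeping at least one feature per window are all assumptions made for illustration.

```python
import math


def low_pass_filter(intensity, window=10, increment=5, percent=1.0):
    """Return the set of (row, col) positions in the FilteredSet.

    `intensity` is a 2-D list of feature intensities (rows x columns).
    Each window of `window` x `window` features contributes its dimmest
    `percent` of features; windows advance by `increment` in both directions.
    """
    n_rows, n_cols = len(intensity), len(intensity[0])
    selected = set()
    for r0 in range(0, n_rows, increment):
        for c0 in range(0, n_cols, increment):
            # Collect the features in this window (truncated at the edges).
            cells = [
                (intensity[r][c], r, c)
                for r in range(r0, min(r0 + window, n_rows))
                for c in range(c0, min(c0 + window, n_cols))
            ]
            # Number of features to keep: percent of the window, at least 1.
            keep = max(1, math.ceil(len(cells) * percent / 100.0))
            for _value, r, c in sorted(cells)[:keep]:
                selected.add((r, c))
    return selected
```

With the defaults, a full 10 x 10 window holds 100 features, so each window contributes its single dimmest feature. Because the increment is half the window size, neighboring windows overlap, which samples the array more evenly than non-overlapping windows would.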

60 Identify Features in FilteredSet by Low Pass Filter: Default: Window = 10, Increment = 5, Percentage = 1

61 Low Pass Filter Schematic – Effect of Moving Window on Sampling (figure panels: Moving Window; No Moving Window)

62 Low Pass Filter Schematic: Features from Low Pass Filter – Raw Green Intensities

63 2D Loess Fit – Estimate Foreground Surface to All Features (figure panels: gMeanSignal; gSpatialDetrendSurfaceValue)

64 The Solution – No Differential Expression Gradient (figure: FE, default parameters; up-regulated and down-regulated features)

65 BGSub – Feature Significance and Well Above BG: Feature Significance Test: calculates the significance of the feature signal vs. the background signal (local or global) using a 2-sided Student’s t-test (implemented as an incomplete Beta function approximation). A feature gets a Boolean flag of 1 under the IsPositiveAndSignif column (in the FE result file) if the calculated p-value is less than the user-defined max p-value. Well Above Background Test: if the background-subtracted signal is “well above” background as calculated by the slide's equation (not reproduced in this transcript), and the feature passes the IsPositiveAndSignif test, then the feature gets a Boolean flag of 1 under the IsWellAboveBG column in the Feature Extraction result file.

66 BGSub – Global Background Adjustment: Background subtraction errors arise from inaccuracies in background estimation. Basic ideas behind global background adjustment: it adjusts for over- or under-estimation of the background in one channel relative to the other channel; it corrects the “hook” effect at the low end of the intensity scale; it applies background correction using a curve fitting method to adjust the initial background-subtracted intensities.

67 Adjust Background Globally to User-Specified Value: The Global Background Adjust algorithm is the same as in FE7.1.1: it evaluates the background-subtracted signal and finds a rank-consistent set of features with low signal, then finds a constant in both channels that moves the median of these signals to zero. In FE 7.5.1, the user can enter a constant value between 0 and 500 to “pad” all feature signals to that value. This will have the effect of compressing log ratios, but will decrease the variability (SD) in the log ratio between inter- and intra-array replicates. Note: The “pad” is an exploratory tool for variance stabilization. Customers are advised NOT to use the “pad” for production purposes. Reference: “Transformations…What For…Which One” by W. Huber

68 How is Adjust Background Globally Value Used: If the red signal vs. green signal plot of rank-consistent features has a slope > 1, the “pad” value chosen by the user is assigned to the green channel. Example with pad value = 50 and slope = 1.2: a value of 50 is added to the green background-subtracted signal of all features, and a value of (50*1.2) = 60 is added to the red background-subtracted signal of all features. Example with pad value = 50 and slope = 0.5: a value of 50 is added to the red background-subtracted signal of all features, and a value of (50/0.5) = 100 is added to the green background-subtracted signal of all features.
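The two worked examples on this slide follow one rule: the channel with the smaller signals receives the pad value and the other channel receives the pad scaled by the slope. A minimal sketch, with a hypothetical function name and return order:

```python
def pad_offsets(pad, slope):
    """Return (red_offset, green_offset) for a given pad value and slope.

    `slope` is the slope of the red vs. green plot of rank-consistent
    features. For slope > 1 the pad goes to green and red gets pad * slope;
    otherwise the pad goes to red and green gets pad / slope.
    """
    if slope <= 0:
        raise ValueError("slope must be positive")
    if slope > 1:
        return pad * slope, pad
    return pad, pad / slope


if __name__ == "__main__":
    print(pad_offsets(50, 1.2))  # red gets about 60, green gets 50
    print(pad_offsets(50, 0.5))  # red gets 50, green gets 100
```

Either way, the smaller of the two offsets equals the pad value the user entered.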

69 BGSub – Features Used for Global BG Adjustment: Select a suitable subset of the entire dataset (probes) for applying the global adjustment algorithm. Select features with no or negligible differential expression (i.e., Rank Consistency Filter: features along the central tendency line (red) of the distribution). Basic filters for feature selection: 1. Control type = 0. 2. Non Population Outlier. 3. Non Non-Uniformity Outlier. 4. Pass Rank Consistency Filter. (Figure axes: r(MeanSignal) vs. g(MeanSignal).)

70 Identify Features along the Central Tendency Line – Rank Consistency Filter: (Table columns: Feature Number; Intensity_R, I_R; Rank_R, ρ_R; Intensity_G, I_G; Rank_G, ρ_G. Plot axes: R vs. G.) Compute a correlation strength per feature: Transform(Intensity) → Rank; Correlation Strength per feature = |ρ_R - ρ_G| / Σ(Features) ≤ τ, where τ is the threshold percentile. If you compare the rank of a given feature in the R and G channels, the ranks should be within τ percentile. Examples: a feature should be correlated in the R and G channels within 5%ile → τ = 0.05; a feature should be correlated in the R and G channels within 15%ile → τ = 0.15.
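The rank consistency test above can be sketched as follows: rank each feature by intensity within each channel, then keep features whose rank difference, divided by the number of features, is at most τ. The function names are hypothetical, and the handling of tied intensities (ordinal ranks in sorted order) is an assumption.

```python
def ranks(values):
    """Ordinal ranks (1 = dimmest); ties are broken by position (assumption)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    rank = [0] * len(values)
    for position, index in enumerate(order, start=1):
        rank[index] = position
    return rank


def rank_consistent(red, green, tau=0.05):
    """Indices of features with |rank_R - rank_G| / N <= tau."""
    n = len(red)
    rank_r, rank_g = ranks(red), ranks(green)
    return [i for i in range(n) if abs(rank_r[i] - rank_g[i]) / n <= tau]
```

A feature that is bright in one channel and dim in the other has a large rank difference and is excluded, which is how the filter removes differentially expressed features from the set.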

71 BGSub – Calculation of Global BG Adjust Values: The algorithm determines the offset in the red and green channels using the features near the central tendency of the data, especially in the lower intensity range. Steps shown on the R vs. G plot (origin 0,0): 1. Identify features that pass the Rank Consistency Filter. 2. Below X%ile cutoff. 3. Median fit to distribution (orange). 4. Add negative controls. 5. Y%ile cutoff. 6. Compute IR Median and IG Median; M’ = IR Median / IG Median; M’ is projected to the median fit line, giving IG Median_proj = G BGOffset = gBGAdjust and IR Median_proj = R BGOffset = rBGAdjust.

72 BGSubSignal Calculation: BGSubSignal = MeanSignal – BGUsed, where BGSubSignal and BGUsed depend on the type of background method and the settings for spatial detrend and global background adjust.

73 BGSub – Before Global Background Adjustment: After background subtraction, a green or red bias may exist at low signal intensity. If this bias is uncorrected, the log ratio vs. signal plot of a “self” array will not be symmetric about the log ratio axis.

74 BGSub – After Global Background Adjustment: The background adjustment algorithm corrects the bias in both the red and green channels. The resulting log ratio vs. signal plot is symmetrical around the log ratio axis for a “self” array.

75 New Default Parameters (the first three settings are on the BGSub tab; the last is on the Ratio tab)
Array Type | Background Subtraction Method* | Spatial Detrend | Adjust Background Globally | Auto-estimate Additive Errors**
InSitu (Agilent) | No background subtraction | On | Off | On
cDNA (Agilent) | No background subtraction | On | Off | On
8x Format | Local background subtraction | Off, On (only two of the three settings survive in this transcript)
Grid Files | Minimum signal from feature | Off | On (to 0) | Off

76 DyeNorm Algorithm: Estimates and corrects for dye bias arising from systematic variation such as differences in labeling efficiency between the two dyes and differences in power settings of the two lasers. Selects features used as the normalization set to evaluate the dye bias (optional: omit features with background PopOL from the normalization set). Computes dye normalization factors and corrects dye bias using one of: Linear (global method); Linear&LOWESS (LOWESS method preceded by the linear method); LOWESS (local or non-parametric method).

77 Page 77 Feature Extraction 7.5 Customer Training Updated 08/18/04 DyeNorm – Methods for Normalization Feature Selection Selects a set of normalization features to evaluate the dye bias.
Rank Consistency Filter (Default): Use features falling within the central tendency of the data, having consistent trends between the red and green channels (“real-time housekeeping genes”).
Use all significant, non-control, and non-outlier features: IsPosAndSignif = 1 for each channel; ControlType = 0 for each channel; IsFeatNonUnifOL, IsFeatNonPopnOL, and IsSaturated = 0 for each channel.
Use a list of normalization genes: housekeeping genes or genes that should not be differentially expressed.
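The second selection method (all significant, non-control, non-outlier features) can be sketched as a simple flag filter. This is a minimal illustration only: each feature is assumed to be a dictionary, the "r"/"g" channel prefixes and the single ControlType key are hypothetical naming choices, and the flag names are taken from the slide as written.

```python
# Sketch of the "significant, non-control, non-outlier" selection.
# The dictionary layout and the r/g key prefixes are assumptions for illustration.
OUTLIER_FLAGS = ("IsFeatNonUnifOL", "IsFeatNonPopnOL", "IsSaturated")


def is_normalization_feature(feature):
    """Return True if the feature passes every flag test in both channels."""
    if feature["ControlType"] != 0:
        return False
    for channel in ("r", "g"):
        if feature[channel + "IsPosAndSignif"] != 1:
            return False
        for flag in OUTLIER_FLAGS:
            if feature[channel + flag] != 0:
                return False
    return True


def select_normalization_features(features):
    """Keep only the features usable for estimating the dye bias."""
    return [f for f in features if is_normalization_feature(f)]


def make_feature(**overrides):
    """Build a feature that passes all tests, then apply overrides."""
    feature = {"ControlType": 0}
    for channel in ("r", "g"):
        feature[channel + "IsPosAndSignif"] = 1
        for flag in OUTLIER_FLAGS:
            feature[channel + flag] = 0
    feature.update(overrides)
    return feature


features = [
    make_feature(),                    # passes
    make_feature(gIsSaturated=1),      # saturated in green
    make_feature(ControlType=1),       # control feature
    make_feature(rIsPosAndSignif=0),   # not significant in red
]
selected = select_normalization_features(features)
```

Only the first of the four example features would be kept for normalization.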

78 Page 78 Feature Extraction 7.5 Customer Training Updated 08/18/04 Identify Features along the Central Tendency Line – Rank Consistency Filter
[Table columns: Feature Number, Intensity_R (I_R), Rank_R, Intensity_G (I_G), Rank_G; figure axes: R vs. G]
Compute a correlation strength per feature: Transform(Intensity) → Rank
Correlation strength per feature = |Rank_R − Rank_G| / N(Features) ≤ threshold, where threshold is the threshold percentile.
If you compare the rank of a given feature in the R and G channels, the ranks should be within the threshold percentile of each other.
Example: A feature should be correlated in R & G channels within 5%ile → threshold = 0.05. A feature should be correlated in R & G channels within 15%ile → threshold = 0.15.
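The rank comparison can be sketched in a few lines. This is an illustrative reading of the slide, not the FE implementation: it assumes the rank difference is divided by the total number of features, that a feature passes when the result does not exceed the threshold, and that ties are broken by input order.

```python
def ranks(values):
    """Rank values from 1 (lowest) to n (highest); ties broken by input order."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for position, index in enumerate(order, start=1):
        result[index] = position
    return result


def rank_consistent(red, green, threshold=0.05):
    """Return one boolean per feature: True if its red and green ranks agree
    to within `threshold` (a fraction of the number of features)."""
    n = len(red)
    red_ranks = ranks(red)
    green_ranks = ranks(green)
    return [abs(r - g) / n <= threshold for r, g in zip(red_ranks, green_ranks)]


red = [10.0, 20.0, 30.0, 40.0]
green = [12.0, 18.0, 45.0, 33.0]
strict = rank_consistent(red, green, threshold=0.05)
loose = rank_consistent(red, green, threshold=0.25)
```

With only four features, swapping two neighbouring ranks already shifts a feature by 25% of the array, so the last two features fail the 0.05 threshold and pass the 0.25 threshold.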

79 Page 79 Feature Extraction 7.5 Customer Training Updated 08/18/04 Features selected using the Rank Consistency Filter

80 Page 80 Feature Extraction 7.5 Customer Training Updated 08/18/04 DyeNorm – Linear Normalization Method Assumes dye bias is NOT intensity-dependent. A global approach to dye normalization: forces the average log ratio to zero. Problem with this approach: not adequate for cases where bias is intensity-dependent. A global constant is determined separately for the red and green channels. LinearDyeNormFactor is calculated such that the geometric mean of the normalization features equals 1000. For example, if the geometric mean of the normalization features is 250, then the LinearDyeNormFactor is 4.
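The worked example (geometric mean 250, factor 4) can be reproduced with a short sketch. The function name and the use of natural logarithms to compute the geometric mean are illustrative choices; the target of 1000 comes from the slide.

```python
import math


def linear_dye_norm_factor(signals, target=1000.0):
    """Factor that scales the geometric mean of `signals` to `target`.

    `signals` are the positive signals of the normalization features
    for one channel.
    """
    mean_log = sum(math.log(s) for s in signals) / len(signals)
    geometric_mean = math.exp(mean_log)
    return target / geometric_mean


# Geometric mean of 125 and 500 is sqrt(125 * 500) = 250.
factor = linear_dye_norm_factor([125.0, 500.0])
```

Each channel gets its own factor, computed from that channel's normalization features.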

81 Page 81 Feature Extraction 7.5 Customer Training Updated 08/18/04 DyeNorm – LOWESS Normalization Method LOWESS is locally weighted linear regression. Handles data that has intensity-dependent dye bias. Fits the locally weighted linear regression curve to the normalization features (chosen by the selection method). Determines the amount of dye bias from the curve for each feature’s intensity. Each feature gets a different LOWESS dye normalization factor for each channel.
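To make "locally weighted linear regression" concrete, here is a minimal sketch of a single LOWESS evaluation. It uses the textbook tricube weighting without robustness iterations; the `fraction` parameter, the slight bandwidth padding, and the choice of what to plot on x and y are assumptions for illustration and are not taken from the FE software.

```python
import math


def lowess_fit(x, y, x0, fraction=0.5):
    """Evaluate a locally weighted linear fit of y on x at the point x0.

    The nearest `fraction` of the points get tricube weights that fall
    off with distance from x0; all other points get weight zero.
    """
    n = len(x)
    k = max(2, int(math.ceil(fraction * n)))
    distances = sorted(abs(xi - x0) for xi in x)
    # Pad the bandwidth slightly so the k-th neighbour keeps a positive weight.
    bandwidth = distances[k - 1] * 1.001 + 1e-12

    sw = swx = swy = swxx = swxy = 0.0
    for xi, yi in zip(x, y):
        u = abs(xi - x0) / bandwidth
        w = (1.0 - u ** 3) ** 3 if u < 1.0 else 0.0
        sw += w
        swx += w * xi
        swy += w * yi
        swxx += w * xi * xi
        swxy += w * xi * yi

    denominator = sw * swxx - swx * swx
    if abs(denominator) < 1e-12:
        # All weighted points share one x value: fall back to a weighted mean.
        return swy / sw
    slope = (sw * swxy - swx * swy) / denominator
    intercept = (swy - slope * swx) / sw
    return intercept + slope * x0


xs = [float(i) for i in range(10)]
ys = [2.0 * xi + 1.0 for xi in xs]
fitted = lowess_fit(xs, ys, 4.5)
```

On exactly linear data the local fit reproduces the line, so `fitted` is 10 up to rounding. A smaller `fraction` follows local structure more closely, and a larger one gives a smoother curve.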

82 Page 82 Feature Extraction 7.5 Customer Training Updated 08/18/04 DyeNorm – Calculation of DyeNormFactor (DNF) For Linear dye normalization method: For Linear&LOWESS dye normalization method: For LOWESS dye normalization method: where n is # features in the normalization set (i.e. features with IsNormalization = 1)

83 Page 83 Feature Extraction 7.5 Customer Training Updated 08/18/04 DyeNorm – Linear vs Non-Linear Fit
[Figure: Y vs. X scatter with fitted line]
Linear Fit: y = Slope*x + Intercept + scatter, i.e. y = m*x + c + e
Assumptions in Linear Fit:
1. Scatter is Gaussian about a Mean = 0
2. Standard Deviation of scatter about a point on the curve is independent of the x-variable.

84 Page 84 Feature Extraction 7.5 Customer Training Updated 08/18/04 DyeNorm – LOWESS Locally Weighted Linear Regression [Figure: Y vs. X plot]

85 Page 85 Feature Extraction 7.5 Customer Training Updated 08/18/04 DyeNorm – LOWESS Changing the Granularity of the Fit [Figure: two Y vs. X plots]

86 Page 86 Feature Extraction 7.5 Customer Training Updated 08/18/04 DyeNorm – Calculating Dye Norm Signal Dye normalized signal is calculated per feature per channel

87 Page 87 Feature Extraction 7.5 Customer Training Updated 08/18/04 Ratio Algorithm Calculates the log ratio of red signal over green signal: Log(rProcessedSignal/gProcessedSignal). Calculates significance of log ratio: log ratio error; p-value. Determines if feature is differentially expressed according to the error model used. Auto-estimates additive error values. Applies surrogate values to dye normalized signals for a more accurate and reproducible log ratio.

88 Page 88 Feature Extraction 7.5 Customer Training Updated 08/18/04 Ratio – Error Models Three error models are available to estimate random error on the log ratio: Agilent’s propagated error method based on pixel-level statistics; Rosetta’s Universal Error Model (UEM); the more conservative error estimate between propagated error and UEM (Default). The p-value calculated is based upon the probability of log ratio = 0. Recommend using the more conservative error estimate between propagated error and UEM.

89 Page 89 Feature Extraction 7.5 Customer Training Updated 08/18/04 Ratio – Propagated Error vs Universal Error
Propagated Error Model: Measures the error on the log ratio by propagating the pixel-level error from calculations made in the analysis (e.g. raw signal and background subtraction). Good at capturing the error at the low intensity level. Underestimates error at the mid to high intensity level.
Universal Error Model: Measures the expected error between the red and green channels using the additive and multiplicative errors. Additive: constant noise term that dominates at low intensity level. Multiplicative: intensity-scaled term that dominates at high intensity level. Good at capturing the error at mid to high intensity level. Underestimates error in noisy features, especially at low signal ranges.
Most conservative estimate of Propagated Error Model and Universal Error Model (Recommended): Evaluates both error models and reports the higher (more conservative) p-value of the two error models.

90 Page 90 Feature Extraction 7.5 Customer Training Updated 08/18/04 Auto-estimate Additive Error Values In FE 7.1.1, a default additive error constant (25) is used for Agilent in-situ arrays processed using Agilent protocols and scanner. In FE 7.5.1, the additive error value is auto-estimated for each array per channel by looking at: the standard deviation of negative control features; the spatial variability of the spatial detrend surface. This is the RMS difference between each point on the surface and the mean of the surface. Note: Arrays with fewer than 500 features will use only negative control features to calculate the auto-estimated additive error because the surface cannot be fitted through a small number of data points. Auto-estimate of additive error should be used with the spatial detrend option turned on. Note: Selection of the spatial detrend option is independent of selection of the auto-estimate of additive error option. The spatial detrend surface will be determined for use by the auto-estimate, but it won’t be subtracted from the data if the “spatial detrend” option is NOT selected.
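The two ingredients of the auto-estimate can be computed as below. This sketch only produces the inputs; how FE combines them into a single additive error value is not stated on the slide and is deliberately left out. The function names, the use of the sample standard deviation, and the flat list representation of the surface are assumptions.

```python
import math
import statistics


def negative_control_sd(neg_control_signals):
    """Sample standard deviation of the negative control feature signals."""
    return statistics.stdev(neg_control_signals)


def surface_rms(surface_values):
    """RMS difference between each point on the detrend surface and its mean."""
    mean = sum(surface_values) / len(surface_values)
    squared = sum((v - mean) ** 2 for v in surface_values)
    return math.sqrt(squared / len(surface_values))


def additive_error_inputs(neg_control_signals, surface_values, n_features):
    """Collect the quantities the auto-estimate looks at for one channel.

    Arrays with fewer than 500 features use the negative controls only.
    """
    inputs = {"neg_control_sd": negative_control_sd(neg_control_signals)}
    if n_features >= 500:
        inputs["surface_rms"] = surface_rms(surface_values)
    return inputs


large = additive_error_inputs([1.0, 2.0, 3.0], [1.0, 2.0, 3.0], n_features=1000)
small = additive_error_inputs([1.0, 2.0, 3.0], [1.0, 2.0, 3.0], n_features=100)
```

For the small array only `neg_control_sd` is returned, mirroring the note about arrays with fewer than 500 features.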

91 Page 91 Feature Extraction 7.5 Customer Training Updated 08/18/04 Ratio – Use of Surrogates Log ratios are calculated from red and green dye normalized signals. Dye normalized signals cannot be used to calculate the log ratio if: BGSubSignal fails the IsPosAndSignif test; BGSubSignal is less than its background standard deviation (i.e. BGSDUsed). If either case occurs, a surrogate value is used instead of DyeNormSignal. The surrogate value is calculated as 1 SD of BG intensities x DyeNormFactor. For the local background method, SD of BG is at the pixel level of the local background. For the global background method, SD of BG is at the background population level on the array. If a surrogate is used, then a non-zero value is displayed in the SurrogateUsed column and ProcessedSignal = SurrogateUsed. If a surrogate is not used, then a zero value is displayed in the SurrogateUsed column and ProcessedSignal = DyeNormSignal.
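The surrogate decision for one feature in one channel can be sketched as follows. The function and parameter names are hypothetical; the two failure conditions and the surrogate formula (1 SD of background x DyeNormFactor) follow the slide.

```python
def processed_signal(bg_sub_signal, is_pos_and_signif, bg_sd_used,
                     dye_norm_signal, dye_norm_factor):
    """Return (ProcessedSignal, SurrogateUsed) for one feature and channel.

    SurrogateUsed is 0.0 when the dye normalized signal is kept,
    otherwise it holds the surrogate value.
    """
    needs_surrogate = (not is_pos_and_signif) or (bg_sub_signal < bg_sd_used)
    if needs_surrogate:
        surrogate = bg_sd_used * dye_norm_factor
        return surrogate, surrogate
    return dye_norm_signal, 0.0


strong = processed_signal(500.0, True, 10.0, 2000.0, 4.0)
weak = processed_signal(5.0, True, 10.0, 20.0, 4.0)
not_significant = processed_signal(500.0, False, 10.0, 2000.0, 4.0)
```

The strong feature keeps its dye normalized signal of 2000, while the other two fall back to the surrogate 10 x 4 = 40.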

92 Page 92 Feature Extraction 7.5 Customer Training Updated 08/18/04 Surrogates: If signal is around the background signal, use Background_SD * DyeNormFactor
Case 1: R/G. Both channels use DyeNormSignals; p-value and log ratio are calculated as usual. Log ratio error is calculated according to the error model chosen by the user.
Case 2: r/G. r = rSurrogateUsed, G = gDyeNormSignal; p-value and log ratio are calculated as usual. If r/G > 1, then FE software automatically sets LogRatio = 0 and pValueLogRatio = 1.
Case 3: R/g. R = rDyeNormSignal, g = gSurrogateUsed; p-value and log ratio are calculated as usual. If R/g < 1, then FE software automatically sets LogRatio = 0 and pValueLogRatio = 1.
Case 4: r/g. Both channels use surrogates; FE software automatically sets LogRatio = 0 and pValueLogRatio = 1.
For signals using surrogates, the g(r)ProcessedSignal is equal to the g(r)SurrogateUsed value, which is used to calculate the log ratio.
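The four cases can be expressed as one small function. This is a sketch under stated assumptions: the log is taken as base 10, the function name and return shape are hypothetical, and the regular p-value calculation is not reproduced. The second return value is the forced pValueLogRatio of 1 when the log ratio is set to zero, and None when the p-value would be computed as usual.

```python
import math


def log_ratio_with_surrogates(r_dye_norm, g_dye_norm,
                              r_surrogate_used, g_surrogate_used):
    """Return (LogRatio, forced_p_value) following the four surrogate cases.

    A surrogate argument of 0 means the channel uses its dye normalized signal.
    """
    r_is_surrogate = r_surrogate_used != 0
    g_is_surrogate = g_surrogate_used != 0
    r = r_surrogate_used if r_is_surrogate else r_dye_norm
    g = g_surrogate_used if g_is_surrogate else g_dye_norm

    if r_is_surrogate and g_is_surrogate:      # Case 4: r/g
        return 0.0, 1.0
    if r_is_surrogate and r / g > 1:           # Case 2: r/G above 1
        return 0.0, 1.0
    if g_is_surrogate and r / g < 1:           # Case 3: R/g below 1
        return 0.0, 1.0
    return math.log10(r / g), None             # Cases 1, 2, 3 as usual


case1 = log_ratio_with_surrogates(1000.0, 100.0, 0, 0)
case2_usual = log_ratio_with_surrogates(0.0, 100.0, 50.0, 0)
case2_forced = log_ratio_with_surrogates(0.0, 20.0, 50.0, 0)
case3_forced = log_ratio_with_surrogates(10.0, 0.0, 0, 50.0)
case4 = log_ratio_with_surrogates(0.0, 0.0, 50.0, 40.0)
```

The forced cases are the ones where a surrogate, which is only an upper bound on the real signal, would otherwise suggest a change in the direction of the weak channel.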

93 Page 93 Feature Extraction 7.5 Customer Training Updated 08/18/04 Ratio – pValue and Log Ratio Error Calculations Equation 1 Equation 2 xdev is deviation of LogRatio from 0. This is analogous to a signal to noise metric. xDev is displayed in the FEATURES table in FE result file.
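A common way to turn such a deviation into a two-sided p-value is sketched below. It assumes xDev = LogRatio / LogRatioError and Gaussian error, so that the p-value is erfc(|xDev| / sqrt(2)); whether this matches Equation 1 and Equation 2 exactly is an assumption, and the function name is hypothetical.

```python
import math


def p_value_from_log_ratio(log_ratio, log_ratio_error):
    """Return (xDev, p-value) assuming a Gaussian error on the log ratio.

    xDev is the log ratio expressed in units of its error; the p-value is
    the two-sided probability of a deviation at least this large when the
    true log ratio is 0.
    """
    xdev = log_ratio / log_ratio_error
    p_value = math.erfc(abs(xdev) / math.sqrt(2.0))
    return xdev, p_value


no_change = p_value_from_log_ratio(0.0, 0.1)
borderline = p_value_from_log_ratio(0.196, 0.1)
```

A log ratio of 0 gives a p-value of 1, and a deviation of about 1.96 error units gives a p-value close to 0.05.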

94 Page 94 Feature Extraction 7.5 Customer Training Updated 08/18/04 Feature Extraction Results

95 Page 95 Feature Extraction 7.5 Customer Training Updated 08/18/04 Feature Extraction Visual Results Click View > Extraction Results: View Results; View Outlier Only; Hide Outer Local BG Ring; Use Simple Colors. Click Help > Feature Extraction Output Quick Reference. Shape visual results can be viewed only with Feature Extraction. .shp files from v.7.1 cannot be opened with v and earlier

96 Page 96 Feature Extraction 7.5 Customer Training Updated 08/18/04 Feature Extraction Text Results

97 Page 97 Feature Extraction 7.5 Customer Training Updated 08/18/04 Known Issues Fixed in FE Known issues fixed in FE 7.5 are listed in the Release Note. Release Note 7.5 is on the installation CD and is downloadable from eRoom and EPI Warehouse.

98 Page 98 Feature Extraction 7.5 Customer Training Updated 08/18/04 Known Typographical Errors in FE Manual The following equations are correct and will be amended in version 1.1 of the FE manual (p. 216), which will be available on the Agilent website: [r,g]SpatialDetrendRMSFit; [r,g]SpatialDetrendRMSFilteredMinusFit

99 Page 99 Feature Extraction 7.5 Customer Training Updated 08/18/04 Visit our website for current info on Feature Extraction Software. Download latest: Software; 30-day License; Example Images; User Manual; Technical Notes. View software showcase.

