2 Outline. Concepts: suggested workflow, Studio concepts, Server concepts. Hands-on exercise (Microarray). Feature introduction.
3 Suggested workflow. [Diagram] Components: Array Studio (client side), Array Server (server side), Array Viewer. Raw data: Xpress data, Affymetrix CEL files, raw text/Excel files (stored in a local or shared folder). Server-side processing (optional): shared folder raw files, server projects. Search produces search results; Analysis produces analysis results. Central storage (stored on the server): projects, meta data, shared views, lists. Download: Array Studio projects (stored in a local or shared folder). Publish: projects. Share: shared views.
4 Studio concepts: solution. [Diagram: a solution containing Project1 (Data1 with View 1.1; Data2 with Views 2.1, 2.2, 2.3) and Project2 (Data3 with View 3.1).] Distributed project: save all data/lists in a folder (recommended for Exon Array/SNP/CNV). Simple project: save all data/lists in a single file (recommended for MicroArray/Taqman).
6 Studio concepts: solution. Data types: -Omic data (contains data matrix, design, and annotation) and Table data. List types: Variable list, Observation list, Row list, Column list, General list. [Tree diagram: Project, Data folder (-Omic data with Annotation and Design; Table data), Views, List folder (Lists).]
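To make the three blocks of an -Omic data object concrete, here is a minimal Python sketch. The class name, field names, and IDs are hypothetical illustrations, not Array Studio's internal format; the only thing taken from the slide is that -Omic data bundles a data matrix with a design and an annotation.

```python
from dataclasses import dataclass


@dataclass
class OmicData:
    """Hypothetical model: data matrix plus design and annotation tables."""
    variable_ids: list      # one per matrix row (e.g. probesets)
    observation_ids: list   # one per matrix column (e.g. chips)
    matrix: list            # list of rows, each a list of signal values
    design: dict            # observation ID -> design columns
    annotation: dict        # variable ID -> annotation columns

    def is_consistent(self) -> bool:
        """Check that the three blocks describe the same rows and columns."""
        rows_ok = len(self.matrix) == len(self.variable_ids)
        cols_ok = all(len(r) == len(self.observation_ids) for r in self.matrix)
        design_ok = set(self.design) == set(self.observation_ids)
        annotation_ok = set(self.annotation) == set(self.variable_ids)
        return rows_ok and cols_ok and design_ok and annotation_ok


data = OmicData(
    variable_ids=["probe_1", "probe_2"],
    observation_ids=["chip_A", "chip_B", "chip_C"],
    matrix=[[7.1, 7.3, 7.0], [9.8, 10.1, 9.9]],
    design={"chip_A": {"treatment": "control"},
            "chip_B": {"treatment": "DBP"},
            "chip_C": {"treatment": "DBP"}},
    annotation={"probe_1": {"Gene Symbol": "Egr1"},
                "probe_2": {"Gene Symbol": "Fos"}},
)
```

The consistency check mirrors why the design and annotation are attached to a dataset rather than stored separately: both are keyed by the matrix's own rows and columns.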
7 Studio concepts: user interface. [Screenshot labels: Data viewer, Solution Explorer, View controller, Details window.]
8 Workflows. Array Studio has workflows for CNV, SNP, MicroArray, Taqman, and Exon array.
9 Solution explorer. You can open multiple projects in the solution. Each project can contain multiple datasets. You can easily organize your data and lists by folders. You can rename any data/view/folder. Many context-sensitive functions are available by right-clicking. Commonly used right-click functions: Add view, Import design, Import annotation, New folder, Copy/paste views, Export, View audit trail, View source.
10 Data viewer. Views are different from graphs: they are fully interactive and customizable. The status is stored by projects. You can open/close views at any time. Most views can be saved as PDF/EMF/PowerPoint/Excel. The viewer is based on tabs, but you can float any view. Drag the tabs to split the viewer. Press F10 to float tabs. Mouse over a tab to show the project name and data name. Your active view (not active project) determines the default selected data.
11 View controller. Always use the View Controller to customize your view. Task tab: view-sensitive menus to customize your view. Variable tab: filter the variables (-omic data). Observation tab: filter the observations (-omic data). Filter tab: filter the observations (table data). Legend: show legend information. Filter status and customized filters are saved with projects, and the filters might be inherited when generating new data!
12 Details window. The Details window shows the details for selected variables or observations (depending on the context).
13 Studio concepts: interactivity. Array Studio is a fully interactive visualization package (a high dimensional version of SpotFire). Interactivity concepts: Filtering; Selection (click, drag or lasso); Hot track; Broadcasting; View customization (from task or legend); Exporting. Quick demo of the interactivity concepts.
14 Studio concepts: Selection vs Filtering. Selection shows details on demand for a particular selected row/variable or column/observation (and highlights the selected items in each view for that dataset). Use the Selection Menu to clear row and column selections. Filtering “filters” a particular dataset (and all accompanying views) using a set of criteria.
15 Server concepts: what does the server do? (This table only displays selected features.)
Feature: Description
Central storage: Data repository for all data objects in Array Studio
Sharing: Share your data (under access control) with your colleagues/clients
Search project: Access data (and all views) by filtering projects, variables and observations
Search variable profile: Given an ID (e.g. Probeset), find all the information from all/selected projects
Search annotation: Given a symbol/text, find all IDs that contain the symbol/text in the annotation, or all annotations for a given master ID
List Analysis: Given a list, find projects that show over-representation of significance
Search segment: Use multiple criteria to search CNV segments across projects/platforms
PValue region analysis: Given a chromosome region, find all p-values from all projects and display them in a region view
CNV region analysis: Given a chromosome region, find all CNV values (log2 ratio or allele differences) from all projects and display them in a region view
16 Statistical algorithms for Expression/Microarray data in Array Studio. Proc GLM: One-way ANOVA, Two-way ANOVA, Two-way Nested ANOVA, General Linear Model (fixed, mixed, or random models). Survival Model: proportional hazard regression (Proc tphreg). Logistic Regression: proportion data logistic regression (Proc logistic). Omicsoft’s implementation is independent of SAS/R. Array Command (Array Studio’s command module) can be run under Linux (Array Studio itself cannot). All the implementations are exact (i.e., not approximations).
18 The hands-on training will focus on: Usability (if you know how to use Microsoft Office, you will know how to use Array Studio); Interactivity (we will have lots of exercises to interact with different views); Performance (Array Studio is usually times faster than its competitors).
19 Keys for Today/Keys for Success with Array Studio. The goal is not to familiarize you with every command in Array Studio that you will ever use. Instead, we hope to give you a good start that you can build on yourself. The key to learning Array Studio, like any complicated software, is to practice with your own data. Don’t worry about “hurting” things. Clicking and trying out new options can only help you learn the software better. With that said, you can save your data, and always return to a previously saved version (using the Save As command). If you don’t see something, or can’t figure something out, don’t hesitate to ask. First, consult the Online Help and Frequently Asked Questions ( database, then ask a power user, or if they cannot help, or are not available, call Omicsoft Support ( ) or at. Web Chat and Remote Support are also available at using the Live Help button.
20 List of features to exercise. Linear modeling and result exploration; Linear modeling-2 Way ANOVA; Volcano Plot; Summarize Inference Report; Venn diagram; Interpretation; Hierarchical clustering; Molecular signatures analysis; Pattern and power; Find neighbors; Audit trail; Signal extraction; RMA extraction; Attach design table; Raw data visualization; Web details on demand; Observation table view; VariableView; Quality Control; PairwiseScatterView; PCA.
22 Workflow Window. The Workflow Window can be found on the left-hand side of the screen the first time the user starts Array Studio. Workflows are used as a starting place for first-time and novice users of Array Studio. Array Studio offers workflows for Microarray, Taqman, Exon, CNV, and Genotyping analysis. The Microarray workflow includes sections for Getting Started, Manage data, Preprocess, Quality Control, Statistical Inference, and Pattern Recognition. The workflows do not contain all the commands and analyses that can be run in Array Studio, but should give the user a good start.
23 Create a New Project. A project contains all the datasets, results, reports, views, lists, etc. in a single file (for “simple projects”, i.e., for microarray and Taqman data). It is perfectly fine to share/transfer the project file to another user, and the other user will be able to open the project immediately (Array Studio is required). When you create a new project, the project is present only in memory until you save it. Now create a new project by pushing the New Project button in the Microarray Workflow. Array Studio will prompt you to choose a type of project. For Microarray data, it is recommended to create a “simple project”. Click the Browse button, name the project, and select a save location. Click OK to continue. Note: Alternatively, to create a New Project, go to File Menu | New Project or click the New button in the toolbar.
24 Adding Microarray Data/Chip Normalization. Choose Add Microarray data from the workflow. Select Affymetrix .CEL files as the source. Add all 24 .CEL files. Push the Submit button. Array Studio provides fast RMA/GCRMA/MAS5 implementations. The result is benchmarked against R packages (max difference < 1e-7). It can easily process thousands of chips in a few hours. The 24 .CEL files take ~30 seconds, depending on the computer speed. No memory problems. Alternatively, data can be added by going to File Menu | Add Data | Add Microarray Data or by clicking the Add Data button on the toolbar.
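The benchmark criterion quoted above (max difference < 1e-7 against R packages) amounts to an element-wise comparison of two signal matrices. A minimal Python sketch of such a check follows; the function name and the sample values are hypothetical, and this is not Omicsoft's actual benchmark code.

```python
def max_abs_difference(a, b):
    """Largest element-wise absolute difference between two equally shaped matrices."""
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("matrices must have the same shape")
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))


# Made-up log2 signals: rows are probesets, columns are chips.
studio_signal = [[7.12345678, 8.5], [6.25, 9.75]]
reference_signal = [[7.12345680, 8.5], [6.25, 9.75]]

diff = max_abs_difference(studio_signal, reference_signal)
within_tolerance = diff < 1e-7
```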
25 Attach design table. .CEL files generate the Y block (signal matrix). Array Studio automatically attaches the annotation block (A). The Design block still needs to be attached to the dataset. Array Studio prompts the user to attach the Design Table upon import of data. Click Yes to import the Design Table. Choose Tab delimited file and select dbpts.design.txt to attach the Design Table to the dataset. Rename MicroArrayData to DBPTS (right-click and choose Rename). If you choose No upon import, you can always attach the Design Table later by right-clicking on the Design node for your dataset (in the Solution Explorer) and choosing Import.
26 The Solution Explorer. Switch to the Solution Explorer by finding the tab for it at the bottom of the Workflow Window (or by going to View Menu | Show Solution Explorer). The Solution Explorer is used to organize all the data and views in your project, and allows you to keep multiple projects open at a time. Imported microarray/genotyping/Taqman data is organized in the -Omic data section. Generated results will usually be shown in the Table data section. Other important sections include the List section (for creating lists of genes, probesets, etc.), as well as a QC Section, Table Section, Inference Section, etc. (not shown). In Array Studio 3.6, most sections are just “folders” and can easily be changed, but the important thing to remember is that there is an -Omics section and a Tables section. For each Data, Table, Inference Report, etc., the Solution Explorer also maintains the views. Notice the Table view under DBPTS. These views can be closed and opened, and all settings are retained. Try closing the DBPTS\Table View now, then reopening it by double-clicking it in the Solution Explorer.
27 The TableView/View Controller. The TableView shows the microarray data, with the columns representing each chip, and the rows representing each probeset. The View Controller is found on the right-hand side of Array Studio. It is responsible for the customization of all views. Switch to the Variable Tab. The Variable Tab and Observation Tab are used for filtering of data. The Variable Tab uses the attached Gene Annotation for columns to filter, while the Observation Tab uses the attached Design Table for columns to filter. Type ^egr1$ into the Gene Symbol filter to filter the TableView for only the gene egr1 (the filter uses regular expressions). The Observation Tab can also be used to filter the data. Switch to it now, and filter treatment to control. Notice that the TableView is updated to reflect the filter. Note: right-clicking on treatment will offer the option of three different types of filters (radio, checkbox, and string). Clear the Observation tab filter by clicking the (All) radio box or selecting the Reset All Filters tab.
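The anchors in ^egr1$ are what restrict the filter to that one gene: ^ pins the match to the start of the symbol and $ to the end. The Python sketch below illustrates the difference with made-up probeset IDs and symbols. Case-insensitive matching is an assumption here (the slide types a lowercase pattern); Array Studio's actual matching rules may differ.

```python
import re


def filter_by_symbol(annotation, pattern):
    """Return probeset IDs whose gene symbol matches the regular expression."""
    regex = re.compile(pattern, re.IGNORECASE)  # assumption: case-insensitive filter
    return [pid for pid, symbol in annotation.items() if regex.search(symbol)]


# Hypothetical annotation: probeset ID -> gene symbol.
annotation = {"p1": "Egr1", "p2": "Egr2", "p3": "Negr1", "p4": "Egr1l"}

anchored = filter_by_symbol(annotation, "^egr1$")  # only the exact symbol
unanchored = filter_by_symbol(annotation, "egr1")  # any symbol containing "egr1"
```

Without the anchors the same pattern also matches symbols that merely contain the text, which is usually not what you want when looking up a single gene.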
28 Details Window. In Array Studio, all views are interactive. Selecting a column header in the TableView or a row header brings up details in the Details Window (found at the bottom of the screen), showing the Design Table information for the selected Observation (Chip) or the Gene Annotation for the selected variable (probeset). The Details Window allows the user to find out on-the-fly information about individual probesets, chips, etc.
29 Web Details. Web Details is used to provide users with on-demand web information about particular variables/probesets. Right-click on the selected probeset in the Details Window or main view window. This brings up a list of websites the user can choose to find out info about that probeset. Select Entrez and one of the gene identifiers. Internet Explorer should open containing the web details. Web Details allows easy access to Array Server (via Search Variable Profile and Search Variable Data, to be shown later). It also includes access to GeneGo and Ingenuity’s GeneView and Gene Neighborhood functionality.
30 VariableView. What is the variable view? Variable view is a highly customizable view designed for high dimensional data. It provides auto-trellis for each variable and shows the profile of each variable in its own pane. Why does Omicsoft think variable view is the most important feature of the software? It is unique. It addresses the needs of most biologists: looking at the gene profiles. It is highly optimized. It has many special features that other views do not have, e.g. confidence intervals.
31 VariableView. To add a new view to the DBPTS dataset, right-click on the DBPTS node of the Solution Explorer. Click Add View, then select VariableView from the ensuing window. (Alternatively, just choose Add View from the toolbar.) Scroll through all ~16000 charts, one for each gene. This view can be customized. Re-filter using the Variable Tab for ^egr1$ so that only one chart is showing. Using the Task Tab of the View Controller, customize this view: specify Title Columns to include Gene Symbol along with probeset; specify Profile column as Time; specify Split column as Treatment; specify Transformation as Exp2. Why does the X-Axis look strange? What are we looking at? The Column Type is wrong for time.
32 Column Type. The VariableView’s X-Axis appears to show the time on an integer scale. We’d rather it show each time point (1, 3, 6, 18 hrs) as individual factors. This can be changed by opening the Design Node of the Solution Explorer for the dataset, then double-clicking the Table view. Column properties can be edited by going to Table Menu | Columns | Column Properties (alternatively, right-click on the design column in the table view and choose Column Properties). Select the time column, then change Column Type to Factor.
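The effect of switching the time column from an integer type to a factor can be sketched in a few lines of Python: as numbers, the four time points sit at unequal distances along the axis, while as factor levels they become evenly spaced categories. This is only an illustration of the idea, not how Array Studio lays out its axes internally.

```python
time_points = [1, 3, 6, 18]  # hours, from the design table

# Numeric column: axis position is the value itself, so spacing is uneven.
numeric_positions = [float(t) for t in time_points]
numeric_gaps = [b - a for a, b in zip(numeric_positions, numeric_positions[1:])]

# Factor column: axis position is the index of the level, so spacing is even.
levels = sorted(set(time_points))
factor_positions = [levels.index(t) for t in time_points]
factor_gaps = [b - a for a, b in zip(factor_positions, factor_positions[1:])]
```

On the numeric scale the 6 to 18 hour gap dominates the axis, which is why the first three time points appear crowded together.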
33 VariableView. Now switch back to the VariableView. Notice the X axis is now correct. Now click the Show Summary Information button in the Task tab of the View Controller. On-the-fly p-value information is shown for time (profile column), treatment (split column), and the interaction of the two factors. This should not replace a formal analysis, but can be used as a way to quickly find out if a gene is significantly changing. Click the Change Profile Gallery button in the Task tab of the View Controller, and switch to a different view (choose Bar as the gallery type), then click the Show Error Bars button. Switch to the Legend tab of the View Controller to see the Legend for the chart. Any chart can be opened at any point in PowerPoint. Reset all Variable Tab filters now.
34 Variable view: other features. LASSO selection: right-click and drag. Control selection: for choosing multiple points. F10: for popping the view out (good for multiple screens). Open in Excel. Most of the features also apply to other plots.
35 PairwiseScatterView. PairwiseScatterView can be used for QC purposes, to compare biological/technical replicates. It shows a ScatterView comparing chip-to-chip (bottom left of the view), as well as the MA Plot for each chip comparison. Add a new view, PairwiseScatterView, using the same method used earlier for VariableView. Filter the group column in the Observation Tab to DBP.t18. The PairwiseScatterView is updated to show only the 3 chips belonging to the DBP treatment at timepoint 18. Notice that one chip, 22A, appears to correlate more poorly with the other chips. This is the first indication that this is an outlier chip.
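For readers unfamiliar with MA plots, the usual definition is sketched below in Python: for two chips with log2 signals, M is the per-probeset difference and A is the per-probeset average. The sketch assumes the signals are already on the log2 scale (as RMA output is) and uses made-up values; it does not reproduce how Array Studio draws the view.

```python
def ma_values(chip_x, chip_y):
    """Return (M, A) lists for two chips of log2 signals, one value per probeset."""
    m = [x - y for x, y in zip(chip_x, chip_y)]        # log2 ratio
    a = [(x + y) / 2 for x, y in zip(chip_x, chip_y)]  # mean log2 intensity
    return m, a


# Made-up log2 signals for three probesets on two replicate chips.
chip_1 = [8.0, 10.0, 6.5]
chip_2 = [8.0, 9.0, 7.5]

m, a = ma_values(chip_1, chip_2)
```

For well-behaved replicates the M values scatter tightly around zero across the whole range of A; a chip that disagrees with its replicates shows a wider or skewed cloud.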
36 Principal Component Analysis (2 components). Choose Principal Component Analysis from the Quality Control section of the microarray workflow. Alternatively, choose Microarray | QC | Principal Component Analysis from the menu. Make sure that Demonstration is selected as the project. Make sure DBPTS is selected as Data. Ensure that 2 components are generated. Ensure that group is selected for Group. Ensure that Calculate Hotelling T2 is selected. Click Submit.
37 Principal Component Analysis (2 components). A PCA with two components is generated. A legend is available using the Legend Tab. Coloring is automatic, based on the Group setting. Customize the chart using Change Symbol Properties. Change Labels to All, By to chip. The chart is updated, indicating that chip 22A appears to be an outlier. Select chip 22A. Notice the Details Window. The point should turn red. Click Exclude Selection in the Task tab of the View Controller. This re-runs the PCA, and creates a list, DBPTS.Observation23. This list will be used for further analysis, as it contains the 23 “good” chips.
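One common way to compute a Hotelling T2 statistic from PCA scores is to sum, for each observation, the squared score on each component divided by that component's score variance; observations with a large value lie far from the centre of the score plot. The Python sketch below uses that textbook form on made-up, mean-centred scores. Whether Array Studio computes its T2 exactly this way is an assumption.

```python
from statistics import variance


def hotelling_t2(scores):
    """T2 per observation: sum over components of score**2 / variance of that component."""
    n_components = len(scores[0])
    variances = [variance([row[k] for row in scores]) for k in range(n_components)]
    return [sum(row[k] ** 2 / variances[k] for k in range(n_components))
            for row in scores]


# Hypothetical scores on 2 components for 4 chips (each column has mean zero).
scores = [(-1.0, 1.0), (-1.0, -1.0), (-2.0, 0.0), (4.0, 0.0)]

t2 = hotelling_t2(scores)
outlier_index = max(range(len(t2)), key=t2.__getitem__)  # chip farthest from the centre
```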
38 Lists. What is a list in Array Studio? A flat list of probesets, chips, genes, etc. Lists can be re-used in other projects. Lists can be used to filter. Lists can be used when running analysis modules to limit the analysis. Variable Lists, Observation Lists, Row, Column, or General lists: Array Studio is smart and only shows context-specific lists.
39 Principal Component Analysis (3-D). Choose Principal Component Analysis from the Quality Control section of the microarray workflow. Alternatively, choose Microarray | QC | Principal Component Analysis from the menu. Make sure that Demonstration is selected as the project. Make sure DBPTS is selected as Data. Ensure that 3 components are generated. Ensure that group is selected for Group. Click Submit.
40 Principal Component Analysis (3-D). A fully interactive 3-D PCA is returned. It includes a trackball tool, a panning/zooming tool, and a selection tool for interacting with the graph. It functions the same as the 2-D plot (changing coloring, excluding selection, etc.).
41 Differential Expression/Two-Way ANOVA. Using the Workflow, select Two-Way ANOVA from the Statistical Inference section. Set Data to DBPTS. Ensure that all Variables are selected, but use the list DBPTS.Observation23 for Observations. The design of this experiment is 4 time points, with a treatment and control at each time point. Thus, contrasts should be generated for each time point, comparing the treatment (DBP) to control. To figure out the comparisons, read from the top to the bottom. For each time, Compare to control will create 4 comparisons. Other options include generating F-Test (time, treatment, time*treatment) p-values, generating LSMean data, appending LSMean data to the inference report, and generating estimate data. Click Submit to run the module.
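The "for each time, compare to control" setting yields one contrast per time point. The short Python sketch below enumerates them for this design; the label format follows the comparison names that appear elsewhere in these slides (for example "1 DBP vs Control"), while the variable names are hypothetical.

```python
time_points = [1, 3, 6, 18]          # hours, from the design table
treatment, control = "DBP", "Control"

# One treatment-versus-control contrast within each time point.
contrasts = [f"{t} {treatment} vs {control}" for t in time_points]
```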
42 General Linear Model. Demonstration of the General Linear Model module. Two-way ANOVA gives equivalent results; the General Linear Model provides much more power and flexibility.
43 Results of Statistical Inference. The Two-Way ANOVA generates a table called DBPTS.Tests in the Inference folder in the Tables section. This includes two generated views: Report and Volcano view. In addition, lists were generated for each comparison, using the alpha level (p-value cutoff) for each comparison. A 5th list is generated, with all the significant probesets in the Two-Way ANOVA. Note: Lists are generated using the adjusted p-value column, because a multiplicity adjustment was set in the Two-Way ANOVA window.
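To illustrate why lists built from adjusted p-values are shorter than lists built from raw p-values, here is a small Python sketch of the Benjamini-Hochberg adjustment. The slides do not say which multiplicity adjustment was selected, so Benjamini-Hochberg is an assumption chosen for illustration, and the p-values are made up.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])  # indices by ascending p
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * n / rank)
        adjusted[i] = running_min
    return adjusted


raw = [0.01, 0.04, 0.03, 0.20]  # made-up raw p-values for four probesets
adj = benjamini_hochberg(raw)

alpha = 0.05
significant_raw = sum(p < alpha for p in raw)
significant_adj = sum(p < alpha for p in adj)
```

With these values three probesets pass the raw cutoff but only one passes after adjustment, which is the same effect that makes the generated lists differ from a raw p-value filter.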
44 Volcano plots. Volcano plots give a nice overview of the modeling results. Array Studio automatically sets the layout of the plots to incorporate as much information as possible on one screen. For this particular data, a 2*2 layout is set (2 rows, 2 columns). All the plots are linked (both hot track and selection). A uniform scale could be more informative. Details on demand could be useful. Select a probeset in the top right corner of the 1 DBP vs Control plot and notice that the Details Window provides on-demand gene annotation info, including p-values, estimates, etc. If you do not see anything on the volcano plot, reset your filter.
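A volcano plot conventionally places the effect size on the x-axis and the negative log10 of the p-value on the y-axis, so strongly changed and highly significant probesets end up in the top corners. The Python sketch below computes such coordinates. It assumes the estimate is a log2 difference and that the view follows the usual convention; the slides do not spell out the axes.

```python
import math


def volcano_point(log2_estimate, p_value):
    """Return (x, y) for a volcano plot: effect size versus -log10(p)."""
    return log2_estimate, -math.log10(p_value)


# Made-up results: (log2 estimate, raw p-value) per probeset.
results = {"ps_1": (1.0, 0.001), "ps_2": (-2.0, 0.01), "ps_3": (0.1, 0.5)}
points = {pid: volcano_point(est, p) for pid, (est, p) in results.items()}
```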
45 Table reports. A volcano plot is one way to view the modeling results. Table view is another way (so is chromosome view). Usually a table with everything is too big to explore. Filtering is essential. To view the table reports, double-click the table view generated by the modeling process. Use Group By Mode to arrange the filters so all the raw p-values are grouped together (and likewise adjusted p-values, estimates, etc.). Create a list that contains probesets significant in all treatments: Filter 1 DBP vs. Control.RawPValue < 0.05; Filter 3 DBP vs. Control.RawPValue < 0.05; Filter 6 DBP vs. Control.RawPValue < 0.05. The final number should be 78 rows. Click Add Item, then Add List From Visible Rows, then choose List Source as Probe Set ID.
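The stacked filters above combine with a logical AND: a row stays visible only if it passes every raw p-value filter. The Python sketch below mimics that on a tiny made-up table, using the column names from the slide, and then builds a list from the visible rows. The row values and probeset IDs are hypothetical.

```python
P_CUTOFF = 0.05
COLUMNS = [
    "1 DBP vs. Control.RawPValue",
    "3 DBP vs. Control.RawPValue",
    "6 DBP vs. Control.RawPValue",
]

# Made-up inference report rows.
rows = [
    {"Probe Set ID": "ps_1", COLUMNS[0]: 0.01, COLUMNS[1]: 0.02, COLUMNS[2]: 0.03},
    {"Probe Set ID": "ps_2", COLUMNS[0]: 0.01, COLUMNS[1]: 0.20, COLUMNS[2]: 0.03},
    {"Probe Set ID": "ps_3", COLUMNS[0]: 0.04, COLUMNS[1]: 0.04, COLUMNS[2]: 0.049},
]

# A row is visible only if every filter passes.
visible_rows = [r for r in rows if all(r[c] < P_CUTOFF for c in COLUMNS)]

# Equivalent of "Add List From Visible Rows" with Probe Set ID as the source.
probeset_list = [r["Probe Set ID"] for r in visible_rows]
```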
46 Broadcasting. What if you’ve filtered one dataset, and want to look at the filtered results in other open tables or datasets? Options: create a list, then filter in that other dataset by that list; or broadcast the results to all the other open datasets. Cross-platform broadcasting uses Array Server to map to a “master ID” and then “broadcasts” to the other platforms. Use it when looking at multiple platforms (or species). Broadcast your results now using Current Filter -> Filter all Opened Views. Return to the previously created Variable View.
47 Venn diagram view. Generate a Venn diagram view: right-click on Solution Explorer | Data | DBPTS | Views and choose Add View. Choose VennDiagramView from the list. Select three of your lists from the Solution Explorer and drag and drop them into the view. Advanced features: change the title of the plot. The Venn diagram is also interactive. Hint: to compare more than 4 lists, you can use the Compare Lists feature. Remember, our 3 lists were generated with the adjusted p-values, so the number of probesets common to all three lists should not match the previously created filtered list.
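The regions of a three-list Venn diagram are plain set operations, which the following Python sketch spells out with made-up probeset IDs. The list names are hypothetical stand-ins for the significance lists generated by the Two-Way ANOVA.

```python
# Hypothetical significance lists for three comparisons.
list_1h = {"ps_1", "ps_2", "ps_3"}
list_3h = {"ps_2", "ps_3", "ps_4"}
list_6h = {"ps_3", "ps_5"}

in_all_three = list_1h & list_3h & list_6h      # centre of the diagram
only_1h = list_1h - list_3h - list_6h           # unique to the first list
in_1h_and_3h_only = (list_1h & list_3h) - list_6h
in_any = list_1h | list_3h | list_6h            # union of all regions
```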
48 Summarize Inference Report. Summarize Inference Report is used to count the number of variables meeting certain criteria. Go to Summarize Inference Report in the microarray workflow, under Statistical Inference. Alternatively, go to Microarray Menu | Inference | Summarize Inference Report. Select DBPTS.Tests, Variables all, and all 4 estimates. In the Options section, build the conditions. Build Raw Pvalue<0.05 for all conditions, but make one condition each for FC>2, FC>3, FC<-2, FC<-3. Make sure to name each condition. A table is generated, giving a count for each condition/estimate. Notice the interactivity of the table.
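The counting performed by this module can be sketched in Python as follows. The sketch assumes that the estimate is a log2 difference and that FC is a signed fold change (negative values meaning down-regulation, so FC<-2 reads as "down more than 2-fold"); both are assumptions about conventions the slides do not state, and the data are made up.

```python
def signed_fold_change(log2_estimate):
    """Convert a log2 estimate to a signed fold change (assumed convention)."""
    if log2_estimate >= 0:
        return 2 ** log2_estimate
    return -(2 ** -log2_estimate)


# Made-up (log2 estimate, raw p-value) pairs for one comparison.
results = [(1.5, 0.01), (2.0, 0.001), (-1.2, 0.02), (-2.0, 0.2), (0.5, 0.01)]

# Named conditions, each combined with Raw Pvalue < 0.05.
conditions = {
    "Up2": lambda fc: fc > 2,
    "Up3": lambda fc: fc > 3,
    "Down2": lambda fc: fc < -2,
    "Down3": lambda fc: fc < -3,
}

counts = {
    name: sum(1 for est, p in results
              if p < 0.05 and test(signed_fold_change(est)))
    for name, test in conditions.items()
}
```

Repeating the same count for each of the four estimates would give the condition-by-estimate table described on the slide.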
49 Hierarchical clustering. Select Hierarchical clustering from the Pattern Recognition section of the microarray workflow. Alternatively, choose Microarray Menu | Pattern | Hierarchical clustering. Make sure DBPTS is the data to be analyzed. Select 18 DBP vs control.Sig379 as the working variable set. Select DBPTS.Observation23 as the working observation set. Check Compute variable tree. Check Generate classic dendrogram view. Push the Submit button.
50 Dendrogram. Interacts with heatmap table view. Adjust thumbnail width; adjust thumbnail cell sizes; fit thumbnails into window; change color properties; select branches; select thumbnail blocks; change color bars; adjust heatmap cell sizes; specify annotation columns; Select Gene Symbol Star.
51 Classic Dendrogram. Not as interactive as the other view, but provides a “flat structure”. Similar options for changing colors and labels. Hint: right-clicking in the legend allows changing of colors (applies to all views).
52 Molecular signatures analysis. Uses the molecular signatures database to find enriched pathways and functions. Choose Microarray Menu | Annotation | Molecular Signatures. Choose the 18 DBP vs Control list. Choose Rat as the organism, and Map by annotation Column Gene Symbol. Click Submit.
53 Molecular signatures analysis. Returns a table of GeneSets with p-values. Sort by raw or adjusted p-value. Click on links for regulated genesets. Alternatively, use Microarray Menu | Geneset Enrichment Analysis for the “Classical” version of GSEA.
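Enrichment p-values for a gene list against a gene set are often computed with a hypergeometric (one-sided Fisher) test: given how many genes are on the platform, in the gene set, and in the list, how surprising is the observed overlap? The Python sketch below shows that calculation. The slides do not state which test Array Studio uses, so treat the hypergeometric test as an assumed, typical choice; the numbers are made up.

```python
from math import comb


def hypergeom_upper_tail(overlap, set_size, list_size, universe_size):
    """P(X >= overlap) when drawing list_size genes from universe_size,
    of which set_size belong to the gene set."""
    upper = min(set_size, list_size)
    total = comb(universe_size, list_size)
    favourable = sum(
        comb(set_size, i) * comb(universe_size - set_size, list_size - i)
        for i in range(overlap, upper + 1)
    )
    return favourable / total


# Made-up example: 10 genes measured, 3 in the gene set,
# 3 in the significant list, all 3 of them in the gene set.
p_enriched = hypergeom_upper_tail(overlap=3, set_size=3, list_size=3, universe_size=10)
p_trivial = hypergeom_upper_tail(overlap=0, set_size=3, list_size=3, universe_size=10)
```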
54 Find neighbors. Select Find Neighbors from the Pattern Recognition section of the Microarray workflow. Alternatively, choose Microarray | Pattern | Find Neighbors. Use DBPTS.Observation23 as the working observation set. Find neighbors for _at (if this probeset is the first selected probeset it will be automatically entered). Change Fixed neighbor number to 20. Customize the neighbor view: reset the filter if necessary; hide X-axis labels; sort the heatmap columns; add sample color bars; change the Y-axis label to gene symbol; add mean/median values.
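Conceptually, finding neighbors means ranking all other probesets by how similar their expression profile is to the target and keeping a fixed number of the best. The Python sketch below does this with Pearson correlation on made-up profiles. The similarity measure is an assumption (the slides do not say which one Array Studio uses), and the probeset IDs are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equally long profiles (must not be constant)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def find_neighbors(profiles, target, k):
    """Return the k probeset IDs most correlated with the target profile."""
    scored = [(pearson(profiles[target], profile), pid)
              for pid, profile in profiles.items() if pid != target]
    scored.sort(reverse=True)  # highest correlation first
    return [pid for _, pid in scored[:k]]


# Made-up profiles across four chips.
profiles = {
    "ps_target": [1.0, 2.0, 3.0, 4.0],
    "ps_a": [2.0, 4.0, 6.0, 8.5],   # nearly proportional to the target
    "ps_b": [4.0, 3.0, 2.0, 1.0],   # perfectly anti-correlated
    "ps_c": [1.0, 3.0, 2.0, 4.0],   # partially correlated
}

neighbors = find_neighbors(profiles, "ps_target", k=2)
```

The k argument plays the role of the Fixed neighbor number setting (20 in the exercise).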
55 GeneGo/Ingenuity. Requires access to both systems. Right-clicking on a probeset provides GeneView access to Ingenuity and GeneGo. The Microarray menu provides access to GeneGo MetaCore: Upload Data (uploads data with fold changes and p-values for analysis in MetaCore). The Microarray menu provides access to Ingenuity: Search, View Canonical Pathway, Create New Pathway, Upload Data (uploads data with fold changes and p-values for analysis in Ingenuity).
56 Audit trail. Launched from File Menu | Audit trail. The audit trail is owned by a project, not by a specific data entry. Source, on the other hand, is owned by a specific data entry and describes how the data was generated. OmicScript can be used to re-generate the results.
58 Array Server. Array Server contains ~2000 fully analyzed (including p-values, fold changes, etc.) projects from GEO and Array Express. (Note: will soon be 5000 projects, almost all public Affymetrix projects.) Publishing BMS data to the server allows integration of internal data with public data. Useful for sharing data between colleagues, using the same views from Array Studio. Integration between the Local Analysis tab and Server Analysis tab in Array Studio.
62 List Analysis. Take a list from your project, and find other projects on the server that have similar results (overrepresentation of that list of genes).
63 Interaction between Local Analysis and Server Explorer. Right-clicking on a list allows: Search Profile; Upload to Server (saves the list to the server for quick access at a later point); Server List Analysis (demonstrated previously).
64 Interaction between Local Analysis and Server Explorer. Right-clicking on a probeset in a table view or details view allows: Search Variable Profile; Search Variable Data. Right-clicking on Solution in the Solution Explorer allows adding a project from the server directly to the Solution Explorer for further analysis.
65 Omicsoft’s Philosophy If there is something that we do not provide in Array Studio, please ask. We are always adding new features, and it is based on customer feedback, so if there is something you’d like to see, or something that you’d like to see done better, send us a message or give us a call. We always appreciate your feedback.
66 Resources. Help Menu | Tutorials: we highly recommend going through the Microarray Tutorial yourself. Individual analysis modules have help buttons. Frequently Asked Questions section of the. Omicsoft Support is always available and willing to help you, including remote support (i.e., screen sharing) ( OMIC) or