Presentation on theme: "Learning Excel for Data Analysis"— Presentation transcript:
1 Learning Excel for Data Analysis: Sessions 5 and 6. Dr. Chaitali Basu Mukherji
2 Data Analysis: Data analysis in Excel is performed in multiple ways using the following groups of the Data tab: Get Data (to connect to an external data set); Sort and Filter; Data Tools (Data Validation, Duplicate Removal, Consolidation, Data Tables and What-If Analysis); Outline (Group and Ungroup, Subtotals); Analysis (Data Analysis, Solver).
3 Group and Ungroup: Group and Ungroup are in the Outline group of the Data tab. Group allows you to collapse a group of rows or columns. Ungroup reverts the action. For both functions, an outline with a + or - sign will appear.
4 Subtotals: Subtotals is used in a sorted list. Sort the list on the field for which you want subtotals inserted. Click the Subtotal button in the Outline group on the Data tab. The Subtotal dialog box appears, where you specify the options for the subtotals. Select the field for which the subtotals are to be calculated in the At Each Change In drop-down list. Specify the type of totals you want to insert in the Use Function drop-down list. Select the check boxes for the field(s) you want to total in the Add Subtotal To list box. Click OK. Excel adds the subtotals to the worksheet. When you use the Subtotals command, Excel outlines the data at the same time that it adds the rows with the departmental salary totals and the grand total. This means that you can collapse the data list down to just its departmental subtotal rows, or even just the grand total row, simply by collapsing the outline down to the second or first level. In a large list, you may want to insert a page break every time the data changes in the field on which the list is being subtotaled. To do this, select the Page Break between Groups check box in the Subtotal dialog box before you click OK to subtotal the list. Excel does not allow you to subtotal a list formatted as a table. You must first convert your table into a normal range of cells: click a cell in the table, click the Table Tools Design tab, click the Convert to Range button in the Tools group, and then click Yes. Excel removes the filter buttons from the columns at the top of the list while still retaining the original table formatting.
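The sort-then-subtotal logic described on this slide can be sketched in Python. This is only an illustration of the idea, not how Excel implements the command; the departments, names and salaries are made-up sample data, and `subtotals` is a hypothetical helper.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical sample list: (department, employee, salary).
rows = [
    ("Sales", "Asha", 52000),
    ("Admin", "Ravi", 41000),
    ("Sales", "Meera", 48000),
    ("Admin", "John", 39000),
]

def subtotals(records):
    """Return (label, total) pairs: one subtotal per department, then a grand total."""
    # Sort on the subtotal field first, as the Subtotal command expects.
    ordered = sorted(records, key=itemgetter(0))
    result = []
    grand_total = 0
    # groupby starts a new group at each change in the key field.
    for dept, group in groupby(ordered, key=itemgetter(0)):
        total = sum(salary for _, _, salary in group)
        result.append((f"{dept} Total", total))
        grand_total += total
    result.append(("Grand Total", grand_total))
    return result

summary = subtotals(rows)
for label, total in summary:
    print(label, total)
```

Skipping the sort would split a department into several groups, which is why the slide asks for a sorted list before subtotals are inserted.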
5 Solver: Solver can handle problems that involve many variable cells and can help find combinations of variables that maximize or minimize a target cell, subject to multiple constraints (conditions that must be met for the solution to be valid). Problem statement: You are planning an advertising campaign for a new product with a total print advertising budget of Rs 12,000,000 and want to expose your ads at least 800 million times to potential readers through six publications. Your job is to reach the readership target at the lowest possible cost with the following additional constraints: at least six advertisements should run in each publication; no more than a third of the advertising spend may go to any one publication; the cost of placing ads in Pub3 and Pub4 must not exceed Rs 7,500,000.
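One way to write this problem statement as a linear programme is sketched below. The symbols are assumptions introduced for illustration, since the slide gives no per-publication data: x_i is the number of ads placed in publication i, c_i is the cost per ad in Rs, and r_i is the number of readers reached per ad in millions. The one-third rule is read here as one third of the total amount actually spent, and the Pub3/Pub4 limit as applying to their combined cost.

```latex
\begin{aligned}
\min_{x_1,\dots,x_6}\quad & \sum_{i=1}^{6} c_i x_i \\
\text{subject to}\quad
& \sum_{i=1}^{6} c_i x_i \le 12{,}000{,}000 \\
& \sum_{i=1}^{6} r_i x_i \ge 800 \\
& x_i \ge 6, \qquad i = 1,\dots,6 \\
& c_i x_i \le \tfrac{1}{3} \sum_{j=1}^{6} c_j x_j, \qquad i = 1,\dots,6 \\
& c_3 x_3 + c_4 x_4 \le 7{,}500{,}000
\end{aligned}
```

In Solver terms, the objective is the target cell, the x_i are the variable cells, and each inequality is entered as one constraint.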
7 Solver using iteration: Let us solve a pair of nonlinear equations using Solver: F(x,y) = x^2 + y + 3 = 0 and G(x,y) = 2*x^2 + y^3 + 5 = 0. Solver works iteratively from an estimate, using 100 iterations here, to come up with a close result.
8 Analysis ToolPak: To develop complex statistical or engineering analyses, you can save steps and time by using the Analysis ToolPak. You provide the data and parameters for each analysis, and the tool uses the appropriate statistical or engineering macro functions to calculate and display the results in an output table. Some tools generate charts along with output tables.
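Why only a "close result" can be expected is worth spelling out. Substituting y = -3 - x^2 from F into G and writing v = 3 + x^2 gives v^3 - 2v + 1 = 0, whose real roots are 1 and (-1 ± √5)/2. Each of these would need x^2 = v - 3 to be negative, so the pair has no exact real solution, and an iterative search can only drive F and G close to zero. The sketch below minimises F^2 + G^2 with a plain gradient descent over 100 iterations. That method is swapped in for illustration and is not the one Solver itself uses; the starting point (1, 1) and the helper names are assumptions.

```python
def residuals(x, y):
    """The two equations from the slide, written as F(x, y) and G(x, y)."""
    f = x**2 + y + 3
    g = 2 * x**2 + y**3 + 5
    return f, g

def objective(x, y):
    """Sum of squared residuals; it would be zero only at an exact solution."""
    f, g = residuals(x, y)
    return f * f + g * g

def gradient(x, y):
    """Analytic gradient of the objective with respect to x and y."""
    f, g = residuals(x, y)
    dx = 2 * f * (2 * x) + 2 * g * (4 * x)
    dy = 2 * f + 2 * g * (3 * y**2)
    return dx, dy

def minimise(x, y, iterations=100):
    """Gradient descent with a backtracking step size (hypothetical helper)."""
    for _ in range(iterations):
        dx, dy = gradient(x, y)
        current = objective(x, y)
        step = 1.0
        # Halve the step until the objective actually decreases.
        while step > 1e-12 and objective(x - step * dx, y - step * dy) >= current:
            step /= 2
        if step <= 1e-12:
            break  # no improving step found; stop early
        x, y = x - step * dx, y - step * dy
    return x, y

start = (1.0, 1.0)  # assumed initial estimate
x, y = minimise(*start)
print(x, y, objective(x, y))
```

The printed objective is lower than at the starting point but stays above zero, which matches the slide's wording of a close rather than exact result.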
9 Anova Analysis: The Anova analysis tools provide different types of variance analysis: Single Factor, Two-Factor with Replication and Two-Factor without Replication. The tool to use depends on the number of factors and the number of samples that you have from the populations that you want to test. Single Factor Anova: This tool performs a simple analysis of variance on data for two or more samples. It provides a test of whether each sample is drawn from the same underlying probability distribution. If there are only two samples, the function TTEST can be used. With more than two samples, the Single Factor Anova model has to be called. Two-Factor with Replication: This analysis is useful when data is classified along two different dimensions. Example: we measure the height of plants that are given 3 different brands of fertilizer and kept at 2 different temperatures. For each of the six possible fertilizer and temperature pairs, we have the same number of repeated observations of plant height. Two-Factor without Replication: This analysis is useful when data is classified along two different dimensions but there is only a single observation for each pair.
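The arithmetic behind the Single Factor Anova tool can be sketched as follows. The three samples are made-up numbers, `one_way_anova` is a hypothetical helper, and only the F statistic and degrees of freedom are computed; the p-value and critical F that the tool also reports are left out.

```python
def one_way_anova(groups):
    """Return (F statistic, df_between, df_within) for a list of samples."""
    k = len(groups)                      # number of samples
    n = sum(len(g) for g in groups)      # total number of observations
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    # Variation of the sample means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of the observations around their own sample mean.
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    df_between = k - 1
    df_within = n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

samples = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]  # hypothetical data
f_stat, df_b, df_w = one_way_anova(samples)
print(f_stat, df_b, df_w)  # prints 3.0 2 6
```

A large F means the sample means differ by more than the spread within each sample would suggest, which is evidence against the samples sharing one underlying distribution.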
10 What is a Pivot Table? A Pivot Table is used to produce meaningful information from a table of data. For example, from a table of data that has names, addresses, ages, occupations, phone numbers and PIN codes, with a Pivot Table we can easily and quickly find out: How many salesmen work in each region? What is the net car sale of each region? How do sales compare across products? What is the total sales for the company? How many customers do we sell to in each region? Let's see how.
11 Advantages of Pivot Tables: Pivot Tables can generate and extract meaningful information from a large table of data within a matter of minutes. They use far less of your PC's memory than obtaining the same results with Excel's built-in functions. They provide new information by simple drag-and-drop (pivot). Information can be updated each time we open the workbook or by clicking Refresh.
12 Example for Pivot: Step 1: Select the data range from which to make the pivot table. Step 2: Go to the Insert tab and click the pivot table icon to select the Pivot Table option. Step 3: Excel displays a pivot table wizard where you specify the pivot table target location. Step 4: Make your first pivot report by dragging and dropping fields in the pivot table grid area or by using the "Pivot table panel". The pivot report is divided into header and body sections. You can drag and drop the fields you want in each area. The body itself contains three parts: Rows, Columns and Cells. You can use any fields in these areas too.
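What the Rows, Columns and Cells areas compute can be sketched in a few lines of Python. The sales records are hypothetical and `pivot_sum` is an illustrative helper, not an Excel feature: region plays the row field, product the column field, and the summed amount fills the cells.

```python
from collections import defaultdict

# Hypothetical sales records: (region, product, amount).
records = [
    ("North", "Car", 120),
    ("North", "Truck", 80),
    ("South", "Car", 200),
    ("North", "Car", 30),
    ("South", "Truck", 50),
]

def pivot_sum(rows):
    """Sum amounts with region as the row field and product as the column field."""
    table = defaultdict(lambda: defaultdict(int))
    for region, product, amount in rows:
        table[region][product] += amount
    # Convert to plain dicts so the result compares and prints cleanly.
    return {region: dict(cols) for region, cols in table.items()}

pivot = pivot_sum(records)
for region in sorted(pivot):
    print(region, pivot[region])
```

Swapping the roles of region and product in the loop is the code equivalent of dragging a field from the row area to the column area.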
13 Tips on Pivot Tables: Formatting is easy for pivot tables. You can easily change the pivot table summary formulas by right-clicking the pivot table and selecting "Summarize data by". You can apply conditional formatting to pivot tables, although you must be careful because pivot tables change in size depending on the data. If the original data from which the pivot table is constructed changes, right-click the pivot table and select the "Refresh Data" option. To drill down on a particular summary value, double-click it. Excel will create a new sheet with the data corresponding to that pivot report value. (This is extremely useful.)
14 What are Pivot Charts? Charts created on Pivot Tables are called Pivot Charts. They allow us to create professional interactive charts that would otherwise require complex VBA coding. The basic information needed to use the Pivot Wizard is: where our data is stored (e.g. a range in the same workbook, a database, or another workbook); how our data is set up and whether we also want a Pivot Chart; which column of data goes into which field, i.e. the optional Page field, Row field and Column field and the mandatory Data field; and where we want to put the Pivot Table (e.g. a new worksheet or an existing one). Making a pivot chart from a pivot table is very simple. Just click the pivot chart icon on the toolbar or in the Options ribbon area and follow the wizard.
15 What is Charting in Excel? Charts are used to display series of numeric data in a graphical format, to make it easier to understand large quantities of data and the relationship between different series of data. To create a chart, you start by entering numeric data, which you can then plot by selecting the chart type that you want. Excel supports many types of charts (such as a column chart or a pie chart) and their subtypes (such as a stacked column chart or a pie in 3-D chart) to help you display data in ways that are meaningful to your audience. You can create a combination chart by using more than one chart type. Some chart types (Column or Bar) can be created by arranging data in rows and columns, while others (Pie and Bubble) require a special arrangement of data. Chart templates can be saved as .crtx files and used like any other template.
16 Elements of a Chart: Chart Area, Plot Area, Data Points, Axes, Legends, Titles, Labels.
17 Excel Chart Types: Excel provides the following types of chart. A typical use of each chart type is mentioned below. Column charts are useful for showing data changes over a period of time and for illustrating comparisons among items. Line charts are useful for displaying continuous data over time, set against a common scale, and for showing trends in data at equal intervals. Pie charts are useful for showing the size of individual items in proportion to the sum of the items. Bar charts are useful for comparisons among individual items.
18 Cont.: XY (scatter) charts are useful for displaying and comparing numeric values, such as scientific, statistical, and engineering data. Area charts are useful for emphasizing the magnitude of change over time and for drawing attention to the total value across a trend. Stock charts are useful for illustrating the fluctuation of stock prices, or of daily or annual temperatures. Surface charts are useful for finding optimum combinations between two sets of data, similar to a topographic map. Doughnut charts are useful for showing the relationship of parts to a whole, and can contain more than one data series. Bubble charts are useful for comparing the sizes of parts that make up the data set. Radar charts compare the aggregate values of several data series, as opposed to Pie charts, which have only one data series.
19 Creating a Chart: You select a chart type by choosing an option from the Insert tab's Charts group. After you choose a chart type, such as column, line, or bar, you choose a chart sub-type. For example, after you choose Column Chart, you can choose to have your chart represented as a two-dimensional chart, a three-dimensional chart, a cylinder chart, a cone chart, or a pyramid chart. There are further sub-types within each of these categories. As you roll your mouse pointer over each option, Excel supplies a brief description of each chart sub-type. In Microsoft Excel, you can represent numbers in a chart. On the Insert tab, you can choose from a variety of chart types, including column, line, pie, bar, area, and scatter. The basic procedure for creating a chart is the same no matter what type of chart you choose. As you change your data, your chart will automatically update.
21 Sub Types of Column Chart: Clustered column in 3-D: These charts compare values across categories. They display 2-D data values using a 3-D perspective. A third value axis (depth axis) is not used. Stacked column in 3-D: Stacked column charts show the relationship of individual items to the whole, comparing the contribution of each value to a total across categories. 3-D column: 3-D column charts use three axes that you can modify (a horizontal axis, a vertical axis, and a depth axis), and they compare data points along the horizontal and the depth axes. Cylinder, cone, and pyramid: Cylinder, cone, and pyramid charts are available for all the above types, with only the shape being a cylinder, cone or pyramid instead of a rectangle.
22 Applying a Chart Layout: Context tabs are tabs that appear only when you need them. There are three chart context tabs, collectively called Chart Tools: Design, Layout, and Format. The tabs become available when you create a new chart or when you click on a chart. You can use these tabs to customize your chart. You can determine what your chart displays by choosing a layout. The layout you choose determines whether your chart displays a title, where the title displays, whether your chart has a legend, where the legend displays, whether the chart has axis labels, and so on. Excel provides several layouts from which you can choose.
28 Communicating through Data: Communicating through data is most effective if we understand the basic rules. There are 7 common relationships in quantitative business data. Typical questions that arise on number presentation are: Compared to what? At what instant? In which sequence? Relative to what other? How much is the deviation? What kind of distribution does it follow? Is there any special correlation between them?
29 Time-Series Relationships: This is the most common relationship in quantitative business data. When quantitative values are expressed as a series of measures taken across equal intervals of time, this relationship is called a time series. Studies indicate that approximately 75% of all business graphs display time series. Time can be divided into intervals of varying duration, including years, quarters, months, weeks, days, and hours. Time series reveal trends and patterns that we must be aware of and understand to make informed decisions.
30 Ranking Relationships: It is often meaningful in business to see things ranked, such as the performance of salespeople or the expenses of departments. When quantitative values are sequenced by size, from large to small or vice versa, this relationship is called a ranking. This not only reveals their sequence, but also makes it much easier to compare values by placing those that are most similar near one another.
31 Part-to-Whole Relationships: It is often useful to see how something is divided into parts, and the percentage relationship of each part to the whole. When quantitative values are displayed to reveal the portion that each value represents of some whole, this is called a part-to-whole relationship. Some typical examples are how a market is divided up between competitors, or how expenses are divided between regions, as the slide's chart shows.
32 Deviation Relationships: When quantitative values are displayed to feature how one or more sets of values differ from some reference set of values, this is called a deviation relationship. The most common example in business is one that shows how a set of actual values (such as expenses) deviates from a predefined target (such as a budget).
33 Distribution Relationships: When we show how a set of quantitative values is spread across its entire range, this relationship is called a distribution. We can often learn a great deal by examining the distribution of a set of values, especially the shape of that distribution, which reveals what is typical, whether it is skewed in one direction or the other, and whether there are gaps or concentrations. The chart on this slide shows a distribution of values that is fairly symmetrical, approaching what is called a normal or bell-shaped curve.
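The counting behind a distribution chart can be sketched as follows. The order values and the bin width of 10 are assumptions chosen for illustration, the values are taken to be whole numbers, and `histogram` is a hypothetical helper.

```python
from collections import Counter

def histogram(values, bin_width):
    """Count values per bin; each bin is labelled by its lower edge."""
    counts = Counter((v // bin_width) * bin_width for v in values)
    return dict(sorted(counts.items()))

# Hypothetical whole-number order values.
orders = [12, 18, 21, 25, 27, 29, 33, 35, 41, 58]
bins = histogram(orders, 10)
for lower, count in bins.items():
    # One asterisk per value gives a rough text version of the chart.
    print(f"{lower}-{lower + 9}: {'*' * count}")
```

Reading down the rows of asterisks shows where values concentrate and whether the shape leans to one side, which is the information a histogram conveys.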
34 Correlation Relationships: When pairs of quantitative values, each measuring something different about an entity (for example a person, department, or product), are displayed to reveal whether there is a significant relationship between them (for instance, as one goes up the other goes up as well, or as one goes up the other goes down), this is called a correlation. Understanding correlations between quantitative variables can help us predict, take advantage of, or avoid particular behaviors. The slide's chart shows the correlation between employees' heights in inches (y axis) and their salaries in dollars (x axis).
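The strength of such a relationship is commonly summarised by the Pearson correlation coefficient, the quantity that Excel's CORREL worksheet function returns. A minimal sketch of the calculation follows; the height and salary figures are invented for illustration and are not the data behind the slide's chart.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Co-movement of the two variables around their means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical paired measurements, one pair per employee.
salary = [42000, 39000, 51000, 47000, 55000]
height = [62, 65, 68, 70, 72]
r = pearson_r(salary, height)
print(round(r, 3))
```

A value near +1 means the two measures rise together, a value near -1 means one falls as the other rises, and a value near 0 means no linear relationship.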
35 Nominal Comparison Relationships: This chart shows a nominal comparison relationship, where there is no particular relationship between the values. The four geographical regions do not relate to each other in any particular order. The chart does provide a means to compare the regional values, but nothing more. It is always useful, whenever you prepare a graph that displays nothing but a nominal comparison, to ask yourself whether another relationship could be featured that would make the graph more meaningful. In this case, simply arranging the regions in order of their quantitative values could produce a ranking relationship. Often, discrete items in a categorical variable, like these geographical regions, need to be arranged in a particular order because people expect to see them arranged in that way.
36 Tip for Selecting the Right Chart Type: What to represent and chart type to use. Nominal Comparison: Bar, Point. Time Series: Line. Ranking. Part to Whole: Pie, Stacked Bar, Bubble. Deviation. Frequency Distribution: Histogram, Frequency Plot. Correlation: Scatter Plot with trend line.
37 Best Practices of Charting: Determine your message and identify your data. Format graphs to focus on the message, removing unnecessary distractions. Decide whether a table, a graph, or both are needed to communicate your message most effectively. Determine the best place in the chart area to display each variable. Take special care with legend placement.
38 Tips for Enhancing Chart Performance: Use tables to hold the data. Use named ranges and named formulas. Use Pivot Tables. Sort your data. Use Manual Calculation mode. Use non-volatile formulas: volatile formulas are recalculated whenever there is a change in the workbook, and examples of volatile functions are RAND, NOW, TODAY and OFFSET. Keep formulas in a separate sheet. Write better formulas.
39 Tips for a Good Report: Restrict the work area to relevant columns and rows only. Lock formula cells and protect the worksheet. Freeze panes so that your boss knows what she is reading. Hide unnecessary and calculation sheets. Hide rows and columns not used in the report. Include cell comments and help messages. Use consistent colors and schemes. Name and color worksheet tabs appropriately. Before closing the workbook, select cell A1 on the correct sheet.