Presentation on theme: "Learning Excel for Data Analysis"— Presentation transcript:
1 Learning Excel for Data Analysis: Sessions 3 and 4
Dr. Chaitali Basu Mukherji
2 Data Analysis
Data analysis in Excel is performed in several ways, using the following sections of the Data tab:
- Get Data: connect to an external data set
- Sort and Filter
- Data Tools: Data Validation, Duplicate Removal, Consolidation, Data Tables and What-If Analysis
- Outline: Group and Ungroup, Subtotals
- Analysis: Data Analysis, Solver
3 Importing Data in Excel
Data can be imported into Excel from text files, HTML files and Access databases.
The main benefits of connecting to external data are:
- you can periodically analyze the data without repeatedly copying it, which is both time-consuming and error-prone
- you can automatically refresh (or update) your Excel workbooks from the original data source whenever that source is updated with new information
4 From Access Database
Method 1: copy data from a datasheet view and paste it into an Excel worksheet.
1. Start Access, and open the table, query, or form that contains the records you want to copy.
2. On the Home tab, click View, and then click Datasheet View.
3. Select the records that you want to copy. To select specific columns, drag across adjacent columns.
4. On the Home tab, in the Clipboard group, click Copy.
5. Start Excel, and then open the worksheet where you want to paste the data.
6. Go to cell A1, and then on the Home tab, in the Clipboard group, click Paste.
Note: To ensure that the copied records do not replace existing records, make sure that the worksheet has no data below or to the right of the cell that you click.
5 From Access Database
Method 2: bring Access data that can be refreshed into Excel.
Create a connection to the Access database, store it in an Office Data Connection file (.odc), and then retrieve all of the data from a table or query. The main benefit of connecting to Access data instead of importing it is that you can periodically analyze this data in Excel without repeatedly copying or exporting it from Access. After you connect to the data, you can also automatically refresh (or update) your Excel workbooks from the original Access database whenever the database is updated with new information.
1. Click the cell where you want to put the data from the Access database.
2. On the Data tab, in the Get External Data group, click From Access.
3. Locate and double-click the Access database that you want to import.
4. In the Select Table dialog box, click the table or query that you want to import, and then click OK.
5. In the Import Data dialog box, under Select how you want to view this data in your workbook, select Table, PivotTable report, or PivotChart and PivotTable report.
6. Optionally, click Properties to set refresh, formatting, and layout options for the imported data, and then click OK.
7. Under Where do you want to put the data?, do one of the following:
   - To return the data to the location that you selected, click Existing worksheet.
   - To return the data to the upper-left corner of a new worksheet, click New worksheet.
8. Click OK. Excel puts the external data range in the location that you specify.
6 Trust Center and Trusted Locations
Connection files reside in trusted locations, which are managed in the Trust Center. Typical Microsoft trusted locations are:
- drive\Program Files\Microsoft Office\Templates
- drive\Program Files\Microsoft Office\Office12\Startup
Creating a trusted location
1. Click the Microsoft Office Button, and then click Excel Options.
2. Click Trust Center, click Trust Center Settings, and then click Trusted Locations.
3. To create a trusted location that is not local to your computer, select the Allow trusted locations on my network (not recommended) check box.
4. Click Add new location.
5. In the Path box, type the name of the folder that you want to use as a trusted location, or click Browse to locate the folder.
6. To include subfolders as trusted locations, select the Subfolders of this location are also trusted check box.
7. In the Description box, describe the purpose of the trusted location.
8. Click OK.
7 From Text
Step 1
- Original data type: If items in the text file are separated by tabs, colons, semicolons, spaces, or other characters, select Delimited. If all of the items in each column are the same length, select Fixed width.
- Start import at row: Type or select a row number to specify the first row of the data that you want to import.
- File origin: Select the character set that is used in the text file. In most cases, you can leave this setting at its default. If you know that the text file was created with a different character set than the one used on your computer, change this setting to match that character set.
- Preview of file: This box displays the text as it will appear when it is separated into columns on the worksheet.
Step 2 (Delimited)
- Delimiters: Select the character that separates values in your text file. If the character is not listed, select the Other check box, and then type the character.
- Treat consecutive delimiters as one: Select this check box if your data contains a delimiter of more than one character between data fields, or if your data contains multiple custom delimiters.
- Text qualifier: Select the character that encloses values in your text file. When Excel encounters the text qualifier character, all of the text that follows that character and precedes the next occurrence of that character is imported as one value, even if the text contains a delimiter character.
  - If the delimiter is a comma (,) and the text qualifier is a quotation mark ("), "Dallas, Texas" is imported into one cell as Dallas, Texas.
  - If no character or the apostrophe (') is specified as the text qualifier, "Dallas, Texas" is imported into two adjacent cells as "Dallas and Texas".
  - If the delimiter character occurs between text qualifiers, Excel omits the qualifiers in the imported value.
  - If no delimiter character occurs between text qualifiers, Excel includes the qualifier character in the imported value. Hence, "Dallas Texas" (using the quotation mark text qualifier) is imported into one cell as "Dallas Texas".
Step 2 (Fixed width data)
- Data preview: Set field widths in this section. Click the preview window to set a column break, which is represented by a vertical line. Double-click a column break to remove it, or drag a column break to move it.
Step 3
- Click the Advanced button to do one or more of the following:
  - Specify the type of decimal and thousands separators that are used in the text file. When the data is imported into Excel, the separators will match those that are specified for your location in Regional and Language Options or Regional Settings (Windows Control Panel).
  - Specify that one or more numeric values may contain a trailing minus sign.
- Column data format: Click the data format of the column that is selected in the Data preview section. If you do not want to import the selected column, click Do not import column (skip). After you select a data format option for the selected column, the column heading under Data preview displays the format. If you select Date, select a date format in the Date box.
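The delimiter and text qualifier rules of Step 2 can be tried outside Excel. The following is a minimal sketch that uses Python's standard csv module rather than Excel's importer; the helper name split_line and the sample string are assumptions made for this illustration.

```python
import csv
import io


def split_line(text, delimiter=",", qualifier='"'):
    """Split one line of text into fields (hypothetical helper, not an Excel API)."""
    if qualifier is None:
        # No text qualifier: quotation marks are treated as ordinary characters.
        reader = csv.reader(io.StringIO(text), delimiter=delimiter,
                            quoting=csv.QUOTE_NONE)
    else:
        # The qualifier protects any delimiter that appears between a pair of qualifiers.
        reader = csv.reader(io.StringIO(text), delimiter=delimiter,
                            quotechar=qualifier)
    return next(reader)


sample = '"Dallas, Texas"'
with_qualifier = split_line(sample)                     # one field
without_qualifier = split_line(sample, qualifier=None)  # two fields
print(with_qualifier)
print(without_qualifier)
```

With the quotation mark as qualifier the value stays in one field; with no qualifier the comma splits it in two, and the quotation marks remain in the data. Unlike the behaviour described for Excel, the csv module also strips the qualifier when no delimiter occurs between the qualifiers.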
9 Sort
Data entered into a worksheet is often unorganized, which makes it difficult to examine. When analyzing information, you may need to rearrange the data in different ways to answer different questions. Excel's sorting feature can help you rearrange your data so you can use it more efficiently.
If your spreadsheet contains formulas, be careful when using the sort feature: formulas rely on cell references to perform their calculations, and moving the data with a sort may break these references.
Data can be sorted by:
- selecting a single cell in the column containing the data to sort
- selecting an entire column
10 Filter
Filtering lets you quickly extract certain data from your spreadsheet. Unlike sorting, filtering doesn't just reorder the list; it hides the rows containing data that do not meet the filter criteria you define.
Excel's AutoFilter makes it easy to extract data:
1. Click on any cell in your spreadsheet.
2. On the Data tab, select the Filter button.
3. Drop-down menus will appear next to each column heading.
11 Convert Text to Columns
If a cell contains a lot of text, you may wish to separate it into several columns. This can only be done if a logical character separates the text, such as a comma, semicolon or full stop. For example, cells containing Last Name, First Name can be separated into two different columns.
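The same split can be sketched in a few lines of Python. This is only an illustration of the idea, not what Excel runs; the function name text_to_columns and the sample names are made up.

```python
def text_to_columns(cells, delimiter=","):
    """Split each cell on the delimiter and trim surrounding spaces."""
    return [[part.strip() for part in cell.split(delimiter)] for cell in cells]


names = ["Smith, John", "Rao, Anita"]  # made-up sample data
columns = text_to_columns(names)
for last_name, first_name in columns:
    print(last_name, "|", first_name)
```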
12 Remove Duplicates
A duplicate value is one where all values in the row are an exact match of all the values in another row. Duplicates are determined by the value displayed in the cell and not necessarily the value stored in the cell. For example, if you have the same date value in different cells, one formatted as "3/8/2009" and the other as "Mar 8, 2009", the values are unique.
It's a good idea to filter for or conditionally format unique values first, to confirm that the results are what you want before removing duplicate values.
1. Select the range of cells, or make sure that the active cell is in a table.
2. On the Data tab, in the Data Tools group, click Remove Duplicates.
3. Do one or more of the following:
   - Under Columns, select one or more columns.
   - To quickly select all columns, click Select All.
   - To quickly clear all columns, click Unselect All.
   If the range of cells or table contains many columns and you want to select only a few, you may find it easier to click Unselect All, and then under Columns, select those columns.
4. Click OK. Excel displays a message indicating how many duplicate values were removed and how many unique values remain, or that no duplicate values were removed.
Note: You cannot remove duplicate values from data that is outlined or that has subtotals. To remove duplicates, you must remove both the outline and the subtotals.
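The matching rule described above, comparing the displayed text of the selected columns, can be sketched as follows. This is an illustrative Python helper, not Excel's implementation; the name remove_duplicates, the choice to keep the first occurrence, and the sample rows are assumptions.

```python
def remove_duplicates(rows, columns=None):
    """Keep the first occurrence of each row, comparing only the chosen columns.

    rows holds the displayed text of each cell; columns is a list of column
    indexes to compare (None means all columns).
    """
    seen = set()
    unique = []
    for row in rows:
        key = tuple(row) if columns is None else tuple(row[i] for i in columns)
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique, len(rows) - len(unique)


rows = [
    ["A-1", "3/8/2009"],
    ["A-1", "Mar 8, 2009"],  # same date, different displayed text: kept
    ["A-1", "3/8/2009"],     # exact repeat of the first row: removed
]
unique, removed = remove_duplicates(rows)
print(f"{removed} duplicate values removed; {len(unique)} unique values remain.")
```

Passing columns=[0] compares only the first column, which mirrors selecting a single column under Columns in the dialog box.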
13 Filter for Unique Values
1. Select the range of cells, or make sure the active cell is in a table.
2. On the Data tab, in the Sort & Filter group, click Advanced.
3. In the Advanced Filter dialog box, do one of the following:
   - To filter the range of cells or table in place, click Filter the list, in-place.
   - To copy the results of the filter to another location, click Copy to another location, and then enter a cell reference in the Copy to box.
4. Select the Unique records only check box, and click OK.
The unique values from the selected range are copied to the new location. The original data is not affected.
Filtering for unique values and removing duplicate values are two closely related tasks because the displayed results are the same: a list of unique values. The difference, however, is important. When you filter for unique values, you temporarily hide duplicate values, but when you remove duplicate values, you permanently delete them.
14 Conditional Formatting
Advanced formatting
1. Select one or more cells in a range, table, or PivotTable report.
2. On the Home tab, in the Styles group, click the arrow next to Conditional Formatting, and then click Manage Rules. The Conditional Formatting Rules Manager dialog box appears.
3. Do one of the following:
   - To add a conditional format, click New Rule. The New Formatting Rule dialog box appears.
   - To change a conditional format, do the following:
     - Make sure that the appropriate worksheet or table is selected in the Show formatting rules for list box.
     - Optionally, change the range of cells: click Collapse Dialog in the Applies to box to temporarily hide the dialog box, select the new range of cells on the worksheet, and then select Expand Dialog.
     - Select the rule, and then click Edit rule. The Edit Formatting Rule dialog box is displayed.
4. Under Select a Rule Type, click Format only unique or duplicate values.
5. Under Edit the Rule Description, in the Format all list box, select unique or duplicate.
6. Click Format to display the Format Cells dialog box.
7. Select the number, font, border, or fill format that you want to apply when the cell value meets the condition, and then click OK. You can choose more than one format. The formats that you select are displayed in the Preview box.
Quick formatting
1. Select a column in the table.
2. On the Home tab, in the Styles group, click the arrow next to Conditional Formatting, and then click Highlight Cells Rules.
3. Select Duplicate Values.
4. Enter the values that you want to use, and then select a format.
15 Data Validation – Entry Restriction
To reduce junk in your data entry process, it is sometimes essential to restrict the choice of values in specific columns or cells by using a drop-down list. Here are the steps to follow:
1. Select the cell to validate.
2. On the Data tab, in the Data Tools group, click Data Validation. The Data Validation dialog box opens.
3. In the Data Validation dialog box, click the Settings tab.
4. Click the Allow box, and then select List from the drop-down list.
5. Click the Source box, and then type the valid values, usually separated by a comma "," or semicolon ";". For example, if the cell holds the color of a car, you can limit the values by entering: Silver, Green, Blue
Instead of typing your list manually, you can create the list entries by referring to a range of cells in the same worksheet or in another worksheet of the workbook. To specify the location of the list of valid entries, do one of the following:
- If the list is in the current worksheet, enter a reference to your list in the Source box, for example: =$A$1:$A$6
- If the list is on a different worksheet, define a name for your list, and then enter that name in the Source box, for example: =ValidColors
16 Other Data Restrictions
- You can restrict the entry in a cell using Excel's data types by choosing Whole Number, Decimal, Date or Time.
- For all of the above, you need to specify the range within which you want to restrict the entry.
- For Text, you need to specify the range for the length of the text.
- You can also define your own restriction by using an Excel formula.
17 Data Validation - Tips
- The In-cell dropdown check box must be selected; otherwise, you won't be able to see the drop-down arrow next to the cell.
- Select or clear the Ignore blank check box depending on how you want to handle blank (null) values.
- If you use a defined name and there is a blank cell anywhere in that range, selecting the Ignore blank check box allows any value to be entered in the validated cell.
- If any referenced cell is blank, selecting the Ignore blank check box allows any value to be entered in the validated cell.
- If you change the validation settings for a cell, you can automatically apply your changes to all other cells that have the same settings. To do so:
  1. Open the Data Validation dialog box.
  2. Click the Settings tab.
  3. Select the Apply these changes to all other cells with the same settings check box.
18 Data Validation – Other Options
Display an input message when the cell is clicked
1. Click the Input Message tab.
2. Select the Show input message when cell is selected check box.
3. Fill in the title and text for the input message.
Display an error message when wrong data is entered
1. Click the Error Alert tab.
2. Select the Show error alert after invalid data is entered check box.
3. Fill in the title and text for the error message.
4. Select one of the following options in the Style box:
   - Information: Displays an information message. Does not prevent entry of invalid data.
   - Warning: Displays a warning message. Does not prevent entry of invalid data.
   - Stop: Prevents entry of invalid data.
19 Validate Cell Values
1. Select the cell you want to validate.
2. On the Data tab, click Data Validation. The Data Validation dialog box is shown.
3. To show an error message when invalid data is entered:
   - Click the Error Alert tab.
   - Fill the Title and Error message boxes with appropriate text.
4. Click the Settings tab.
5. Specify the type of validation you want. Suppose you have a numeric value and you want to allow only values between 1 and 99. Then:
   - In the Allow box, select 'Decimal'.
   - In the Data box, select 'Between'.
   - In the Minimum box, enter '1'.
   - In the Maximum box, enter '99'.
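The Decimal / Between rule configured above amounts to a simple range check. As a rough sketch in Python (the function name is_valid_decimal is made up, and treating non-numeric input as invalid is an assumption of this sketch):

```python
def is_valid_decimal(value, minimum=1, maximum=99):
    """Return True when value is a number between minimum and maximum, inclusive."""
    try:
        number = float(value)
    except (TypeError, ValueError):
        # Text that is not a number, or a missing value, fails the check.
        return False
    return minimum <= number <= maximum


for entry in (1, 42.5, 99, 0, 100, "abc"):
    print(entry, "->", "accepted" if is_valid_decimal(entry) else "rejected")
```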
20 Validate Cell Value based on another Cell Value
Suppose we have a list of documents, each with issue and expiry dates. We want to validate the expiry date so that it is always later than the issue date.
1. Select the cell that holds the expiry date.
2. On the Data tab, click Data Validation, and go to the Settings tab.
3. In the Allow list box, select Date, and in the Data list box, select Greater than.
4. In the Start Date box, type "=" followed by the address of the cell that holds the issue date.
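The rule "expiry date greater than issue date" can be checked for a whole list at once. A minimal Python sketch follows, with made-up document names and dates; invalid_documents is a hypothetical helper.

```python
from datetime import date


def invalid_documents(documents):
    """Return the names of documents whose expiry date is not after the issue date."""
    return [name for name, issue, expiry in documents if not expiry > issue]


documents = [
    ("Passport", date(2020, 1, 15), date(2030, 1, 14)),
    ("Permit", date(2023, 6, 1), date(2023, 6, 1)),     # same day: rejected
    ("Licence", date(2022, 3, 10), date(2021, 3, 10)),  # expires before issue
]
print(invalid_documents(documents))
```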
21 Consolidate
This feature allows multiple lists to be combined and presented in one sheet.
The following rules enable the consolidation of Lists using the Consolidate command:
- The structure of the Lists must be identical.
- The headings of all rows and the leftmost columns in the Lists must contain the same topic.
- The number of columns and the number of rows do not have to be identical; nor does the internal order of the text.
- The Lists must have a single row for labels, and a single column for labels.
- The cells in the Lists data range must contain only numeric data.
Excel consolidates data by identifying corresponding text crossed between the header row and the leftmost column.
1. Give names to List1, List2 and List3.
2. Select a cell in a different sheet of the workbook.
3. Select Data -> Consolidate (in the Data Tools group).
4. In the Reference box, press F3.
5. In the Paste Name dialog box, select List1, click OK, and then click Add to add List1 to the All references box.
6. Repeat steps 4 and 5, and add List2 and List3 to the All references box.
7. In Use labels in, select the Top row and Left column check boxes, and then click OK.
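The label matching described above, where values are combined wherever the top-row label and the left-column label coincide, can be sketched in Python. This illustrates the idea with the Sum function only and is not Excel's implementation; the function name consolidate and the sample lists are made up.

```python
from collections import defaultdict


def consolidate(lists):
    """Sum values that share the same (row label, column label) pair.

    Each list is a table whose first row holds column labels and whose
    first column holds row labels; the order of labels may differ.
    """
    totals = defaultdict(int)
    for table in lists:
        column_labels = table[0][1:]
        for row in table[1:]:
            row_label = row[0]
            for column_label, value in zip(column_labels, row[1:]):
                totals[(row_label, column_label)] += value
    return dict(totals)


list1 = [["", "Jan", "Feb"], ["East", 10, 20], ["West", 5, 7]]
list2 = [["", "Feb", "Jan"], ["West", 1, 2], ["North", 3, 4]]
result = consolidate([list1, list2])
for (row_label, column_label), total in sorted(result.items()):
    print(row_label, column_label, total)
```

The two sample lists differ in size and in label order, yet West/Jan and West/Feb are still added together because they are matched by label rather than by position.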
22 What-If Analysis
There are 3 kinds of what-if analysis tools in Excel: Scenarios, Data Tables, and Goal Seek.
- Scenarios and Data Tables take sets of input values and determine possible results.
- A data table works with only one or two variables, but it can accept many different values for those variables.
- A scenario can have multiple variables, but it can accommodate only up to 32 values.
- Goal Seek works differently from scenarios and data tables: it takes a result and determines possible input values that produce that result.
23 Scenario Manager
Scenario Manager can create multiple scenarios on the same worksheet, and then switch between them. For each scenario, specify the cells that change and the values to use for that scenario. When switching between scenarios, the result cell changes to reflect the different changing cell values.
(Figure: Worst Case Scenario and Best Case Scenario, with callouts 1. Changing cells and 2. Result cell.)
Scenario reports are not automatically recalculated.
24 Scenario Example
Problem statement: Assume you own a book store and have 100 books in store. You sell a certain % at the higher price of $50 and a certain % at the lower price of $20.
If you sell 60% at the highest price, cell D10 calculates a total profit of 60 * $50 + 40 * $20 = $3800.
25 Create Different Scenarios
What if you sell 70% for the highest price? What if you sell 80%, 90%, or 100% for the highest price?
You can type a different percentage into cell C4 to see the corresponding result of a scenario in cell D10. However, What-If Analysis enables you to easily compare the results of different scenarios.
1. On the Data tab, click What-If Analysis, and select Scenario Manager from the list. The Scenario Manager dialog box appears.
2. Add a scenario by clicking Add.
3. Type a name (60% highest), select cell C4 (% sold for the highest price) for the Changing cells, and click OK.
4. Enter the corresponding value 0.6, and click OK again.
5. Next, add 4 other scenarios (70%, 80%, 90% and 100%). The Scenario Manager now lists all five scenarios.
6. To see the result of a scenario, select the scenario and click the Show button. Excel changes the value of cell C4 accordingly, so you can see the corresponding result on the sheet.
26 What-If Analysis Result
To easily compare the results of these scenarios, do the following:
1. Click the Summary button in the Scenario Manager.
2. Next, select cell D10 (total profit) for the result cell, and click OK.
Conclusion: If you sell 70% for the highest price, you obtain a total profit of $4100; if you sell 80% for the highest price, you obtain a total profit of $4400; and so on.
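The scenario summary can be reproduced with the book store formula from the example. This is a minimal Python sketch: total_profit stands in for the formula in cell D10, and the percentage is passed as a whole number (60 rather than 0.6) to keep the arithmetic exact, which is an assumption of this sketch.

```python
def total_profit(percent_high, books=100, high_price=50, low_price=20):
    """Total for selling percent_high % of the books at the high price, the rest at the low price."""
    sold_high = books * percent_high / 100
    sold_low = books - sold_high
    return sold_high * high_price + sold_low * low_price


# One entry per scenario, like the rows of a scenario summary report.
summary = {f"{p}% highest": total_profit(p) for p in (60, 70, 80, 90, 100)}
for name, profit in summary.items():
    print(f"{name}: ${profit:.0f}")
```

Each scenario changes only one input, the share sold at the highest price, and reads back one result, which is what the Changing cells and Result cell settings express in the Scenario Manager.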
27 Goal Seek
Goal Seek is a built-in Excel tool that allows you to see how one data item in a formula impacts another. You might look at these as "cause and effect" scenarios.
Example: You need to borrow some money. You know how much money you want, how long a period you want in which to pay off the loan, and how much you can afford to pay each month. Use Goal Seek to determine what interest rate you must secure in order to meet your loan goal.
If you want to determine more than one input value, for example, the loan amount and the monthly payment amount for a loan, you should instead use the Solver add-in.
28 Goal Seek Example
Create the following spreadsheet in Excel.
Problem: Local election results are being studied, where 2/3 of the voters need to vote YES for cabinet formation.

        Votes   % of Votes
YES     4478    63.90 *
NO      2530    36.10
Total   7008    100

Observation: YES votes are a majority, but short of the required 2/3 approval to win the election. The YES group is close, but how close? What would have made a difference?
Using Goal Seek, we can change the value of various cells to see how the results change. This allows you to answer these types of questions:
- How many "NO" voters needed to be converted to YES to win the election?
- How many more votes were needed by the YES team to win the election?
- If 500 more people voted, could the YES team have won?
In each of these questions, the goal is to change a data value to see if the YES percentage goes over the two-thirds mark, or 66.67%. Rather than haphazardly changing cell values to see the results, Goal Seek can find the answers.
- Click the cell whose result you want to set. This is called the "Set cell".
- Select Goal Seek… and in the Goal Seek dialog, enter the new "what if" amount in the To value text box. Here, we're asking Excel to replace the contents of cell D4, which is 63.90%, with 66.67%, which is the percentage needed to win the election.
- We also need to tell Excel which cell to change. Since we want to know the number of YES votes, we'll click C4.
Tips:
- The Set cell in Step 2 must contain a formula.
- The cell you change in Step 5 can't contain a formula. It must be a typed value.
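The idea behind Goal Seek, adjusting one input until a formula reaches a target, can be sketched with a simple bisection search. The slides do not describe the method Excel actually uses, so bisection is an assumption swapped in for illustration; goal_seek and yes_share are hypothetical names. The NO count is held at 2530 while the YES count varies.

```python
def goal_seek(formula, target, low, high, tolerance=1e-9, max_iterations=200):
    """Find x in [low, high] with formula(x) close to target.

    Assumes formula is increasing on the interval and that the target
    lies between formula(low) and formula(high).
    """
    for _ in range(max_iterations):
        mid = (low + high) / 2
        if formula(mid) < target:
            low = mid
        else:
            high = mid
        if high - low < tolerance:
            break
    return (low + high) / 2


NO_VOTES = 2530  # from the election table


def yes_share(yes_votes):
    """Share of YES votes; plays the role of the formula in the Set cell."""
    return yes_votes / (yes_votes + NO_VOTES)


needed = goal_seek(yes_share, 2 / 3, low=4478, high=10000)
print(f"YES votes needed: {needed:.0f} ({needed - 4478:.0f} more than 4478)")
```

With the NO votes fixed, solving yes / (yes + 2530) = 2/3 by hand gives 5060 YES votes, that is 582 more than the 4478 cast, which the search reproduces.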
29 Data Tables
Data tables are used when:
- a formula uses one or two variables, or multiple formulas use one common variable
- all the outcomes are to be seen in one place
- a range of possibilities is to be seen at a glance
- the focus is on only one or two variables, and the results should be easy to read and share in tabular form
If automatic recalculation is enabled for the workbook, data tables recalculate with fresh data.
A data table can't accommodate more than 2 variables, but it can handle as many values as you want.
30 One-variable data tables
A one-variable data table is used to see how different values of one variable in one or more formulas will change the results of those formulas.
Example: you can use a one-variable data table to see how different interest rates affect a monthly mortgage payment by using the PMT function. You enter the variable values in one column or row, and the outcomes are displayed in an adjacent column or row.
1. Type the list of values that you want to substitute in the input cell, either down one column or across one row. Leave a few empty rows and columns on either side of the values.
2. Do one of the following:
   - If the data table is column-oriented (your variable values are in a column), type the formula in the cell one row above and one cell to the right of the column of values.
   - If the data table is row-oriented (your variable values are in a row), type the formula in the cell one column to the left of the first value and one cell below the row of values.
3. Select the range of cells that contains the formulas and values that you want to substitute. In the example, this range is C2:D5.
4. On the Data tab, in the Data Tools group, click What-If Analysis, and then click Data Table.
In the example, D2 contains the payment formula, =PMT(B3/12,B4,-B5), which refers to the input cell B3.
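What a one-variable data table computes can be sketched in Python: one formula evaluated for a column of input values. The pmt function below follows the standard annuity formula and the sign convention of Excel's PMT(rate, nper, pv) with no future value and end-of-period payments. The loan amount, term and rates are assumed values, since the slide does not give the contents of B3:B5.

```python
def pmt(rate, nper, pv):
    """Payment per period for a loan (Excel-style sign convention, fv = 0)."""
    if rate == 0:
        return -pv / nper
    return -pv * rate / (1 - (1 + rate) ** -nper)


# Assumed inputs; the slide does not state the values in B4 and B5.
months = 360       # B4: term in months
loan = 200_000     # B5: loan amount

rates = [0.04, 0.045, 0.05]  # column of annual interest rates substituted for B3
table = [(rate, pmt(rate / 12, months, -loan)) for rate in rates]
for rate, payment in table:
    print(f"{rate:.3%}  {payment:,.2f}")
```

Each row substitutes one rate for the input cell and records the resulting payment, which is what Excel does when it fills the column next to the list of values.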
31 Two-variable data tables
A two-variable data table is used to see how different values of two variables in one formula will change the results of that formula. It uses a formula that contains two lists of input values, and the formula must refer to two different input cells.
Example: you can use a two-variable data table to see how different combinations of interest rates and loan terms will affect a monthly mortgage payment. In the example, C2 contains the payment formula, =PMT(B3/12,B4,-B5), which uses two input cells, B3 and B4.
1. In a cell on the worksheet, enter the formula that refers to the two input cells. In this example, in which the formula's starting values are entered in cells B3, B4, and B5, you type the formula =PMT(B3/12,B4,-B5) in cell C2.
2. Type one list of input values in the same column, below the formula. In this case, type the different interest rates in cells C3, C4, and C5.
3. Enter the second list in the same row as the formula, to its right. Type the loan terms (in months) in cells D2 and E2.
4. Select the range of cells that contains the formula (C2), both the row and column of values (C3:C5 and D2:E2), and the cells in which you want the calculated values (D3:E5). In this case, select the range C2:E5.
5. On the Data tab, in the Data Tools group, click What-If Analysis, and then click Data Table.
6. In the Row input cell box, enter the reference to the input cell for the input values in the row. Type B4 in the Row input cell box.
7. In the Column input cell box, enter the reference to the input cell for the input values in the column. Type B3 in the Column input cell box.
8. Click OK.
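The two-variable case is the same calculation over a grid: interest rates down the column, loan terms across the row. The following Python sketch uses assumed values for the loan, rates and terms (the slide does not state them) and the same annuity formula that underlies PMT.

```python
def pmt(rate, nper, pv):
    """Payment per period for a loan (Excel-style sign convention, fv = 0)."""
    if rate == 0:
        return -pv / nper
    return -pv * rate / (1 - (1 + rate) ** -nper)


loan = 200_000               # assumed value for B5
rates = [0.04, 0.045, 0.05]  # column input values (C3:C5), substituted for B3
terms = [180, 360]           # row input values (D2:E2), substituted for B4

# grid[i][j] is the payment for rates[i] and terms[j], like cells D3:E5.
grid = [[pmt(rate / 12, term, -loan) for term in terms] for rate in rates]

print("rate    " + "".join(f"{term:>12}" for term in terms))
for rate, row in zip(rates, grid):
    print(f"{rate:<8.3%}" + "".join(f"{payment:>12,.2f}" for payment in row))
```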
32 Group and Ungroup
- Group and Ungroup are in the Outline group of the Data tab.
- Group allows you to collapse a group of rows or columns.
- Ungroup reverts the action.
- For both functions, an outline with a + or – sign will appear.
33 Subtotals
Subtotals are used in a sorted list.
1. Sort the list on the field for which you want subtotals inserted.
2. Click the Subtotal button in the Outline group on the Data tab. The Subtotal dialog box appears, where you specify the options for the subtotals.
3. Select the field for which the subtotals are to be calculated in the At Each Change In drop-down list.
4. Specify the type of totals you want to insert in the Use Function drop-down list.
5. Select the check boxes for the field(s) you want to total in the Add Subtotal To list box.
6. Click OK. Excel adds the subtotals to the worksheet.
When you use the Subtotal command, Excel outlines the data at the same time that it adds the subtotal rows (for example, departmental salary totals) and the grand total. This means that you can collapse the data list down to just its departmental subtotal rows, or even just the grand total row, simply by collapsing the outline down to the second or first level.
In a large list, you may want to insert page breaks every time data changes in the field on which the list is being subtotaled. To do this, select the Page Break between Groups check box in the Subtotal dialog box before you click OK to subtotal the list.
Excel does not allow you to subtotal a list formatted as a table. You must first convert your table into a normal range of cells. Click a cell in the table, and then click the Table Tools Design tab. Click the Convert to Range button in the Tools group, and then click Yes. Excel removes the filter buttons from the columns at the top of the list while still retaining the original table formatting.
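The sort-then-subtotal sequence can be sketched in Python with the Sum function: sort on the grouping field, total each group, then add a grand total. The employee records and the function name subtotals are made up for this illustration.

```python
from itertools import groupby
from operator import itemgetter


def subtotals(rows, key_index, value_index):
    """Sort rows on the key field, then sum the value field at each change in the key."""
    ordered = sorted(rows, key=itemgetter(key_index))
    per_group = [
        (key, sum(row[value_index] for row in group))
        for key, group in groupby(ordered, key=itemgetter(key_index))
    ]
    grand_total = sum(row[value_index] for row in rows)
    return per_group, grand_total


# Made-up sample data: (department, employee, salary)
staff = [
    ("Sales", "Ana", 52000),
    ("IT", "Raj", 61000),
    ("Sales", "Li", 48000),
    ("IT", "Mo", 59000),
]
per_department, grand_total = subtotals(staff, key_index=0, value_index=2)
for department, total in per_department:
    print(f"{department} Total: {total}")
print(f"Grand Total: {grand_total}")
```

Sorting first matters for the same reason it does in Excel: groupby, like the Subtotal command, starts a new group at each change in the key, so unsorted data would produce several partial totals per department.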
34 Analysis ToolPak and Solver
For the Analysis ToolPak, refer to the document on Data Analysis.
Let us solve a set of two nonlinear equations using Solver:
F(x,y) = x^2 + y + 3 = 0
G(x,y) = 2*x^2 + y^3 + 5 = 0
Solver uses an iterative best-estimate method, here with 100 iterations, to arrive at a close result.