Business Intelligence
  The processes, technologies, and people to turn data into information in order to drive profitable business action.
  - Wayne Eckerson, TDWI
  Source: B. Wixom
BI and Analytics
  Analytics is "the extensive use of data, statistical and quantitative analysis, explanatory and predictive models, and fact-based management to drive decisions and actions" (Davenport and Harris, Competing on Analytics)
  "BI refers to the general ability to organize, access and analyze information in order to learn and understand the business." (Gartner)
Business Value from Data
  Based on work by B. Wixom
  [Diagram: Strategy, (big) Data, Use, Business Value]
  High quality data: Accurate, Timely, Valid
  Usable data: Awareness, Access, Usefulness, Security, Privacy
  Useful data: Meaning, Scope, Sharing
GIGO: data quality affects the quality of your decisions
  Analysts cannot find what they need 50% of the time
  10-25% of the records have inaccuracies or missing elements
  Data is frequently misinterpreted
  Data loss and theft
  Most databases implement inconsistent definitions
  Source: T. Redman, Data Driven, 2008
Why is Data Bad?
  No one gets up in the morning and says "I'm going to make lots of errors today"
  Source: T. Redman, Data Driven, 2008
Data Quality Benchmarks
  Analysts cannot find what they need 50% of the time
  10-25% of the records have inaccuracies or missing elements
  Data is frequently misinterpreted
  Known data loss and theft
  Most databases implement inconsistent definitions
  Source: T. Redman, Data Driven, 2008
Approaches to Data Quality
  1. Find and fix
  2. Prevent at the source
  3. Do nothing (3M)
Financial Information Management
  Homework
  Business Scenario: Google's Daily CAGR
Daily CAGR for Google
  You are an analyst at a brokerage firm.
  Many of our customers invest in Google for short periods of time. They sell their shares within a few weeks... I wonder: do they make any money out of it?
Daily CAGR for Google
  Input: a file with ~800 customers who bought and sold GOOG within the last two months.
  Three steps (and two homework assignments):
  1. Clean data: phones, dates
  2. Compute Daily CAGR = (final price / initial price)^(1/days) - 1
  3. Report the average Daily CAGR across all customers.
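Steps 2 and 3 can be sketched as a small console program. This is only an illustration under stated assumptions: the module name, the function name DailyCagr, and the two sample trades are hypothetical and do not come from the homework file.

```vbnet
Imports System
Imports System.Linq

Module DailyCagrSketch
    ' Daily CAGR = (finalPrice / initialPrice) ^ (1 / days) - 1
    ' (hypothetical helper; names are not from the course material)
    Function DailyCagr(initialPrice As Double, finalPrice As Double, days As Integer) As Double
        If initialPrice <= 0 OrElse days <= 0 Then
            Throw New ArgumentException("initialPrice and days must be positive.")
        End If
        Return Math.Pow(finalPrice / initialPrice, 1.0 / days) - 1.0
    End Function

    Sub Main()
        ' Made-up trades: (bought at, sold at, days held)
        Dim rates As Double() = {DailyCagr(500, 520, 14), DailyCagr(510, 505, 7)}
        ' Step 3: average Daily CAGR across all customers
        Console.WriteLine(rates.Average())
    End Sub
End Module
```

In the workbook, the number of days would typically come from the difference between the sell date and the buy date, which is why the dates need to be cleaned first.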
Cleaning Phone Numbers
  From: #2-345-3-48565
  To: (234)-534-8565
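One possible way to do this cleaning is to keep only the digits and then rebuild the number. The sketch below assumes that a valid number has exactly 10 digits; the function name FormatPhone and the choice to return Nothing for anything else are illustrative, not part of the assignment.

```vbnet
Imports System
Imports System.Text

Module PhoneCleaningSketch
    ' Keeps only the digits of raw and formats them as (xxx)-xxx-xxxx.
    ' Returns Nothing when the text does not contain exactly 10 digits,
    ' so the caller can decide what to do with it (e.g., highlight the cell).
    Function FormatPhone(raw As String) As String
        Dim digits As New StringBuilder()
        For Each c As Char In raw
            If Char.IsDigit(c) Then digits.Append(c)
        Next
        If digits.Length <> 10 Then Return Nothing
        Dim d As String = digits.ToString()
        Return "(" & d.Substring(0, 3) & ")-" & d.Substring(3, 3) & "-" & d.Substring(6, 4)
    End Function

    Sub Main()
        Console.WriteLine(FormatPhone("#2-345-3-48565"))   ' (234)-534-8565
    End Sub
End Module
```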
UML Activity Diagram - Daily Compound Average Growth of a Security (part I)
  Start: When the user presses a button labeled "start", a file selection window pops up. The user selects a .csv file. The file is shown starting at "A1". The start button becomes invisible. Three more buttons appear: "Clean phone numbers", "Format Dates", and "Compute Daily CAGR". (connector A)
  [Clean ph.no] Select the next phone no. Count its digits.
    [Exactly 10 digits] Format as (xxx)-xxx-xxxx & clear highlight if any
    [otherwise] Highlight the cell in red
    [No More Ph.No] back to A
  [Format Dates] Select the next column and/or date.
    [is a date] Format as mm/dd/yyyy & clear highlight if any
    [otherwise] Highlight the cell in yellow
    [No More Dates in this column] select the next column
    [No more columns] back to A
  [Compute] Next homework
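The "is a date" test and the formatting step of the date branch could look roughly like this outside of Excel. The function name FormatAsUsDate is hypothetical, and parsing with the invariant culture is an assumption about how the dates in the file are written. Note that in .NET format strings months are written MM (mm means minutes).

```vbnet
Imports System
Imports System.Globalization

Module DateCleaningSketch
    ' Returns the text reformatted as MM/dd/yyyy, or Nothing when the text
    ' cannot be read as a date (the caller could then highlight the cell).
    Function FormatAsUsDate(raw As String) As String
        Dim parsed As Date
        If Date.TryParse(raw, CultureInfo.InvariantCulture, DateTimeStyles.None, parsed) Then
            Return parsed.ToString("MM/dd/yyyy", CultureInfo.InvariantCulture)
        End If
        Return Nothing
    End Function

    Sub Main()
        Console.WriteLine(FormatAsUsDate("2002-11-14"))            ' 11/14/2002
        Console.WriteLine(FormatAsUsDate("not a date") Is Nothing) ' True
    End Sub
End Module
```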
Reading a File into Excel
    ' keep a reference to the current active sheet, i.e., the "target"
    Dim myActiveS As Excel.Worksheet = Application.ActiveSheet
    ' select a file
    Dim myFile As String = Application.GetOpenFilename()
    ' open the data in a new temporary workbook (the 9th argument, Comma, is True)
    Application.Workbooks.OpenText(myFile, , , Excel.XlTextParsingType.xlDelimited, , , , , True)
    ' keep a reference to the temporary workbook
    Dim myActiveWB As Excel.Workbook = Application.ActiveWorkbook
    ' copy the content from the temporary sheet to the "target" sheet
    myActiveS.Range("A1:J1000").Value = Application.ActiveSheet.Range("A1:J1000").Value
    ' close the temporary workbook
    myActiveWB.Close()
Finding the last non-empty row
    Dim lastRow As Integer
    ' start at the bottom of column 1 and jump up to the last non-empty cell
    lastRow = Cells(Rows.Count, 1).End(Excel.XlDirection.xlUp).Row
Financial Information Management
  WINIT: What Is New In Technology?
Financial Information Management
  Strings and Dates
Strings and Characters
    Dim myString As String = "This is a sample string"
    Dim myString2 As String = "s"    ' a one-character String
    Dim myChar As Char = "s"c        ' a Char literal (note the c suffix)
Testing Numbers
    Dim myString As String = "#2344-234-33-3"
    Dim temp As String = ""
    ' keep only the numeric characters
    For Each x As Char In myString
        If IsNumeric(x) Then
            temp &= x
        End If
    Next
    ' temp is now "2344234333"
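Inserting and Removing
    Dim myS As String = "This is a sample string"
    myS = myS.Insert(4, "xyz")       ' "Thisxyz is a sample string"
    myS = myS.Remove(4, 3)           ' starting where, how many: "This is a sample string"
    Dim myString As String = myS.Replace(" is", " was")   ' "This was a sample string"
    ' Substring(start, length): "This is" + " another" + " sample string" + "."
    myS = myS.Substring(0, 7) & " another" & myS.Substring(9, 14) & "."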
Finding
    Dim myS As String = "This is a sample string"
    Dim myPosition As Integer = 0
    myPosition = myS.IndexOf("s")    ' 3: zero-based position of the first "s"
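Trimming and Padding
    myLength = myString.Length
    myNewString = myString.Trim()        ' removes leading and trailing white space
    myNewString = myString.TrimEnd()     ' removes trailing white space only
    myNewString = myString.TrimStart()   ' removes leading white space only
    myNewString = myString.PadLeft(50)   ' 50 = total length of the result
    myNewString = myString.PadRight(20)  ' 20 = total length of the result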
You do the talking
  Name, major
  Learning objectives
  Things you like about the class
  Things that can be improved
  Strengths / Attitude towards the Tournament
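Dates
    Dim myDate As Date = #11/14/2002#
    Dim myYear As Integer = myDate.Year        ' 2002
    Dim myMonth As Integer = myDate.Month      ' 11
    Dim myDay As Integer = myDate.Day          ' 14
    Dim DOW As DayOfWeek = myDate.DayOfWeek    ' Thursday
    Dim DOY As Integer = myDate.DayOfYear      ' 318
    ...
  MyDate | Year | Month | Day | Week | ...
         | 2002 | 11    | 14  | 45   | ...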
TimeSpan
  A TimeSpan represents the elapsed time between two dates.
    Dim myDate1 As Date
    Dim myDate2 As Date
    Dim myTS As TimeSpan
    myDate1 = Range("A1").Value
    myDate2 = Range("A2").Value
    myTS = myDate2 - myDate1
    Range("A3").Value = myTS.Days
  [Diagram: a TIMESPAN stretching from Date1 to Date2]
TimeSpan
  mySpan.Days gives you the number of whole days in the span (the fraction of a day is dropped).
  mySpan.TotalDays gives you the total number of days, plus a fraction of a day based on the hours.
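A small sketch of the difference between Days and TotalDays, using two made-up timestamps that are 13 days and 12 hours apart. The function names are hypothetical helpers for illustration only.

```vbnet
Imports System

Module TimeSpanSketch
    ' Whole days elapsed between the two dates (fraction dropped).
    Function WholeDaysBetween(startDate As Date, endDate As Date) As Integer
        Return (endDate - startDate).Days
    End Function

    ' Elapsed time in days, including the fraction of a day.
    Function FractionalDaysBetween(startDate As Date, endDate As Date) As Double
        Return (endDate - startDate).TotalDays
    End Function

    Sub Main()
        Dim bought As Date = #11/1/2002 6:00:00 AM#
        Dim sold As Date = #11/14/2002 6:00:00 PM#
        Console.WriteLine(WholeDaysBetween(bought, sold))        ' value: 13
        Console.WriteLine(FractionalDaysBetween(bought, sold))   ' value: 13.5
    End Sub
End Module
```

Which one to use for the Daily CAGR exponent is a design choice: Days is enough when the file holds dates without times.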