Presentation on theme: "Detectlets for Better Fraud Detection"— Presentation transcript:
1. Detectlets for Better Fraud Detection
Conan C. Albrecht, PhD
Marriott School of Management
Brigham Young University
2. Today’s Presentation
- Give a few fraud stories
- Outline the Detectlet vision and Picalo Architecture
- Show example code and working products
- Describe future research directions and solicit help
3. Two Types of Fraud
- Fraud on behalf of an organization
  - Financial statement manipulation to make the company look better to stockholders
  - Also called management fraud
- Fraud against an organization
  - Stealing assets, information, etc.
  - Also called employee or consumer fraud
4. ACFE Report to the Nation: Occupational Fraud and Abuse
- 2 1/2 year study of 2608 frauds totaling $15 million
- Fraud costs U.S. organizations more than $400 billion annually
- Fraud and abuse cost employers an average of $9 a day per employee
- The average organization loses about 6 percent of its total annual revenue to fraud and abuse admitted to by its own employees
5. Ernst & Young Fraud Study 2002 (Europe)
- One in five workers is aware of fraud in their workplace
- 80% would be willing to turn in a colleague, but only 43% have
- Employers lost 20 cents on every dollar to workplace fraud
- Types of fraud
  - Theft of office items: 37%
  - Claiming extra hours worked: 16%
  - Inflating expense accounts: 7%
  - Taking kickbacks from suppliers: 6%
6. Cost of Fraud
- Fraud Losses Reduce Net Income $ for $
- If Profit Margin is 10%, Revenues Must Increase by 10 times Losses to Recover Effect on Net Income
- Losses: $1 Million
- Revenue: $1 Billion
- Illustration rows: Revenues ($, %), Expenses (%), Net Income ($, %), Fraud, Remaining ($9)
- To restore income to $10, $10 more of revenue is needed to generate $1 more of income.
7. Fraud Cost: Two Examples
- Automobile Manufacturer
  - $436 Million Fraud
  - Profit Margin = 10%
  - $4.36 Billion in Revenues Needed
  - At $20,000 per Car, 218,000 Cars
- Large Bank
  - $100 Million Fraud
  - Profit Margin = 10%
  - $1 Billion in Revenues Needed
  - At $100 per year per Checking Account, Million New Accounts
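The arithmetic behind the two examples can be checked with a short script. This is a minimal sketch: the function name `revenue_needed` and the variable names are illustrative and do not come from the presentation. The input figures are the ones quoted on the slide.

```python
def revenue_needed(fraud_loss, margin_percent):
    """Extra revenue required to earn back a fraud loss at the given profit margin."""
    if margin_percent <= 0:
        raise ValueError("margin_percent must be positive")
    return fraud_loss * 100 / margin_percent


# Automobile manufacturer: $436 million fraud, 10% margin, $20,000 per car.
auto_revenue = revenue_needed(436_000_000, 10)
cars_needed = auto_revenue / 20_000

# Large bank: $100 million fraud, 10% margin, $100 per year per checking account.
bank_revenue = revenue_needed(100_000_000, 10)
accounts_needed = bank_revenue / 100

print(f"Automobile manufacturer: ${auto_revenue:,.0f} in revenue, {cars_needed:,.0f} cars")
print(f"Large bank: ${bank_revenue:,.0f} in revenue, {accounts_needed:,.0f} new accounts")
```

The same division explains the rule of thumb on the previous slide: at a 10% margin, every dollar lost to fraud takes ten dollars of new revenue to replace.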
8. A Recent Fraud
- Large Fraud of $2.6 Billion over 9 years
  - Year 1: $600K
  - Year 3: $4 million
  - Year 5: $80 million
  - Year 7: $600 million
  - Year 9: $2.6 billion
- In years 8 and 9, four of the world’s largest banks were involved and lost over $500 million
- Some of the organizations involved: Merrill Lynch, Chase, J.P. Morgan, Union Bank of Switzerland, Crédit Lyonnais, Sumitomo, and others.
9. Every Person Has A Price
Abraham Lincoln once threw a man out of his office, angrily turning down a substantial bribe. “Every man has his price,” explained Lincoln, “and he was getting close to mine.”
11. Superhuman Workers
- Summed all hours (normal, OT, DT) per two-week period, regardless of invoice or timecard
- Workers were logging hours on two timecards for simultaneous jobs
One search summed all hours worked by employees within two-week periods. It ignored which project the work was on, which plant it was at, what type of work it was, etc. We found people who were working over 100 hours per week. This could conceivably happen once or twice, but many workers did this consistently, month after month (as seen in the trend above). Investigation into these employees showed that they were clocking in under two time cards at different locations in the plant, effectively doubling their hours each week.
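The search described on this slide can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the code used in the engagement: the row layout, the function names `hours_per_period` and `flag_superhuman`, and the sample data are all hypothetical. Only the 100-hours-per-week threshold comes from the slide.

```python
from collections import defaultdict
from datetime import date, timedelta


def hours_per_period(rows, period_start, period_days=14):
    """Sum hours per employee per period, ignoring timecard, project, and plant."""
    totals = defaultdict(float)
    for employee, work_date, _timecard, hours in rows:
        period = (work_date - period_start).days // period_days
        totals[(employee, period)] += hours
    return dict(totals)


def flag_superhuman(totals, weekly_limit=100, period_days=14):
    """Return (employee, period) keys whose total exceeds the weekly limit."""
    limit = weekly_limit * period_days / 7
    return sorted(key for key, hours in totals.items() if hours > limit)


# Hypothetical rows: (employee, work_date, timecard_id, hours).
start = date(2004, 1, 5)  # a Monday
rows = []
for offset in range(14):
    day = start + timedelta(days=offset)
    rows.append(("A", day, "T1", 8.0))  # employee A clocks in on two timecards
    rows.append(("A", day, "T2", 8.0))
    if day.weekday() < 5:               # employee B works weekdays only
        rows.append(("B", day, "T3", 8.0))

totals = hours_per_period(rows, start)
flagged = flag_superhuman(totals)
print(flagged)
```

Summing across timecards is the point of the design: each timecard looks reasonable on its own, and only the combined total per person exposes the overlap.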
12. The Family Business: Work Orders Authorized By Purchaser
The next few slides show the results of a specialized search. We stratified the data by the number of work orders that purchasers authorized during each period. As can be seen, purchaser F authorized considerably more work than other purchasers.
13. The Family Business: Invoice Charges Authorized By Purchaser
Purchaser F is again shown in this spreadsheet, which is now stratified by invoice charges. Again, he is authorizing considerably more charges.
14. The Family Business: Work Orders Given To Contractor Crew
The picture became clearer as we stratified by contractor crew. The company subcontracted with third-party companies for this type of work, and it is obvious which crew is getting the majority of the work. See the totals across the bottom.
15. The Family Business
- Tip stated that kickbacks were occurring with a certain company
- We researched the company and determined which purchaser authorized the work
- A contractor crew and company purchaser were family
When we investigated these people on both sides of the transaction, the same last name was found on each side. The individuals came from the same immediate family, and the purchaser was funneling work to his family’s company.
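The two steps in this case, totaling work by purchaser and by crew and then comparing last names across both sides of the transaction, might look like the following. It is a rough sketch with invented names and amounts. The helpers `total_by` and `shared_surnames` are hypothetical, and real name matching would need more care than a simple last-word comparison.

```python
from collections import defaultdict


def total_by(rows, key_index, amount_index):
    """Total the amount column for each distinct value of the key column."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[key_index]] += row[amount_index]
    return dict(totals)


def surname(full_name):
    """Crude surname extraction: last whitespace-separated word, lowercased."""
    return full_name.split()[-1].lower()


def shared_surnames(purchasers, crew_contacts):
    """Pair each purchaser with any crew contact who has the same surname."""
    by_surname = defaultdict(list)
    for name in crew_contacts:
        by_surname[surname(name)].append(name)
    return [(p, c) for p in purchasers for c in by_surname.get(surname(p), [])]


# Hypothetical work orders: (purchaser, crew contact, amount).
work_orders = [
    ("Alex Jones", "Drew Lopez", 12_000.0),
    ("Blake Fisher", "Casey Fisher", 48_000.0),
    ("Blake Fisher", "Casey Fisher", 51_000.0),
    ("Alex Jones", "Casey Fisher", 9_000.0),
]

by_purchaser = total_by(work_orders, 0, 2)
by_crew = total_by(work_orders, 1, 2)
matches = shared_surnames(
    sorted({row[0] for row in work_orders}),
    sorted({row[1] for row in work_orders}),
)
print(by_purchaser, by_crew, matches)
```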
16. Systematic Increases In Spending
These next few slides show some sample data patterns that researchers can look for. They are not all-inclusive; they are examples of what to look for and one way to visualize it. The above time engine results show spending trends by employee (with names grayed out). The trend shown increases regularly.
17. Systematic Increases In Spending
This slide shows another increase in spending. Note how the time engine flags the suspicious data points in red.
18. Unexpected Peaks In Spending
This slide shows an unexpected peak in spending. The employee had normal spending until one month in which he or she spent significantly more than expected. It is important to understand why this occurred.
19. Increases In Only Part Of A Trend
This data pattern illustrates how subtrends need to be analyzed. A simple average (or regression equation) of the whole trend would look normal. However, a problem trend is flagged when only the first five data points are considered. The time engine ran repeated analyses on all parts of a trend.
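One way to analyze every part of a trend is to fit a least-squares slope to each contiguous window and flag the steep ones. The sketch below is only an illustration of that idea. The time engine's actual method is not described in the slides, and the function names, the minimum window length of five, the threshold, and the spending figures are all assumptions.

```python
def slope(values):
    """Ordinary least-squares slope of values against their index 0..n-1."""
    n = len(values)
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    return sxy / sxx


def steep_subtrends(values, min_len=5, threshold=10.0):
    """Return (start, end, slope) for each contiguous window steeper than threshold."""
    hits = []
    for start in range(len(values) - min_len + 1):
        for end in range(start + min_len, len(values) + 1):
            window_slope = slope(values[start:end])
            if window_slope > threshold:
                hits.append((start, end, window_slope))
    return hits


# Hypothetical monthly spending: five rising months, then a return to baseline.
spending = [100, 120, 140, 160, 180, 100, 100, 100, 100, 100, 100, 100]

overall = slope(spending)
hits = steep_subtrends(spending)
print(overall, hits)
```

With this sample the slope over all twelve months is slightly negative, so a single regression raises no concern, while the window covering the first five months is flagged.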
22. Accounting History
- 1940 SEC Statement: “Accountants can be expected to detect gross overstatements of assets and profits whether resulting from collusive fraud or otherwise” (Accounting Series Release 1940)
- 1961: “If the ten (auditing) standards now accepted were satisfactory for their purpose we would not have the pleas for guidance on the extent of (auditors’) responsibility for the detection of irregularities we now find in our professional literature.” (Mautz & Sharaf 1961)
- SAS 82
- SAS 99
- Expectation Gap
23. Historical Fraud Research
Excellent literature review by Nieschwietz, Schultz, & Zimbelman (2000)
- Who commits fraud
- Red flags
- Expectation gap
- Auditor expectations
- Game theory between auditors and management
- Auditor-client relationships
- Risk assessment, decision aids
- Management factors affecting fraud
24. FS Fraud using Ratio Analysis
- Hansen et al. (1996) developed a generalized qualitative-response model from internal sources
- Green and Choi (1997) used neural networks to classify fraudulent cases
- Summers and Sweeney (1998) identified FS fraud using external and internal information
- Beneish (1999) developed a probit model using ratios for fraud identification
- Bell and Carcello (2000) developed a logistic regression model to identify fraud
- Current work by McKee and by Cecchini and by Albrecht
- None have found the “silver bullet” in using external information to identify fraud
- Management (FS) fraud is very difficult to find
25. What are the Big 4 Doing?
- Each firm seems to have different groups working on fraud detection
- No best practices model has emerged
- IT auditors perform control testing on company systems, not fraud detection
- Meeting with Bill Titera of EY
26. Why Don’t “They” Find Fraud?
- Limited time
- Our most precious resource is our attention
- History
- Heavy use of sampling - lack of detail
- Lack of historical fraud detection instruction
- Lack of fraud symptom expertise
- Lack of fraud-specific tools
- Lack of analysis skills
- Lack of expertise in technology
- Auditors do find percent of fraud
- ACFE 2004 Report to the Nation
27. Isn’t there a better way?
- Reasonable time requirements
- Integrate AI and auto-detection
- Within reach of most auditors (highly technical skills not required)
- Integrate easily into different database schemas
- Cost effective
28. Initial Thoughts
- A small “manual” about frauds
- Cliff notes about different types of fraud
- Describes the scheme
- Describes the indicators of the scheme
- Worldwide repository with contributions from many different industries
- Primary focus was training
29. Detectlets
A detectlet encodes:
- Background information on a scheme
- Detail on a specific indicator of the scheme
- Wizard interface to walk the user through input selection
- Algorithm coded in standard format
- “How to interpret results” follow-up
Input is one or more table objects
Output is one or more table objects
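The parts listed on this slide map naturally onto a small record type. The sketch below is an assumption about how such a bundle could be modeled in plain Python and is not Picalo's actual detectlet format. The class `DetectletSketch`, the `Table` stand-in (a list of row dictionaries), and the duplicate-invoice indicator are all invented for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Stand-in for a table object: a list of row dictionaries.
Table = List[Dict[str, object]]


@dataclass
class DetectletSketch:
    background: str                        # background information on the scheme
    indicator: str                         # the specific indicator being tested
    wizard_prompts: List[str]              # questions used to select the inputs
    algorithm: Callable[..., List[Table]]  # one or more tables in, one or more out
    interpretation: str                    # "how to interpret results" follow-up

    def run(self, *tables: Table) -> List[Table]:
        if not tables:
            raise ValueError("a detectlet needs at least one input table")
        return self.algorithm(*tables)


def duplicate_invoices(invoices: Table) -> List[Table]:
    """Return one table holding rows that repeat a (vendor, invoice_no) pair."""
    seen = set()
    repeats: Table = []
    for row in invoices:
        key = (row["vendor"], row["invoice_no"])
        if key in seen:
            repeats.append(row)
        seen.add(key)
    return [repeats]


detectlet = DetectletSketch(
    background="A vendor may bill the same invoice more than once.",
    indicator="Repeated invoice numbers for the same vendor",
    wizard_prompts=["Which table holds the invoices?"],
    algorithm=duplicate_invoices,
    interpretation="Each output row is a repeat that should be traced to source documents.",
)

invoices = [
    {"vendor": "V1", "invoice_no": "1001", "amount": 500.0},
    {"vendor": "V1", "invoice_no": "1002", "amount": 750.0},
    {"vendor": "V1", "invoice_no": "1001", "amount": 500.0},
]
results = detectlet.run(invoices)
print(results)
```

Because the output is again a list of tables, the result of one detectlet can be passed straight into another, which is what makes chaining possible.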
30. Detectlet Demonstration
Bid rigging where one person prepares all bids
31. Potential Supporting Platforms
- MS Access
- ACL or IDEA
- Build ground-up application
- Allows total control over platform
- Stays with open source rather than tying the program to a particular platform
- For example, consider PowerBuilder
- Supports Windows, Unix, Linux, Mac
- Allows embedded use within a greater platform
- Personal preference was Python
34. How Detectlets Address the Problem
- Limited Time: Detectlets provide a wizard interface for quick execution; they can be chained and automated into a larger system
- High Cost: Detectlets are based in open source software, putting them within reach of small and large accounting firms; they also create a community environment for fraud detection
35. How Detectlets Address the Problem
- Lack of fraud symptom expertise: Detectlets provide a large library of available routines to both train and walk auditors through the detection process
- Lack of fraud-specific tools: Picalo provides an open solution that we can improve over time; it puts a fraud-specific toolkit in the hands of auditors
36. How Detectlets Address the Problem
- Lack of analysis skills: Detectlets encode full algorithms and code, allowing the auditor to stay at the conceptual level rather than the implementation level
- Lack of expertise in technology: Detectlets provide a wizard-based solution that is easy to use; Picalo provides an Excel-like user interface
38. Data Structures
The Table object is the basic data structure. Nearly all routines both input and return tables, allowing them to be chained. Its methods include sorting, column operations, row operations, and import/export from delimited text and Excel formats.
Column types include Boolean, Integer, Floating Point, Date, DateTime, String, etc.
40. Benfords Module
- calc_benford: Calculates probability for a single digit
- get_expected: Calculates probability for a full number
- analyze: Analyzes an entire data set and calculates summarized results
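For readers unfamiliar with Benford's law, the first-digit probability is log10(1 + 1/d). The sketch below shows that formula and a simple comparison of observed and expected shares. It does not reproduce the Picalo functions named above, whose signatures are not given in the slides. The names `expected_first_digit`, `leading_digit`, and `first_digit_summary` and the sample amounts are assumptions.

```python
import math
from collections import Counter


def expected_first_digit(digit):
    """Benford's law: probability that the leading digit equals `digit` (1-9)."""
    if digit not in range(1, 10):
        raise ValueError("digit must be between 1 and 9")
    return math.log10(1 + 1 / digit)


def leading_digit(value):
    """First non-zero digit of a number, e.g. 0.0042 -> 4."""
    for ch in repr(abs(value)):
        if ch in "123456789":
            return int(ch)
    raise ValueError("value has no leading non-zero digit")


def first_digit_summary(values):
    """Map each digit 1-9 to (observed share, expected share), skipping zeros."""
    nonzero = [v for v in values if v != 0]
    if not nonzero:
        raise ValueError("need at least one non-zero value")
    counts = Counter(leading_digit(v) for v in nonzero)
    total = len(nonzero)
    return {d: (counts[d] / total, expected_first_digit(d)) for d in range(1, 10)}


# Hypothetical invoice amounts.
amounts = [123.45, 187.00, 1020.5, 21.9, 36.0, 1500, 0.0042, 19.99]
summary = first_digit_summary(amounts)
for digit, (observed, expected) in summary.items():
    print(digit, round(observed, 3), round(expected, 3))
```

A sample this small says nothing statistically. In practice the comparison is run over thousands of amounts, and large gaps between the observed and expected shares mark the data for a closer look.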
41. Crosstable Module
- pivot: Similar to Excel’s pivot table function
- pivot_table: Pivots and keeps detail in each cell
- pivot_map: Pivots and keeps results in a dictionary rather than a grid
- pivot_map_detail: Pivots and keeps results in a very detailed fashion using a dictionary
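The difference between a summarized pivot and one that keeps the detail behind each cell can be shown with dictionaries keyed by (row value, column value). This is a generic sketch and not the Crosstable module's API. The helper names `pivot_detail_sketch` and `pivot_sum_sketch` and the sample rows are hypothetical.

```python
from collections import defaultdict


def pivot_detail_sketch(rows, row_key, col_key):
    """Map each (row value, column value) cell to the list of detail rows in it."""
    cells = defaultdict(list)
    for row in rows:
        cells[(row[row_key], row[col_key])].append(row)
    return dict(cells)


def pivot_sum_sketch(rows, row_key, col_key, value_key):
    """Map each (row value, column value) cell to the sum of one column."""
    detail = pivot_detail_sketch(rows, row_key, col_key)
    return {cell: sum(r[value_key] for r in members) for cell, members in detail.items()}


# Hypothetical payments.
payments = [
    {"vendor": "V1", "month": "Jan", "amount": 100},
    {"vendor": "V1", "month": "Jan", "amount": 250},
    {"vendor": "V1", "month": "Feb", "amount": 75},
    {"vendor": "V2", "month": "Jan", "amount": 400},
]

detail = pivot_detail_sketch(payments, "vendor", "month")
totals = pivot_sum_sketch(payments, "vendor", "month", "amount")
print(totals)
```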
42. Database Module
- OdbcConnection: Connects to any ODBC-compliant database
- PostgreSQLConnection: Connects to PostgreSQL
- MySQLConnection: Connects to MySQL
Also includes various query helper functions, such as query creation, results analysis, etc.
43. Financial Module
Calculates various financial ratios to help in financial statement analysis:
- Current ratio
- Quick ratio
- Net working capital
- Return on assets
- Return on equity
- Return on common equity
- Profit margin
- Earnings per share
- Asset turnover
- Inventory turnover
- Debt to equity
- Price earnings
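Several of these ratios can be computed directly from a handful of statement balances. The sketch below uses common textbook definitions, which may differ in detail from the ones the Financial module implements. The function name, the dictionary keys, and the balances are illustrative assumptions.

```python
def financial_ratios_sketch(fs):
    """Compute a few ratios from a dict of statement balances (textbook definitions)."""
    return {
        "current_ratio": fs["current_assets"] / fs["current_liabilities"],
        "quick_ratio": (fs["current_assets"] - fs["inventory"]) / fs["current_liabilities"],
        "net_working_capital": fs["current_assets"] - fs["current_liabilities"],
        "return_on_assets": fs["net_income"] / fs["total_assets"],
        "return_on_equity": fs["net_income"] / fs["total_equity"],
        "profit_margin": fs["net_income"] / fs["revenue"],
        "asset_turnover": fs["revenue"] / fs["total_assets"],
        "debt_to_equity": fs["total_liabilities"] / fs["total_equity"],
    }


# Hypothetical balances for one company-year.
statement = {
    "current_assets": 500.0,
    "inventory": 200.0,
    "current_liabilities": 250.0,
    "total_assets": 2000.0,
    "total_liabilities": 1200.0,
    "total_equity": 800.0,
    "revenue": 4000.0,
    "net_income": 200.0,
}

ratios = financial_ratios_sketch(statement)
for name, value in ratios.items():
    print(f"{name}: {value:.2f}")
```

For fraud detection the level of a single ratio matters less than how it moves from year to year or compares with industry peers, so a routine like this would normally be run across many periods.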
44. Grouping Module
Stratification gives the details behind SQL GROUP BY. It keeps the detail tables rather than summarizing them.
- stratify: Stratifies a table into N number of tables
- stratify_by_expression: Stratifies a table using an arbitrary expression
- stratify_by_value: Stratifies on unique values
- stratify_by_step: Stratifies based on a set numerical range
- stratify_by_date: Stratifies based on a date range
Summarizing is similar to SQL GROUP BY, but it allows any type of function to be used for summarization (GROUP BY generally only allows sum, stdev, mean, etc.). This can be done in the same ways as stratification.
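The distinction between stratifying (keeping the detail rows per group) and summarizing (reducing each group with an arbitrary function) can be sketched as follows. The functions `stratify_by_value_sketch` and `summarize_sketch` are stand-ins written for this illustration and are not the Grouping module's own routines. The payment data is hypothetical.

```python
from collections import defaultdict
from statistics import median


def stratify_by_value_sketch(rows, key):
    """Split rows into one detail list per unique value of `key`."""
    strata = defaultdict(list)
    for row in rows:
        strata[row[key]].append(row)
    return dict(strata)


def summarize_sketch(strata, **functions):
    """Apply each named function to every stratum's detail rows."""
    return {
        value: {name: fn(detail) for name, fn in functions.items()}
        for value, detail in strata.items()
    }


# Hypothetical payments.
payments = [
    {"vendor": "V1", "amount": 100.0},
    {"vendor": "V1", "amount": 300.0},
    {"vendor": "V1", "amount": 9500.0},
    {"vendor": "V2", "amount": 250.0},
]

strata = stratify_by_value_sketch(payments, "vendor")
summary = summarize_sketch(
    strata,
    count=len,
    median_amount=lambda detail: median(r["amount"] for r in detail),
    largest=lambda detail: max(r["amount"] for r in detail),
)
print(summary)
```

Any Python callable can serve as the summary function, which is the flexibility the slide contrasts with the fixed aggregate functions of SQL.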
45. Trending Module
Various ways of analyzing trends and patterns over time.
cusum, highlow_slope, average_slope, regression, handshake_slope
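As one example of these techniques, a cumulative sum (CUSUM) accumulates each value's deviation from a target, so a sustained shift shows up as a steadily climbing or falling total. The sketch below is the textbook form of that calculation. How the Trending module's `cusum` routine is parameterized is not stated in the slides, so the function name and the `target` argument here are assumptions.

```python
from itertools import accumulate


def cusum_sketch(values, target=None):
    """Running total of deviations from `target` (defaults to the mean of values)."""
    if not values:
        return []
    if target is None:
        target = sum(values) / len(values)
    return list(accumulate(v - target for v in values))


# Hypothetical monthly totals: a baseline of 10 that shifts to 20.
monthly = [10, 10, 10, 20, 20]

against_baseline = cusum_sketch(monthly, target=10)
against_mean = cusum_sketch(monthly)
print(against_baseline)
print(against_mean)
```

Measured against the baseline of 10, the running total stays flat for three months and then climbs, marking the month where the shift began.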
46. Python Libraries
- Powerful yet easy language with a significant online community
- Full object-oriented support (classes, inheritance, etc.)
- Text manipulation and analysis routines
- Web site spidering routines
- Analysis routines
- Random number generation
- Connection to nearly all databases
- Web site development and maintenance
- Countless libraries available online (almost all are open source)
48. Level 1 Research
- Foundation routines for fraud detection
- Development, testing, empirical use, field studies
- Connections to production software
- Standard SAP, Oracle, Peoplesoft, JD Edwards, etc. modules
- Application of CS, statistics, other techniques to fraud detection
- Time series analysis
- Pattern recognition for fraud detection
49. Level 2 Research
- Studies about detectlet presentation, user interface
- Creation and testing of detectlets for industries, data schemas, etc.
- Detectlets for financial statement fraud detection
- Testing of detectlet vs. traditional ACL-type fraud detection
- Patterns of detectlet development, best practices
50. Level 3 Research
- Automatic mapping of field schemas to a common schema
- Application of expert system, learning models for automatic detection
- Decision trees
- Classification models
- Meta-detectlets to combine various Level 2 detectlets into higher-level logic
51. Other Research
- Group-oriented processes for the central repository
- Searching, categorization
- Testing, rating systems
- Marketplaces for detectlets
- Development of Picalo itself
52. My Hope
In 5 years we’ll have a large repository of detectlets to:
- Support both external and internal auditors
- Teach students in fraud classes
- Conduct theoretical and empirical research