Presentation on theme: "Data Mining Solution Using Data Mining Analytics to Support Fraud Detection in CalWORKs Stage 1 Child Care."— Presentation transcript:
1 Data Mining Solution: Using Data Mining Analytics to Support Fraud Detection in CalWORKs Stage 1 Child Care
2 Data Mining Technology
Data mining is a reiterative process of selecting, exploring and modeling large amounts of data to identify meaningful, logical patterns and relationships among key variables.
Source: Data Mining 101: How to Reveal New Insights in Existing Data to Improve Performance. Insights from a webinar in the SAS Applying Business Analytics Series. Originally broadcast in June 2010.
3 Project Background
DPSS’ Data Mining Solution (DMS) is a computer application that employs pattern detection and predictive analytics to detect and prevent fraud in public assistance programs, such as our CalWORKs Stage 1 Child Care Program.
* A Board Motion introduced by Supervisor Antonovich on May 29, 2007, provided the Department of Public Social Services (DPSS) the vision to utilize cutting-edge technology, such as data mining and predictive analytics, to ensure and maintain the integrity of the County's public assistance programs.
* A successful pilot, completed in 2008, evaluated the effectiveness of using data mining technology to detect potential fraud in the CalWORKs Stage 1 Child Care Program.
* The DMS Agreement was approved by the Board on December 22, 2009, for SAS Institute Inc. (SAS) to design, develop and implement the data mining technology for Los Angeles County to target fraud in the CalWORKs Stage 1 Child Care Program.
* The DMS Application was implemented by DPSS in May 2011 to target fraud in the CalWORKs Stage 1 Child Care Program.
* The Board approved an Amendment to the DMS Agreement on May 15, 2012, to extend the data mining technology to the In-Home Supportive Services (IHSS) Program.
4 Project Objectives and Key Milestones
The DMS Application is hosted in Cary, North Carolina, by SAS OnDemand.
The SAS Fraud Framework tracks:
* CalWORKs Stage 1 Child Care Participants with children requesting assistance from the County;
* Providers that care for children while the parent or guardian goes to work or school; and
* Employers on record providing employment for the participants who attempt to defraud the County of Los Angeles by obtaining payment for falsified services.
Using state-of-the-art data mining techniques, the DMS application:
* Prioritizes referrals from Alternate Payment Program agencies (APPs)
* Consolidates data for investigations
* Shows networks among providers and participants
* Streamlines and optimizes searching of County data sources
* Is capable of integrating with existing workflow and case management systems
* Displays results using the advanced visualization application
* Uses statistically designed risk measures to predict collusion activities
5 Project Data Sources
Data Preparation Efforts
* Historical data from 2001 to present
* Data was cleaned, matched, consolidated, and geocoded monthly from DPSS and external data sources to generate dozens of variables for data mining models, including the sources below.
Data Focus: Participant and Provider
CalWORKs Stage 1 Child Care
* Child care utilization, request and provider files
* Welfare-to-Work activity tables from the GEARS system
* Child care licensing files
* LEADER Case and Individual tables for participant records
Data includes known cases of fraud and alleged fraud
* LEADER Fraud cluster tables to identify participants referred/prosecuted for child care and other fraud types
External Data Sources
* State employment, employer, Income & Eligibility Verification System (IEVS), and New Hire (NHR) files
* Dun and Bradstreet employment file
* In-Home Supportive Services (IHSS) participant and provider files
6 Anomaly Detection, Risk Assessment, Operational Outcome, Alert Score
Risk Assessment
* Rules/Pattern Recognition
* Anomaly Detection
* Predictive Model
* Hot List (e.g., Providers and Employers with known fraudulent activity)
* Social Network Linkages
Operational Outcome
* Prioritized by Alert score
* Monthly High Risk Report
* Drill-down into case detail
* Further drill-down into Alert detail
* Launch into other ad-hoc analyses
Alert Score
* Base score
* Plus or minus depending on value of components
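The Alert Score rule on this slide (a base score adjusted up or down by the value of each component) can be sketched as follows. This is a minimal illustration, not the DMS implementation: the component names and point values are hypothetical, and the base of 50 is borrowed from the Base Risk Score stated on the Risk Assessment slide.

```python
def alert_score(base_score, component_adjustments):
    """Return the base score plus the signed adjustment of every component."""
    return base_score + sum(component_adjustments.values())


def prioritize(alerts):
    """Sort (case_id, score) pairs so the highest-scoring alerts come first."""
    return sorted(alerts, key=lambda alert: alert[1], reverse=True)


# Hypothetical components and point values, for illustration only.
adjustments = {
    "rules_pattern": 12,
    "anomaly_detection": 5,
    "hot_list": 0,
    "social_network": -3,
}
score = alert_score(50, adjustments)
ranked = prioritize([("case_a", 41), ("case_b", score), ("case_c", 57)])
```

Sorting by score mirrors the "Prioritized by Alert score" item under Operational Outcome.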
7 Utilization Process
Triage View
* High-risk Alerts are generated by comparing CalWORKs Stage 1 Child Care cases with the typical profile of fraudulent CalWORKs cases.
* Designated Triage Workers (DTWs) are assigned to conduct comprehensive case reviews based on these Alerts.
* Referrals are initiated to the Welfare Fraud Prevention & Investigations (WFP&I) Section.
* Case action reviews result in one or more of the following outcomes: termination of benefits, overpayments, reduction in benefits, share of cost, and/or fraud referral.
Investigator View
* DMS provides the DPSS WFP&I team with tools to assist in the detection, prevention, and investigation of fraud in the CalWORKs Stage 1 Child Care Program.
* The Social Network Analysis allows WFP&I to identify suspicious cases for earlier preliminary investigation.
* DMS provides access to a participant’s 10-year historical data across Programs.
8 LA County Fraud Framework – DMS Log On Page
All Users have been assigned a User ID to access the DMS Application.
9 DMS Case View Selection
The DMS Application offers two different views when the User ID has an Administrative Role.
* Investigator Role – WFP&I Investigators access the Investigator View to review all assigned Referrals/Investigations.
* Triage Role – The Triage Supervisor and the Designated Triage Workers access the Triage View to review all cases assigned a Risk Score with a Trigger Rule that indicates a likelihood of Child Care Fraud.
10 Triage View
Triage Reviewers see a list of cases that do not have open investigations for fraud. This list guides the Triage Reviewer in deciding which of the cases need a fraud referral. The list is sorted by the predictive model score that indicates the likelihood of child care fraud.
Designated Triage Workers are assigned a set of cases to review each month. They make a referral to WFP&I if fraudulent activity is detected, or a District 2 Way Gram referral to Line Operations to review the case records for Overpayments, Over Issuance, or Unreported Income.
Trigger Count – Number of factors, identified by LA County, that indicate a high risk of fraud; examples include an excessive driving distance or a child care provider being an In-Home Supportive Services (IHSS) consumer.
Trigger Codes – Reasons underlying the trigger(s):
* A – New address every two months (on average)
* C – Child care received, but no child under age 13
* D – At least one leg of the Participant – Provider – Employer distance determination is considered excessive
* H – Participant address is the DPSS office and child care is Type 1 or 3
* I – Child care provider is an IHSS consumer
* N – Child care is received, but there is no corresponding component or employment
* S – Self-employment reported
After a detailed review of the case records, each Designated Triage Worker documents the outcome on the Comments tab: either a note that no fraudulent activity was discovered, in which case the case record is Dismissed from the Triage View, or a note that a Fraud Referral was initiated to WFP&I.
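As a rough illustration of how Trigger Codes and the Trigger Count relate, the sketch below evaluates three of the codes listed on this slide (C, I, and S) against a case record. The `CaseRecord` fields are hypothetical stand-ins chosen for this example; the actual DMS data model and rule logic are not described in the presentation.

```python
from dataclasses import dataclass, field


@dataclass
class CaseRecord:
    """Hypothetical, simplified case record used only for this example."""
    child_ages: list = field(default_factory=list)
    receives_child_care: bool = False
    provider_is_ihss_consumer: bool = False
    self_employment_reported: bool = False


def trigger_codes(case):
    """Return the trigger codes (a subset of C, I, S) that apply to a case."""
    codes = []
    # C: child care received, but no child under age 13
    if case.receives_child_care and not any(age < 13 for age in case.child_ages):
        codes.append("C")
    # I: child care provider is an IHSS consumer
    if case.provider_is_ihss_consumer:
        codes.append("I")
    # S: self-employment reported
    if case.self_employment_reported:
        codes.append("S")
    return codes


case = CaseRecord(child_ages=[14, 15], receives_child_care=True,
                  self_employment_reported=True)
codes = trigger_codes(case)
trigger_count = len(codes)  # Trigger Count is the number of triggers raised
```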
11 Investigator View
The Investigator View displays the active investigations assigned to the Investigator File Number, along with the Investigation Start Date, which tracks the Statute of Limitation on prosecution of the case records with the District Attorney. It also shows the Referral Number assigned to the investigation record and any Companion cases related to the Parent case number.
The Cases display provides an overview of the investigations that an investigator needs to review, including a participant’s name and identification numbers, the investigation start date, the assigned investigator, the referral number, and a Companion investigation group number.
List items are sorted by Investigation Start Date. The start-date cells have a colored background that corresponds to the time remaining on the statute of limitations.
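The color coding of Investigation Start Date cells can be pictured with a small sketch. The presentation does not state the length of the limitation period, the color thresholds, or the colors themselves, so every number and color name below is a hypothetical parameter, not a DMS setting.

```python
from datetime import date


def days_remaining(start_date, today, limit_days):
    """Days left before the limitation period, counted from start_date, expires."""
    return limit_days - (today - start_date).days


def cell_color(remaining_days, thresholds=((180, "red"), (365, "yellow"))):
    """Map remaining days to a background color (hypothetical thresholds)."""
    for limit, color in thresholds:
        if remaining_days <= limit:
            return color
    return "green"


# Hypothetical investigations: (referral number, investigation start date).
investigations = [("R-2", date(2012, 6, 1)), ("R-1", date(2011, 9, 15))]
# The list is sorted by Investigation Start Date, oldest first.
by_start_date = sorted(investigations, key=lambda item: item[1])
```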
12 User Interface: Participant Details
The Participant Detail pane provides a quick view of the participant case record: residential address, family members, providers, employment, source of income and benefits, and prior welfare fraud history. It also shows details such as age, sex, and periods of Child Care Program use. The most recent address and phone number are provided with hyperlinks to online mapping and search tools. Selecting a hyperlinked address opens a Google Map.
To review details specific to a particular case, users either select a case and then select Participant Profile (in the menu bar) or double-click a selected case. The case view defaults to a profile that contains detailed information for a participant. When the profile opens, the Timeline and Provider tabs are selected by default.
The Participant Detail pane contains current information.
Tables of information associated with the case, organized into tabs (lower tabs of the UI):
* Providers - Paid tab
* Providers - Authorized tab
* Address tab
* Employment tab
* Component (Welfare-to-Work) tab
* Income and Benefits tab
* DPSS Actions tab
* WFP&I History tab
Graphs help users visualize the details of the case being reviewed (upper tabs of the UI):
* Timeline tab – Displays a graphical view of the case activities listed in the lower tabs, showing when each item occurred and which activities were concurrent.
* Street map tab – Shows the participant’s home, provider, and work locations on a finely detailed map. Hovering the mouse over the line connecting two nodes displays the distance in miles between them, for example the travel distance between the participant’s employment and provider locations.
* Relationship tree tab – Shows information about family members and unrelated associates who may be in the household. Hovering the mouse over the icons provides more detailed information for each member of the Relationship tree.
* Risk assessment tab – Displays the case risk score (from the predictive model) and provides specific details on key factors that are inputs to the model and others that are triggers or rules defined by LA County.
* Comments tab – Allows users to record notes about the case and review notes that have been posted by others.
* Social Network – Displays a network of connected participants, providers, employers, and phone numbers, centered on the selected participant.
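The distance shown between two nodes on the Street map tab can be approximated from geocoded coordinates. The sketch below uses the haversine great-circle formula, which gives straight-line distance rather than the miles actually driven; the presentation does not say how DMS computes distance, and the coordinates used here are made up for illustration.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def leg_distances(home, provider, work):
    """Distances for the home -> provider and provider -> work legs."""
    return {
        "home_to_provider": haversine_miles(*home, *provider),
        "provider_to_work": haversine_miles(*provider, *work),
    }


# Hypothetical geocoded points as (latitude, longitude).
legs = leg_distances((34.05, -118.25), (34.10, -118.30), (33.95, -118.40))
```

A leg whose distance exceeds some chosen threshold could then feed a rule like Trigger Code D; the threshold itself is not given in the presentation.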
13 Timeline Graph
The Timeline tab is a graphic representation of the data from the Provider, Address, Employment, Component, Income and Benefits, and DPSS Actions tabs. The graph provides a brief, color-coded view of each data source and a quick way to see when each event occurred and which case activities were concurrent.
14 Street Map View
The Street Map tab displays a geographic map of all the provider service, home (residential, not mailing), and work addresses associated with a case. Users can adjust the time slider to view a specific point in time. The data for this map comes directly from the Provider, Address, and Employment tabs.
15 Relationship Tree
The Relationship Tree tab shows a participant’s family, household members, and other relatives, including unrelated associates who may be in the household. Users can adjust the time slider to view the tree over time.
The participant appears at the top of the diagram. Lines connect the participant with known relatives. The type of relationship is noted for each connection. Blue and pink person icons indicate the relative’s gender. Each person’s date of birth, approximate age at the date selected on the time slider, and Person ID (PID) are displayed. Hovering the cursor over a relationship tree member displays an information box with more detailed information (for example, undocumented flag, in-home flag, on aid).
Relationship trees along the time slider exist only on the last day of months in which a new relationship is added. Relationships are never removed.
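One way to read the snapshot rule on this slide (a tree exists only on the last day of a month in which a relationship was added, and relationships are never removed) is sketched below. The data layout and the choice to show the latest snapshot on or before the slider date are assumptions made for this example, not documented DMS behavior.

```python
import calendar
from datetime import date


def month_end(d):
    """Last calendar day of the month containing d."""
    last_day = calendar.monthrange(d.year, d.month)[1]
    return date(d.year, d.month, last_day)


def snapshot_dates(relationships):
    """Month-end dates on which a tree snapshot exists.

    relationships is a list of (added_on, description) pairs.
    """
    return sorted({month_end(added_on) for added_on, _ in relationships})


def tree_at(relationships, slider_date):
    """Relationships in the latest snapshot on or before slider_date."""
    available = [s for s in snapshot_dates(relationships) if s <= slider_date]
    if not available:
        return []
    latest = available[-1]
    # Relationships are never removed, so a snapshot is cumulative.
    return [desc for added_on, desc in relationships if added_on <= latest]


# Hypothetical relationship additions.
rels = [(date(2010, 3, 5), "child"),
        (date(2010, 3, 20), "spouse"),
        (date(2011, 2, 10), "sibling")]
```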
16 Risk Assessment
The Risk Assessment tab contains a list of key fraud indicators for a participant. The tab contains two parts: the Predictive Model section and the Triggers section.
The Predictive Model section contains the results of a statistical model. The Value for the Overall Risk component is 1000 times the expected probability of fraud.
Users may see the list of Components that comprise the model. The items in Value correspond to model values in Score and add to the Base Risk Score (always 50) to determine the Overall Risk Score.
The probability of fraud doubles with every 10-point increase in the Overall Risk Score. For example, a case with a score of 70 is four times as likely to be fraudulent as a case with a score of 50.
Note: The number of model inputs may be reduced in any given month without notice.
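The doubling rule stated on this slide can be written as a one-line formula: a case's likelihood of fraud relative to a reference score is 2 raised to the score difference divided by 10. The sketch below only restates that rule and the base-plus-components sum; the function names and the component scores are hypothetical.

```python
BASE_RISK_SCORE = 50  # stated on the slide as always 50


def overall_risk_score(component_scores):
    """Base Risk Score plus the sum of the component scores."""
    return BASE_RISK_SCORE + sum(component_scores)


def relative_likelihood(score, reference_score=BASE_RISK_SCORE):
    """How many times more likely fraud is at score than at reference_score,
    given that the probability doubles with every 10-point increase."""
    return 2 ** ((score - reference_score) / 10)


# Hypothetical component scores.
score = overall_risk_score([15, 10, -5])
ratio = relative_likelihood(score)
```

With these values the case scores 70 and is four times as likely to be fraudulent as a case at the base score of 50, matching the example on the slide.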
17 Social Network Analysis
The Social Network Analysis tab provides a graphical view centered on the Participant case record, showing all connections within the database to other Participants, Providers, Employment activities, and Phone Numbers. It gives WFP&I Investigators a visual look at all the connections and corresponding cases related to the case record under investigation.
The Social Network Analysis functionality allows investigators and reviewers to explore the relationships among investigated consumers, their associated employers, their associated child care providers, and other consumers receiving Stage 1 Child Care.
* The network tool can zoom in and out to view the diagram in closer detail.
* The tool can group nodes when the User has identified a possible fraudulent connection between the parties.
* The network provides a timeline graph to review the Participant’s historical status in relation to all connected activities. The history of a relationship may be explored using the time slider.
The relationships are displayed as a social network analysis diagram, centered upon the reviewed or alerted consumer. Investigators and reviewers may expand the diagram to show relationships to other consumers and their providers and employers. Links among these entities are established by direct relationships between the entities, shared addresses, phone numbers, or other personal identifying information. Known fraudulent entities are indicated by special colorings.
The location of all entities may be plotted on a map, which also varies over time.
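Linking entities through shared addresses or phone numbers, and then expanding outward from the reviewed participant, can be sketched with plain dictionaries and a breadth-first search. This is an illustrative reconstruction under assumed data shapes; the entity names and identifiers are invented, and the presentation does not describe how the SAS tool builds its network.

```python
from collections import defaultdict
from itertools import combinations


def build_links(entities):
    """Link every pair of entities that share an identifier.

    entities maps an entity id to a set of identifiers such as
    addresses or phone numbers.
    """
    by_identifier = defaultdict(set)
    for entity_id, identifiers in entities.items():
        for identifier in identifiers:
            by_identifier[identifier].add(entity_id)
    links = defaultdict(set)
    for members in by_identifier.values():
        for a, b in combinations(sorted(members), 2):
            links[a].add(b)
            links[b].add(a)
    return links


def network_around(links, center, max_hops=1):
    """Entities reachable from center within max_hops links (breadth-first)."""
    seen = {center}
    frontier = [center]
    for _ in range(max_hops):
        next_frontier = []
        for node in frontier:
            for neighbor in links.get(node, ()):
                if neighbor not in seen:
                    seen.add(neighbor)
                    next_frontier.append(neighbor)
        frontier = next_frontier
    return seen


# Hypothetical entities and identifiers.
entities = {
    "participant_1": {"phone:555-0100", "addr:1 Main St"},
    "provider_1": {"phone:555-0100"},
    "employer_1": {"addr:1 Main St", "addr:9 Oak Ave"},
    "participant_2": {"addr:9 Oak Ave"},
}
links = build_links(entities)
```

Raising `max_hops` corresponds to expanding the diagram to show relationships to other consumers and their providers and employers.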
18 Participant Detail Summary Report in PDF Format
The PDF Summary Report lets Users review all the case information on the selected Participant record and create a PDF Summary Report as a hard copy for the case file for prosecution filing.
19 Holistic Approach to Program Integrity
From May 2011 through July 2013, the following actions were initiated:
* 28 cases have been referred to the District Attorney for felony prosecution
* 405 DMS fraud referrals initiated for investigation
  Triage-Initiated: 311
  WFP&I-Initiated: 94
* 753 referrals to DPSS case workers for follow-up action, resulting in:
  Fraud referrals for reasons other than child care fraud
  Denial/Termination/Reduction of various public assistance benefits
  Overpayments
Cost Avoidance
The Department expects DMS to result in tens of millions of dollars in cost savings/avoidance and efficiencies over the life of the project through overpayment collections, court-ordered restitution, earlier fraud detection and discontinuance of associated benefits, as well as all-around improvements in the fraud investigative processes for the CalWORKs Stage 1 Child Care Program.
The use of the data mining technology for fraud detection and prevention in the CalWORKs Stage 1 Child Care Program, In-Home Supportive Services (IHSS), and other County public assistance programs is expected to result in new fraud referrals, early detection of fraud, and increased efficiency, all leading to cost avoidance.
The Data Mining Pilot achieved an 85 percent (85%) accuracy rate in detecting collusive fraud rings. The results of the Pilot show that the use of data mining software as a fraud detection tool would have enabled cost avoidance in three areas:
(1) New fraud referrals, resulting in an annual gross cost avoidance of at least $2.2 million;
(2) Early detection of fraud, resulting in an annual gross cost avoidance of $1.6 million;
(3) Increased efficiency, resulting in an annual gross cost avoidance of $3 million.
The total annual gross cost avoidance in these areas would, therefore, have been at least $6.8 million.
Furthermore, the results indicated that the cost avoidance could increase with additional data sources and with predictive fraud detection models not included in the Pilot. The potential exists for additional cost avoidance as the use of DMS technology is expanded to other public assistance programs.