Presentation on theme: "Reading for Fun – Official History"— Presentation transcript:
1 Reading for Fun – Official History
VistA – U.S. Department of Veterans Affairs national-scale HIS
Steven H. Brown, Michael J. Lincoln, Peter J. Groen, Robert M. Kolodner
International Journal of Medical Informatics 69 (2003) 135–156
VistA Document Library (VDL): www4.va.gov/vdl
2 The VA Data Lifecycle - Internals, Data Flows, and Business Intelligence (With Pharmacy Additions)
Richard Pham, PharmD
Enterprise Architect
OI&T Corporate Data Warehouse – Architecture
3 The Health Care System Is MUCH MORE COMPLICATED Than Other Business Processes (~7% of VistA)
Steve Anderson and Dan Hardan conversation about how ridiculously complicated BI would be in ANY medical setting.
6 DHCP/VistA/CPRS
VistA – Veterans Health Information Systems and Technology Architecture – Refers both to the architecture and the database which the architecture supports
DHCP – Decentralized Hospital Computer Program – The DOS (Unix-like) system where many of VistA’s non-clinical entries take place
CPRS – Computerized Patient Record System – A user-friendly GUI providing access to clinical order entry functions
7 Objectives
The main objective is to understand the data lifecycle of VA’s VistA/CPRS and the user experience of VistA/CPRS
A high-level overview of VistA Internals
Learn about data structures and outputs in VistA
Learn where data enters and travels throughout the VA
Try to make sense of data resources within the VA and how they are accessed
10 Core Patient Care Functionality
VistA is first and foremost an Electronic Medical Record. The architecture design supports veteran health care.
11 Core Patient Care Functionality
VistA Internals
DHCP
CPRS
12 VistA Internals 101
MUMPS Server and Operating System Kernel
“Three Wise Men (Managers)”
  TaskMan
  MailMan
  FileMan
Modules
13 Massachusetts General Hospital Utility Multi-Programming System (MUMPS or M)
My definition in English: M is a programming language designed for hierarchical databases that is convenient for medical applications or anything else where speed and data storage upkeep are a problem and programmer intelligence/organization is not
My technical definition: M is a Turing-complete, low and high-level, imperative, machine-compiled (no longer interpreted) programming language utilizing a hierarchical global array file structure
Used commonly in healthcare and financial industry settings
Turing-complete – GOTO, IF/THEN, WHILE, Peano Arithmetic
Low-level – DSM exposes VAX/VMS architecture
High-level – Programming in the kernel environment is hardware and OS-agnostic
Machine-compiled – All modern implementations are compiled, not interpreted (see compiler choices in the Kernel documentation)
14 Structure of The Veterans Administration Data Efforts (Late 1970s)
VHA Ancestor: Department of Medicine and Surgery (DMAS)
OI&T Ancestor: Office of Data Management & Telecommunications (ODM&T)
VHA-OI Ancestor: Computer Assisted System Staff (CASS)
15 Comparing The Two Offices
CASS | ODM&T
Decentralized design philosophy | Centralized design philosophy
Rapid, agile development | Bureaucratic, process-focused development
SME-involved development | Development without SME’s
16 Highlights of ODM&T Development
Took 6 years to deploy APPLES Pharmacy at 10 sites
A 1980 paper detailing ODM&T’s transactional patient treatment file (PTF) system promised an interactive national solution by 1990.
Navigating the mandated 17 steps between system specification and deployment alone is said to have required at least 3 years.
17 Beginnings of DHCP
There were subject matter experts that believed that they could put out useful applications faster than the ODM&T sloth
Development of the testing and principles was done unofficially throughout the late 1970s
The Second
18 Original DHCP Design Principles
A commitment to rapid prototype development
All use ANSI MUMPS
Modular Design
Actively Maintained Data Dictionary
Code Sharing/Portability
Involve the SME’s
19 DHCP Kernel
Functions as both an operating system for VistA applications and an M virtual machine
Kernel shields DHCP modules from needing to know hardware and OS configurations on the server
Isolates M to the ANSI standard (1995)
Provides a toolbox of standard functions for most programmers
20 VistA to Relational Database Terminology
VistA (Example) | Relational Database (Example)
Namespace (VHAFRE) | Database (VA Fresno)
“Package” – Not hardcoded | Schema (RxOutpatient)
File (50.68 – VA PRODUCT) | Table (NationalDrug)
Field (.01 – NAME) | Column (DrugNameWithDose)
Domain (cardinal/decimal, setofcodes, freetext/wordprocessing) | Field Type (numeric, boolean, varchar)
Internal Entry Number (IEN or .001) | ~Key (9722)
Record | Tuple/Row (ISOSORBIDE MONONITRATE 120MG TAB,SA)
21 MUMPS Classic Database
One Data Type: String (Text)
Other types: Cardinal Numbers, Float Numbers, $H Dates
One Data Storage Type: Multidimensional Array aka Globals
Dynamic (duck) typing
The logical database of a GT.M process consists of one or more global variable name spaces, each consisting of an unlimited number of global variables. For each global variable name space, a global directory maps global variables to the database files where they actually reside. An unlimited number of global variables can fit within one database file; a global variable must fit in one database file.
A database file consists of up to 224M (234,881,024) database blocks. A database block is a multiple of 512 bytes, with a maximum size of 65,024 bytes. Commonly used block sizes are 4KB, 8KB and 16KB, so with an 8KB block size an individual global variable can grow to 1,792GB. A global variable node (global variable, subscripts plus value) must fit in one database block and each block has a 16 byte overhead. So, the largest node that will fit in a database with a 4KB block size is 4,080 bytes. A key (global variable plus subscripts) can be up to 255 bytes.
The database engine is daemonless and processes accessing the database operate with normal user and group ids: a process has access to a database file if and only if the ownership and permissions of that database file (plus any layered access control such as SELinux) permit access. Each process has within its address space all the logic needed to manage the database, and processes cooperate with one another to manage database files. When a database file is journaled, updates are written to journal files before being written to database files, and in the event of a system crash, database files can be recovered from journal files.
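The size limits quoted on this slide can be checked with a few lines of arithmetic. The sketch below assumes that "M", "KB" and "GB" are binary units (2^20, 2^10 and 2^30); the function names are made up for this illustration and are not part of GT.M.

```python
# Worked arithmetic for the GT.M limits quoted on the slide.
# Assumption: "M", "KB" and "GB" are binary units (2**20, 2**10, 2**30).

MAX_BLOCKS = 224 * 2**20      # "224M" database blocks per database file
BLOCK_OVERHEAD = 16           # bytes of overhead in every block
MAX_BLOCK_SIZE = 65_024       # largest permitted block size, in bytes


def max_file_size_gb(block_size_bytes: int) -> float:
    """Largest database file, in GB, for a given block size."""
    return MAX_BLOCKS * block_size_bytes / 2**30


def max_node_size(block_size_bytes: int) -> int:
    """Largest node (global, subscripts plus value) that fits in one block."""
    return block_size_bytes - BLOCK_OVERHEAD


if __name__ == "__main__":
    print(MAX_BLOCKS)                   # number of blocks behind "224M"
    print(max_file_size_gb(8 * 1024))   # ceiling with 8KB blocks, in GB
    print(max_node_size(4 * 1024))      # largest node with 4KB blocks
    print(MAX_BLOCK_SIZE // 512)        # block size as a count of 512-byte units
```

Under the binary-unit assumption, 224 × 2^20 is 234,881,024 blocks, an 8KB block size gives the 1,792GB ceiling, and a 4KB block leaves 4,080 bytes for a node.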
22 VistA Data Organization
Namespace | File | Field | Record
654 (VAMC Reno) | File (GMR Vitals) | Field .01 (DATE/TIME VITALS TAKEN) | IEN-1, BP, 140/90
Most Files have an entry at the Field called “IEN” or “Internal Entry Number” as an identity key to mark the record as unique
The equivalent of Australian fishing
23 Upside of Using Globals
Faster – No joins
Faster – All parameter pointers built in
Faster – Direct and planned programmatic access to database (Look at SQL execution plans)
Less Data Storage Overhead and faster paging – If the data point does not exist in the array, there does not need to be a fixed point like in relational
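A toy model can make the "no joins" and "no fixed point" claims concrete. The sketch below is Python, not M, and does not reflect how any M engine stores data: the `ToyGlobal` class, its methods, and the sample subscripts are all hypothetical. It shows only two ideas: a node is reached directly by its full subscript path, and a node that was never set takes up no space.

```python
# A toy stand-in for an M global: a sparse, hierarchical array whose nodes
# are addressed by a tuple of subscripts. The class and the subscript layout
# are hypothetical and exist only for illustration.
from typing import Dict, Iterator, Optional, Tuple

Subscripts = Tuple[object, ...]


class ToyGlobal:
    def __init__(self, name: str) -> None:
        self.name = name
        self._nodes: Dict[Subscripts, str] = {}

    def set(self, *subscripts: object, value: object) -> None:
        # Every value is stored as a string, mirroring M's single data type.
        self._nodes[subscripts] = str(value)

    def get(self, *subscripts: object) -> Optional[str]:
        # Direct lookup by full key: no join and no scan.
        return self._nodes.get(subscripts)

    def children(self, *prefix: object) -> Iterator[Tuple[Subscripts, str]]:
        # Walk the nodes under a subscript prefix in plain string order
        # (real M collation differs; this is a simplification).
        for key in sorted(self._nodes, key=lambda k: [str(p) for p in k]):
            if key[: len(prefix)] == prefix:
                yield key, self._nodes[key]

    def node_count(self) -> int:
        return len(self._nodes)


# Sample data: record 1 carries two readings, record 2 carries one.
vitals = ToyGlobal("VITALS")
vitals.set(1, "BP", value="140/90")
vitals.set(1, "PULSE", value=72)      # made-up reading
vitals.set(2, "BP", value="120/80")   # made-up reading

if __name__ == "__main__":
    print(vitals.get(1, "BP"))
    print(vitals.get(2, "PULSE"))     # never set, so nothing is stored
    print(vitals.node_count())
```

Record 2 has no pulse node at all, whereas a relational table would typically carry a NULL in that column for the row. The flip side, covered on the next slide, is that nothing stops a caller from storing a malformed value.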
24 Downside of Using Globals
No Intrinsic Structure and No Enforcement* – M believes whatever you put into the globals (most M programmers view this as an advantage while relational programmers have an MI)
ACID-compliance not mandated
(Il)logical data structures guaranteed – There are many interesting* ways that the M programmers modeled the data that do not make sense to later viewers (PCMM, Lab, Pharmacy, anyone?)
“Like a puppy, M accepts your love and hate unconditionally and gives back unconditionally too…” – VistA programmer
3NF gives constraint to what you should build in a relational form.
If you did something wrong, well, M accepts it anyway.
25 MUMPS Quirks
Whitespace (Space) matters
Requires knowledge of kernel and sometimes lower-level concepts
Programming Without Type or Structure Enforcement
VA programming standards and conventions
(Assembly is a better prerequisite than other languages)
(The VAX/VMS version of M will let you fool around with hardware calls, bypassing the OS)
(2 spaces after Q…)
27 The Three Wise Men (Managers)
TaskMan – The man(ager) that schedules tasks to the kernel
MailMan – The man(ager) that handles messages between the user, TaskMan, and any other two-way communication between packages
FileMan – The man(ager) that controls internal file (data structure) interactions
28 TaskMan
TaskMan handles application processing:
  Creation of application processing tasks
  Scheduling these tasks
  Monitoring health/statistics of these tasks
If kernel is the brain, then TaskMan is the body of the operation
If programming, NEVER EVER use the TaskMan global. This subverts TaskMan’s scheduling queue, and can cause a system memory leak. Use the calls instead…
29 MailMan
VistA needs a way to pass and receive data from the database to other areas
MailMan fulfills this function in the pre-TCP/IP days
“Electronic mail” doesn’t mean just
Practically any message between the database and anyone else (the end-user, another site, or application, etc.) can be moved this way
Gives programmers methods to both receive and return data to the database
30 FileMan
A higher-level method to access the VistA database without exposing a programmer interface
Mostly menu-driven
One can use limited programming
Serves as the model for all other modules that interact with the VistA database
31 ODM&T Initial Action Plan To DHCP Development (1980)
Ordered that development stop
Fired the developers
Removed the hardware
Cut the DMAS budget so it would never happen again…
33 Development Goes Underground
Developers that survived the ODM&T purge continued their work as a black project in DMAS
During 1980 and 1981, the survivors (Underground Railroad) continued work on developing modules for system integration
34 Modules
Modules are programmed to interact with the VistA database
Most use FileMan as a model for programming
35 Some of the Many Modules
Medicine, Surgery, Dentistry, Nursing, Pharmacy, Laboratory, Care Management, Patient Care Encounters, ADT, Mental Health, EDIS, Oncology, Nutrition and Food Service, Imaging/PACS, Prosthetics
Not really in the scope of this presentation to cover each module.
Try the VistA Documentation Library:
Or VHA eHealth University (VeHU):
36 Acceptance and DHCP 1.0
Once there was a critical mass of packages that were shown to be useful, the tide turned and the project was blessed…
Initial testing done in
1.0 installation was in 1985
Most of the underlying packages can still be recognized by the original programmers
And if you’re awesome like Michael Distaso, then you can even get a class of API’s named after you.
37 Special Topic – The Pharmacy Package (File 50 Series)
Where are the files?
File 50 Series is the main line, though there are other places
How many files are there?
Look at screen capture on next slide for File 50 series.
How many columns?
38 How Many File 50 Tables Are There? (1096 – 659 = 437 Tables!!)
39 How Many File 50 Series Columns Are There? (8526 – 5352 = 3174!!)
40 Further Information On The Background
For the VA Base M Training
For the VA Programming Standards and Conventions
For the VA Document Library
41 Computerized Patient Record System (CPRS)
A Real-Time Order Checking System that alerts clinicians during the ordering session that a possible problem could exist if the order is processed
A Notification System that immediately alerts clinicians about clinically significant events
A Patient Posting System, displayed on every CPRS screen, that alerts clinicians to issues related specifically to the patient, including crisis notes, warnings, adverse reactions, and advance directives
The Clinical Reminder System, which allows caregivers to track and improve preventive health care for patients and ensure timely clinical interventions are initiated
Remote Data View functionality that allows clinicians to view a patient’s medical history from other VA facilities to ensure the clinician has access to all clinically relevant data available at VA facilities
CPRS DOES NOT STORE DATA!!!
42 CPRS Internals
Written in Embarcadero Delphi (NOT in MUMPS)
Connects from the Graphical User Interface to the VistA database using a Remote Procedure Call (RPC) Broker
This Remote Procedure Call Broker translates instruction sets from other languages into M
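The broker idea can be sketched generically: the GUI names a procedure and passes string parameters, and a server-side registry maps that name to a routine that touches the data. This is a hypothetical Python illustration of the pattern only. It does not show the actual RPC Broker wire protocol or its API, and the procedure name `TOY GET VITALS` is invented.

```python
# Hypothetical sketch of the broker pattern: a client sends a named call with
# string parameters, and a server-side table maps that name to a routine.
# "TOY GET VITALS" is an invented name, not a real VistA remote procedure.
from typing import Callable, Dict, List

Handler = Callable[[List[str]], List[str]]


class ToyBroker:
    def __init__(self) -> None:
        self._registry: Dict[str, Handler] = {}

    def register(self, rpc_name: str, handler: Handler) -> None:
        self._registry[rpc_name] = handler

    def call(self, rpc_name: str, params: List[str]) -> List[str]:
        # The client never touches storage; it only names a procedure.
        if rpc_name not in self._registry:
            raise KeyError(f"unregistered remote procedure: {rpc_name}")
        return self._registry[rpc_name](params)


# Server side: the data lives here, not in the GUI client.
_STORE = {("1", "BP"): "140/90"}


def get_vital(params: List[str]) -> List[str]:
    ien, vital = params
    value = _STORE.get((ien, vital))
    return [value] if value is not None else []


broker = ToyBroker()
broker.register("TOY GET VITALS", get_vital)

if __name__ == "__main__":
    print(broker.call("TOY GET VITALS", ["1", "BP"]))
```

Keeping all storage behind the registry is one way to read the previous slide's point that CPRS does not store data: the client holds only what the last call returned.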
43 Present State of VistA
Large MUMPS database
  50+ Main Clinical Packages
  10,000+ Tables
  Each medical center runs somewhere between 2-4 TB worth of data over 30 years (mostly imaging)
Many processes
  300+ MB of running executable at any given time
  Over 20,000 subroutines (VDL)
Many simultaneous users
47 Systems to Support Planning
Decision Support System (DSS)
  Supports accounting and costing for the OIG, GAO, CBO, and other auditing agencies
Allocation Resource Center
  Supports personnel and resource allocation at the medical center level
  Workload capture, resource allocation
  Basis for the VERA (VA’s Fund Control Point) Model
48 VistA DSS Data Feeds
ADM  Admissions
CLI  Clinic Visits
DEN  Dental
ECS  Event Capture System
IVP  Pharmacy: IV
LAB  Lab
LAR  Lab Results
MOV  Patient Movement – Inpatient
MTL  Mental Health Test
NUR  Nursing
PAS  Patient Assessment
PRE  Pharmacy OP
PRO  Prosthetics
RAD  Radiology/Nuclear Med
SUR  Surgery
TRT  Patient Treatment Specialty
UDP  Pharmacy Unit Dose IP
Deborah Richardson’s Slides
From VistA
SCHEDULE is primary purpose of slide
DSS processing month is 2 months behind -- give example
Handout will be provided with schedule of all extracts referenced in this briefing.
Can also be downloaded from the NDS website
May be transmitted by site at any time of the month, ideally around the 25th of the month prior to the processing month.
49 Systems to Support Research
National Patient Care Database
  An integrated set of data that captures a patient’s care encounter with the VA
Corporate Data Warehouse
  A near real-time accumulation of much of the same data
  The result of the Health Data Repository process
50 NPCD Data Flow Diagram
[Diagram labels: VistA; MailMan message; Austin MailMan server; DMI; NPCD; z900; data stream; acknowledgement message; data received in DMI 24x7; HL7 data to Oracle DB; data extracted by application; data extracted and backed up nightly M-F]
NPCD data is sent from the facilities to the AAC via MailMan messaging
Once a message reaches the AAC MailMan server, it automatically moves to the Data Management Interface System (DMI)
NPCD and other applications retrieve their respective data from DMI for use
Acknowledgement messages are sent to facilities
Mention process similarities with DSS and other systems
Process originates with a VistA MailMan message
Austin MailMan server receives messages
Message is sent to DMI
Data goes into predefined tables or queues within DMI
Applications must retrieve the data -- DMI does not push data out to the apps.
HL7 data are sent to the NPCD Oracle Database
Data is backed up in DMI and within the application
Acknowledgement Reports (or OPC) sent daily using reverse process
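The pull model in these notes (DMI holds data in queues, applications fetch it, acknowledgements travel back toward the sender) can be sketched as below. This is a hypothetical Python illustration, not DMI's implementation: the class name, the method names, and the choice to record an acknowledgement at receipt time are assumptions made for the example.

```python
# Hypothetical sketch of a pull-based staging area with acknowledgements.
# Nothing here reflects DMI's real interfaces; all names are illustrative.
from collections import defaultdict, deque
from typing import Deque, Dict, List, Optional


class ToyDMI:
    """Holds incoming messages in per-application queues until pulled."""

    def __init__(self) -> None:
        self._queues: Dict[str, Deque[str]] = defaultdict(deque)
        self.acknowledgements: List[str] = []

    def receive(self, facility: str, application: str, payload: str) -> None:
        # Messages are queued on arrival; nothing is pushed to applications.
        self._queues[application].append(payload)
        # Assumption: one acknowledgement is recorded per received message.
        self.acknowledgements.append(f"ACK to {facility} for {application}")

    def retrieve(self, application: str) -> Optional[str]:
        # Applications pull their own data, oldest message first.
        queue = self._queues[application]
        return queue.popleft() if queue else None


if __name__ == "__main__":
    dmi = ToyDMI()
    dmi.receive("Station 654", "NPCD", "encounter-1")
    print(dmi.retrieve("NPCD"))
    print(dmi.retrieve("NPCD"))   # queue is now empty
    print(dmi.acknowledgements)
```

Because retrieval is initiated by the consumer, a slow or offline application does not block the sender; its messages simply wait in the queue.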
51 NPCD Processing
[Diagram labels: Daily Data Loading; SAS; UNIX; z900 (MAINFRAME); WINDOWS; Oracle on Unix; NPCD; Master Extract File (MEF); VSSC/KLF Menu; DSS data extracted]
National Patient Care Database Processing begins with daily data loads
Flat files are indexed and loaded into the Oracle database daily
The Master Extract is a weekly process performed to get data out of the database and prepare it to go into the SAS data sets
SAS Data Sets are extracted approximately every two weeks from the NPCD database
Once data is in the SAS data sets it is available for analysis
SAS Data Sets are stored on the AAC mainframe by fiscal year
Are available for use by the KLF Menu Reporting Tool
Process takes 1 to 2 weeks depending on when data arrives at the AAC
All NPCD data processes are HUGE
As year progresses it takes longer to go through process
Another layer of complexity is added by data being routed through multiple environments including Unix, Mainframe, and Windows
Data is checked for duplicates bi-monthly
Data is extracted and filtered for reporting twice a month
52 Resources for Further Information
VA Information Resource Center (ViREC)
National Patient Care Database – (Internal)
National Data Systems (NDS) (Internal)
53 Secrets of the VA Data Universe
This was an extremely brief introduction to a complicated area
I have another presentation on the availability of databases in the VA and how to access them for operations and/or research
55 Regional Remote Data Processing Center Shadow Systems
An offsite backup process to ensure continuity of operations for VistA Patient Care
56 Regional Data Processing Centers (RDPCs)
Read-only backup VistA systems are set up to take journaling files
When a record is written or altered in a local medical center’s VistA, a journal file with that entry is prepared and sent to a Regional Data Processing Center
This maintains an active backup in case the local medical center’s VistA goes down
  VistA goes down
  OpenVMS goes down
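Journal shipping to a read-only shadow can be sketched as follows. This is a hypothetical Python illustration of the general technique, not the RDPC implementation: the class names, the sequence-number scheme, and the in-memory journal are assumptions. The point is that the shadow rebuilds the primary's state purely by replaying journal entries in order.

```python
# Hypothetical sketch of journal-based shadowing. The primary journals every
# write; the shadow replays shipped entries in sequence order.
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

Key = Tuple[str, ...]


@dataclass(frozen=True)
class JournalEntry:
    sequence: int   # order in which the primary applied the update
    key: Key        # global name plus subscripts
    value: str


@dataclass
class Primary:
    data: Dict[Key, str] = field(default_factory=dict)
    journal: List[JournalEntry] = field(default_factory=list)

    def write(self, key: Key, value: str) -> None:
        # Journal first, then apply the update to the live data.
        self.journal.append(JournalEntry(len(self.journal) + 1, key, value))
        self.data[key] = value


@dataclass
class Shadow:
    data: Dict[Key, str] = field(default_factory=dict)
    applied: int = 0   # sequence number of the last entry replayed

    def replay(self, entries: List[JournalEntry]) -> None:
        for entry in sorted(entries, key=lambda e: e.sequence):
            if entry.sequence > self.applied:   # skip entries already seen
                self.data[entry.key] = entry.value
                self.applied = entry.sequence


if __name__ == "__main__":
    primary, shadow = Primary(), Shadow()
    primary.write(("VITALS", "1", "BP"), "140/90")
    primary.write(("VITALS", "1", "BP"), "130/85")
    shadow.replay(primary.journal)
    print(shadow.data == primary.data)
```

Tracking the last applied sequence number makes replay idempotent, so re-sending a journal file after a network hiccup does not corrupt the shadow.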
57 Regions and RDPCs
Region I RDPC – Sacramento (SAC) and Denver (DEN)
Region IV RDPC – Philadelphia (PHI) and Brooklyn
VISNs are numbered from the Northeast to the Southwest
Regions are numbered from the West to the East
59 Northern Cal Without VistA
The medical staff was forced to write discharge instructions and notes on paper.
The electronic lists of instructions and of medications were not available for the patients being discharged.
Patients being discharged could not be given follow-up appointments at the time of discharge.
The appointments had to be made later and the patient notified by phone.
There were delays in obtaining discharge medications and patients remained on the wards longer than would normally be required.
The nurses administered medications to the patients and used the paper Medication Administration Record, or MAR, to record the administration events.
Initial medication passes were interrupted and delayed until the paper copies of the MAR could be printed.
63 Business Intelligence in the VA – Making the Data Work For Us
VistA has a wealth of clinical and administrative data available
In the past, giving a value-added, timely VistA dataset was hard
  Querying the active system with minimal impact
  Needed an interface between M and analyst languages (SAS, SQL, etc.)
  Easy-to-read reports were hard to build
65 Corporate/Regional Data Warehouse
Takes a copy of the journal file that goes into the backup shadow system
Translated from the M array to a relational database format using InterSystems Caché’s class mapping program
Staged in a Feeder-Collector system for collection
Indexed and value-added columns produced and loaded to a VISN RDW Server
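The array-to-relational step can be pictured as a pivot from sparse (IEN, field) nodes to one row per IEN. The sketch below is a hypothetical Python stand-in for that idea and is not InterSystems' class mapping. The field-to-column pair `.01` to `DrugNameWithDose` and IEN 9722 come from the terminology slide earlier in the deck; field `2` and `HypotheticalColumn` are invented to show how a missing node becomes a NULL.

```python
# Hypothetical sketch: pivot sparse (IEN, field number) nodes into rows.
# This is not InterSystems' class mapping; it only illustrates the idea.
from typing import Dict, List, Optional, Tuple

Nodes = Dict[Tuple[int, str], str]
Row = Dict[str, Optional[object]]


def to_rows(nodes: Nodes, field_to_column: Dict[str, str]) -> List[Row]:
    """Return one row per IEN, with None where a mapped field has no node."""
    rows: Dict[int, Row] = {}
    for (ien, field_number), value in nodes.items():
        column = field_to_column.get(field_number)
        if column is None:
            continue   # unmapped fields are dropped in this sketch
        if ien not in rows:
            # Start every row with all mapped columns set to None (NULL).
            rows[ien] = {"IEN": ien, **{c: None for c in field_to_column.values()}}
        rows[ien][column] = value
    return [rows[ien] for ien in sorted(rows)]


if __name__ == "__main__":
    nodes: Nodes = {(9722, ".01"): "ISOSORBIDE MONONITRATE 120MG TAB,SA"}
    mapping = {".01": "DrugNameWithDose", "2": "HypotheticalColumn"}
    for row in to_rows(nodes, mapping):
        print(row)
```

The relational side has to materialize a NULL for every field the array never stored, which is the storage trade-off the "Upside of Using Globals" slide points at.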
66 VHA Business Owners/SME’s, Corporate Data Warehouse, CDW Governance
Communicates Organizational Priorities
VHA Business Owners/SME’s: 10N, OIA, VBA
CDW Governance Board
Sets and monitors domain, work priorities, and timelines for completion.
Organizes SMEs and Data Stewards
VHA-OI Data Quality
OI&T Corporate Data Warehouse
Provides Documentation and Clarification of Business Logic
67 CDW Governance Is In VHA’s Hands
Ordered By VHA
  Domain and Work Prioritization By CDW Governance Board
  Chair – KLF (OIA)
  Vice-Chair – Larry Mole (Public Health SHG)
Monitored and Accountable To VHA
  Project management provided by John Quinn (National Data Systems) and KLF (OIA)
Supported By VHA
  OI Data Quality
  Business Owners
  PBM’s Data Steward is Rob Silverman
68 “As the number of eyes goes up, the number of bugs goes down.”
Writing documentation about the business logic of the files and fields
Answering end user questions about the data
Data validation
Preferably before
Inpatient Pharmacy
ADR/Allergy Package
69 1st category models are simple – V Health Factor Source Mapping
FMFile, FMField, ResolveFld, DWTableName, DWFieldName
V HEALTH FACTORS, HEALTH FACTOR, HealthFactor, HealthFactorTypeIEN, 0.01, HealthFactorType, PATIENT NAME, PatientIEN, EVENT DATE AND TIME, EventDateTime, VISIT, VisitVistaDate, VisitDateTime, LEVEL/SEVERITY, LevelSeverity, VisitIEN, ENCOUNTER PROVIDER, EncounterStaffIEN, COMMENTS, Comments
71 3rd category models not usable without transformation - PCMM
72 Levels of Data
National – Corporate Data Warehouse (CDW)
Region – Regional Data Warehouse (RDW)
VISN – VISN Data Warehouse (VDW)
Medical Center – Local Data
73 Entities Who Produce Business Intelligence Products
National – VSSC, PSSG, DMDC, HEC, ARC, DSS, BIPL, OQP, PCS, PBM
Region – Regional BISL Teams
VISN – VISN Data Warehouse, VISN PBM
Local – DSS
Bolded are ones that have substantial resources in clinical business intelligence
PSSG handles much of the GIS and Statistical Demography for the VA
PSSG – Planning Systems Support Group
VSSC – VHA Support Service Center
DSS – Decision Support System
BIPL – OI&T Business Intelligence Product Line
OQP – Office of Quality and Performance (10Q)
PCS – Patient Care Services (10A)
PBM – Pharmacy Benefits Management Services
74 Data Access
VISN and Station Level – Contact Your VISN Database Manager
Regional/Corporate Access – Contact NDS for the 9957 Permissions
75 Operational Challenges of VistA
System Resources
  $8 Billion investment over 20 years
  New needs for new domains
  MUMPS Programmers must be internally trained (and many of them are retiring or dying)
Communication with Other Systems
  HIMISS compliance with data interchange
  E-functions (billing, prescribing, verification)
  Interagency Cooperation – DoD and NHIN
Business Intelligence
  Closing the data lifecycle and bringing back clinical data for knowledge discovery
76 Why Is Pharmacy ALWAYS Picked As An IT Test Case Project?
Pharmacy is data savvy
Local data quality control from ADPACs
Federated data quality control from VISN PBM, CMOP, PBM SHG
Pharmacy is one of the few domains that have active business logic SME’s
Pharmacy did not contract institutional memory out the door
Pharmacy gets things done
Pharmacy has more technology success stories as a collective than any other PCS office (BCMA, Automation, Central Fills)
Pharmacy actively mines data
77 Acknowledgments
Kernel – Jack Schram (Oakland OIFO)
SQLI – Ellen Zufall (SF IRMS)
RPC Broker and MUMPS coding – Perry Richmond (VISN 18 BI)
Regional Data Process – Vincent Bui and Ken Koenig (Region I SQL Back Office Team)
78 Acknowledgments
OI&T Business Intelligence Product Line (BISL)
Jack Bates – Manager, OI&T BIPL
Stephen Anderson – Lead Data Architect
Mike Baker – Lead ETL Architect
Denver Griffith/Ken Fuchsel – Server Administrators
Dave Fackler
Ron Talmage
Dan Hardan, Jeff King, Jeff Price