1. GE Proficy Historian Data Compression Introduction. Stephen Friedenthal, EVSystems
2. What is data compression? There are two fundamental classes of file compression. (1) Identify repeating elements (e.g., ZIP file compression). Pros: no loss of information; all original data is restored. Cons: CPU intensive; data must be compressed and decompressed, and large files take a lot of time. (2) Identify redundant data that can be discarded (e.g., JPEG, dead band, rate of change). Pros: fast, reduces network traffic, well suited for streaming data. Cons: some data loss. This second method is the one used by the GE Historian.
3. Customer quotes when I ask them about compression: “Disk space is cheap.” “We don’t want to lose any data, so we store everything.” “Today’s computers are so fast there’s no penalty for storing everything.” “We’re a regulated industry… We aren’t allowed to use compression.” From all of the above, you might come to believe that data compression is an antiquated response to a problem that no longer exists: computers are fast, storage is cheap, so store everything.
4. Why compression is (still) important. “Needle in the haystack” problem: it is much more difficult to find the truly interesting data. Limited network bandwidth: storing terabytes of data is only useful if you can easily extract it. High long-term costs: disk drives are “cheap”, but managing the data gets expensive. Superior performance: storing the minimum necessary data greatly increases system performance and speed for clients and servers.
5. GE Historian Compression Methods. The Proficy Historian has two forms of data compression. Collector compression (CC): also called “dead band” compression. It works by examining data and discarding any sample whose change does not exceed a defined limit (e.g., +/- 0.5 deg F). Archive compression (AC): also called “rate of change” or “swinging door” compression. It works by examining data (after CC) and discarding any sample that falls within a slope range (more on this later).
6. Collector Compression. [Figure: collector compression overview. Labels: Dead band, Discarded samples, Stored sample, Constant slope line.] Pros: good at filtering out noise; reduces data storage by 80 to ~90+%; easy to understand. Cons: unable to reduce data when the slope (vs. the value) is unchanged (see the constant slope line in the figure).
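The dead band idea can be sketched in a few lines of Python. This is a minimal illustration, not the Proficy collector's implementation: the function name `deadband_filter` is hypothetical, and it assumes each new sample is compared against the last stored value and kept only when the difference exceeds the dead band.

```python
def deadband_filter(samples, deadband):
    """Keep a (time, value) sample only when its value differs from the
    last stored value by more than `deadband` (assumed comparison rule)."""
    stored = []
    last_value = None
    for t, v in samples:
        if last_value is None or abs(v - last_value) > deadband:
            stored.append((t, v))
            last_value = v
    return stored


# Noisy signal around 70 deg F with a +/- 0.5 dead band: small wiggles are dropped.
noisy = [(0, 70.0), (1, 70.2), (2, 69.9), (3, 70.8), (4, 70.9), (5, 72.0)]
kept = deadband_filter(noisy, 0.5)

# Constant-slope ramp: every step exceeds the dead band, so nothing is discarded.
ramp = [(t, float(t)) for t in range(5)]
ramp_kept = deadband_filter(ramp, 0.5)
```

The ramp case shows the "Cons" item: a signal that climbs steadily is stored in full, even though a straight line through two end points would describe it.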
7. Archive Compression. Archive compression looks at the data after collector compression. It only stores data that “changes direction” beyond a configured range. In effect, it stores data based on its rate of change. Compare this to collector compression, which stores data based on the amount of change.
8. Archive Compression Effect. [Figure: archive compression overview. Red values are stored; green values are discarded. Callouts: “Large change in slope, so value is stored”; “Discarded by archive compression”.] Pros: can significantly reduce storage for certain signal types and noise; stores only the most relevant values. Cons: more difficult to tune; more difficult to understand.
9. Archive Compression: A deeper dive. How does it compare to OSI’s Swinging Door compression?
10. OSI PI Swinging Door Compression. PI checks to see if all points lie inside the compression blanket, a dead band parallelogram drawn from the end points using CompDev as a tolerance. If any point falls outside the dead band, an archive event is triggered. [Figure callouts: one point is marked as the point that falls outside the dead band; the previous end point is marked as the one that gets archived, because it is the last end point for which all points were inside the dead band.]
11. Archive Compression vs. PI. The OSI PI swinging door algorithm checks if a point is inside the parallelogram: 1) Calculate slope of upper line. 2) Calculate upper y for this x. 3) Calculate slope of lower line. 4) Calculate lower y for this x. 5) Check if point y is < upper y. 6) Check if point y is > lower y. The GE Historian algorithm checks if the line between end points intersects the tolerance bar: 1) Calculate slope of this line. 2) Calculate y for this x. 3) Calculate difference. 4) Check if ABS difference < CompDev.
12. GE Archive Compression vs. PI. Instead of checking if each point is inside the parallelogram (the Swinging Door method), the GE Proficy Historian checks if the line intersects the dead band of each point. [Figure: two panels, “Swinging Door method” and “GE Proficy Historian”, each labeled with New Point and Archived Point.]
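One way to read the "line intersects the dead band of each point" check is the direct, buffered form below: draw the line from the archived point to the new point, evaluate it at each intermediate sample's time, and compare the difference to the tolerance. The function name `line_hits_all_bars`, the tolerance argument `e`, and the inclusive comparison at the boundary are assumptions made for illustration only.

```python
def line_hits_all_bars(archived, new, between, e):
    """Return True if the straight line from `archived` to `new` passes
    within +/- e of every intermediate (time, value) sample in `between`.
    Timestamps are assumed to be strictly increasing."""
    t0, v0 = archived
    t1, v1 = new
    slope = (v1 - v0) / (t1 - t0)           # slope of the line between end points
    for t, v in between:
        y_on_line = v0 + slope * (t - t0)   # y of the line at this x
        if abs(y_on_line - v) > e:          # difference exceeds the tolerance bar
            return False
    return True


ok = line_hits_all_bars((0, 0.0), (4, 4.0), [(1, 1.2), (2, 1.9), (3, 3.1)], 0.5)
missed = line_hits_all_bars((0, 0.0), (4, 4.0), [(2, 3.0)], 0.5)
```

This form needs every intermediate point kept in memory; the slope-window approach in the following slides reaches the same decision without that buffer.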
13. GE Archive Compression Example. As an additional benefit, there is no need to buffer all points between the last archived point and the newest point. Here’s an example of how it works. The key points to understand: an “Archived Point” is one that is stored. A “Held Point” is the last good value that arrived; we don’t know if it will be stored until the next value arrives to tell us if the slope has changed sufficiently. After a point is archived, the next point becomes the held point. [Figure labels: Held Point, Archived Point.]
14. GE Archive Compression Example. Construct error bands around the held point, extending E above and E below it. PI: E = “CompDev”. GE: E = deadband / 2. [Figure labels: E, E, Archived Point, Held Point.]
15. GE Archive Compression Example. Step 1: Calculate the slopes of the two lines, U and L, connecting the archived point with the upper and lower ends of the error bands (dead band) associated with the held point. [Figure labels: U, L, Archived Point, Held Point.]
16. GE Archive Compression Example. The upper and lower slopes define a critical aperture window. [Figure labels: Critical Aperture Window, U, L, Archived Point, Held Point.]
17. GE Archive Compression Example. If the slope of the line N, connecting the archived point with the new point, is between the upper and lower slopes, it intersects the dead band of the held point. [Figure labels: U, N, L, New Point, Archived Point, Held Point.]
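The slope test from the last few slides can be sketched as follows. The helper names (`slope`, `aperture`, `inside_window`) are hypothetical, `e` stands for the error band E, and treating a slope exactly on the boundary as "inside" is an assumption.

```python
def slope(a, b):
    """Slope of the line from point a to point b; points are (time, value)."""
    return (b[1] - a[1]) / (b[0] - a[0])


def aperture(archived, held, e):
    """Slopes U and L from the archived point to the top and bottom of the
    error band around the held point."""
    t, v = held
    return slope(archived, (t, v + e)), slope(archived, (t, v - e))


def inside_window(archived, new, upper, lower):
    """True if line N (archived point to new point) lies between L and U."""
    return lower <= slope(archived, new) <= upper


archived, held = (0, 10.0), (1, 11.0)
upper, lower = aperture(archived, held, 0.5)                 # (1.5, 0.5)
fits = inside_window(archived, (2, 12.0), upper, lower)      # slope 1.0
breaks = inside_window(archived, (2, 14.0), upper, lower)    # slope 2.0
```

With these numbers, a new point at (2, 12.0) lets the held point be dropped, while (2, 14.0) falls outside the window.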
18. GE Archive Compression Example. As new points are added, the previous new point becomes the current held point, and the same process is repeated. The critical aperture window will always be constructed from the lowest upper slope and the highest lower slope to ensure that the conditions necessary to compress all previous points will be preserved. If the slope of the line to the new point is within the critical aperture window, the previous held point may be discarded. [Figure callouts: “You can forget about this point now.” “Forget the slope of this line.” (on two lines) “Remember the lowest upper slope and the highest lower slope.” Labels: New Point, Held Point.]
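The "remember the lowest upper slope and the highest lower slope" rule amounts to a running min/max. A minimal sketch, with the hypothetical helper `narrow` and `e` again standing for the error band E:

```python
def narrow(window, archived, new_held, e):
    """Tighten the (upper, lower) slope window using the error band around
    the point that has just become the held point."""
    upper, lower = window
    t0, v0 = archived
    t, v = new_held
    u = (v + e - v0) / (t - t0)   # slope to the top of the new error band
    l = (v - e - v0) / (t - t0)   # slope to the bottom of the new error band
    return min(upper, u), max(lower, l)


# Window from a held point at (1, 11.0) with e = 0.5 is (1.5, 0.5);
# a new held point at (2, 12.0) narrows it to (1.25, 0.75).
window = narrow((1.5, 0.5), (0, 10.0), (2, 12.0), 0.5)
```

Because only two numbers are carried forward, the discarded points and their individual slopes never need to be kept.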
19. GE Archive Compression Example. With each new point the process is continued, narrowing the aperture and discarding unnecessary points as you go. [Figure labels: Keep, Forget (on three points), New Point, Held Point.]
20. GE Archive Compression Example. With each new point the process is continued, narrowing the aperture and discarding unnecessary points as you go. [Figure labels: Keep, Forget (on three points), New Point, Held Point.]
21. GE Archive Compression Example. With each new point the process is continued, narrowing the aperture and discarding unnecessary points as you go. If this continues long enough, the critical aperture window will close, converging on the slope of the trend for this segment. [Figure labels: Keep, Forget (on three points), New Point, Held Point.]
22. GE Archive Compression Example. When the slope of the line to the new point lies outside of the critical aperture window, an archive event is triggered. [Figure callout: “Outside critical aperture window.” Labels: Keep, Forget (on three points), New Point, Held Point.]
23. GE Archive Compression Example. The held point is archived, the new point becomes the held point, and the process starts anew. [Figure callouts: “The held point is now archived.” “The previous new point is now the held point.” Labels: Archived Point, Held Point.]
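Putting the walkthrough together, a streaming version might look like the sketch below. It follows the steps described in these slides and is not the Proficy Historian's actual code: all names are hypothetical, timestamps are assumed to be strictly increasing, a slope on the window boundary is treated as inside, and flushing the final held point at the end of the stream is an added assumption.

```python
def slope(a, b):
    """Slope of the line from point a to point b; points are (time, value)."""
    return (b[1] - a[1]) / (b[0] - a[0])


def band_slopes(archived, point, e):
    """Slopes from the archived point to the top and bottom of point's error band."""
    t, v = point
    return slope(archived, (t, v + e)), slope(archived, (t, v - e))


def archive_compress(samples, e):
    """Return the samples that would be archived by the slope-window test."""
    it = iter(samples)
    try:
        archived = next(it)            # the first sample is always stored
    except StopIteration:
        return []
    out = [archived]
    held = None
    upper = lower = 0.0
    for new in it:
        if held is None:
            # First point after an archived point becomes the held point.
            held = new
            upper, lower = band_slopes(archived, held, e)
        elif lower <= slope(archived, new) <= upper:
            # Inside the window: forget the held point and narrow the aperture.
            u, l = band_slopes(archived, new, e)
            upper, lower = min(upper, u), max(lower, l)
            held = new
        else:
            # Outside the window: archive the held point and start anew.
            out.append(held)
            archived, held = held, new
            upper, lower = band_slopes(archived, held, e)
    if held is not None:
        out.append(held)               # assumption: flush the last held point
    return out


# A ramp followed by a flat section: only the start, the corner, and the end survive.
signal = [(0, 0.0), (1, 1.0), (2, 2.0), (3, 3.0), (4, 3.0), (5, 3.0), (6, 3.0)]
result = archive_compress(signal, 0.25)
```

Only the archived point, the held point, and the two window slopes are kept between samples, which matches the earlier remark that no buffering of intermediate points is needed.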
24. GE Archive Compression Example. The process continues: as additional data arrive, the critical aperture grows longer and thinner until a new value triggers an archive event. [Figure label: Held Point.]
25. GE Archive Compression Example. [Figure: two trends of the same data, one captioned “23 out of 120 points archived” and the other “10 out of 120 points archived”.] This one example is very encouraging, but more statistically significant work must be done, as well as a data quality assessment comparing these approaches.
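For reference, the two counts on this slide translate into stored and discarded fractions as follows. The helper name `stored_fraction` is hypothetical, and the slide text does not say which count belongs to which algorithm, so none is attributed here.

```python
def stored_fraction(archived, total):
    """Fraction of incoming points that end up in the archive."""
    return archived / total


for archived in (23, 10):
    frac = stored_fraction(archived, 120)
    print(f"{archived}/120 stored = {frac:.1%}, discarded = {1 - frac:.1%}")
# 23/120 stored = 19.2%, discarded = 80.8%
# 10/120 stored = 8.3%, discarded = 91.7%
```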