
Published by Isabella Sutton. Modified over 2 years ago.

1
GE Proficy Historian Data Compression Introduction
Stephen Friedenthal, EVSystems

2
What is data compression?
There are two fundamental classes of file compression:
1) Identify repeating elements (e.g., ZIP file compression)
   Pros: No loss of information; all original data is restored
   Cons: CPU intensive; data must be compressed and decompressed, and large files take a lot of time
2) Identify redundant data that can be discarded (e.g., JPEG, dead band, rate of change). This method is used by the GE Historian.
   Pros: Fast, reduces network traffic, well suited for streaming data
   Cons: Some data loss

3
Customer quotes when I ask them about compression:
"Disk space is cheap. We don't want to lose any data, so we store everything."
"Today's computers are so fast there's no penalty for storing everything."
"We're a regulated industry... We aren't allowed to use compression."
From all of the above, you might come to believe that data compression is an antiquated response to a problem that no longer exists: computers are fast, storage is cheap, so store everything.

4
Why compression is (still) important
- Needle in the haystack problem: it is much more difficult to find the truly interesting data
- Limited network bandwidth: storing terabytes of data is only useful if you can easily extract it
- High long-term costs: disk drives are cheap, but managing the data gets expensive
- Superior performance: storing the minimum necessary data greatly increases system performance and speed for clients and servers

5
GE Historian Compression Methods
The Proficy Historian has two forms of data compression:
- Collector compression (CC): also called dead band compression. It works by examining data and discarding any sample that does not change by more than a defined limit (e.g., +/- 0.5 deg F).
- Archive compression (AC): also called rate of change or swinging door compression. It works by examining data (after CC) and discarding any sample that falls within a slope range (more on this later).

6
Collector Compression
Collector compression overview
(Figure: samples plotted against a dead band, marking the stored sample and the discarded samples, followed by a constant slope line.)
Pros:
- Good at filtering out noise
- Reduces data storage by 80 to 90+%
- Easy to understand
Cons:
- Unable to reduce data when the slope (vs. the value) is unchanged (see the constant slope section of the figure)
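
The dead band idea can be sketched in a few lines of Python. This is only an illustration, not the Proficy collector code: it assumes the band is centered on the last stored value, and the name `collector_compress` and the `(time, value)` sample layout are hypothetical.

```python
def collector_compress(samples, deadband):
    """Keep a sample only when it moves more than `deadband` away from
    the last stored value (assumed behavior, for illustration only)."""
    stored = []
    for t, value in samples:
        if not stored or abs(value - stored[-1][1]) > deadband:
            stored.append((t, value))
    return stored


# Noise around 70.0 stays inside a +/- 0.5 band, so only the first sample is kept.
noisy = [(0, 70.0), (1, 70.2), (2, 69.8), (3, 70.1), (4, 70.3)]
noisy_stored = collector_compress(noisy, 0.5)

# A steady ramp leaves the band at every step, so every sample is kept,
# even though a straight line could be described by its two end points.
ramp = [(t, 70.0 + t) for t in range(5)]
ramp_stored = collector_compress(ramp, 0.5)
```

The ramp case is the weakness listed under Cons: the value keeps changing, so a pure dead band filter cannot drop anything even though the slope never changes.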

7
Archive Compression
- Archive compression looks at the data after collector compression.
- It only stores data that changes direction beyond a configured range.
- In effect, it stores data based on its rate of change. Compare this to collector compression, which stores data based on the amount of change.

8
Archive Compression Effect
Archive compression overview
(Figure: red values are stored; green values are discarded by archive compression. Callout: large change in slope, so the value is stored.)
Pros:
- Can significantly reduce storage for certain signal types and noise
- Stores only the most relevant values
Cons:
- More difficult to tune
- More difficult to understand

9
Archive Compression: A deeper dive
How does it compare to OSI's swinging door compression?

10
OSI PI Swinging Door Compression
PI checks to see if all points lie inside the compression blanket, a dead band parallelogram drawn from the end points using CompDev as the tolerance. If any point falls outside the dead band, an archive event is triggered.
(Figure callout: even though the newest point is the one that falls outside the dead band, the point before it is the one that gets archived, because it is the last end point for which all points were inside the dead band.)

11
Archive Compression vs. PI
The OSI PI swinging door algorithm checks if a point is inside the parallelogram:
1) Calculate slope of upper line
2) Calculate upper y for this x
3) Calculate slope of lower line
4) Calculate lower y for this x
5) Check if point y is < upper y
6) Check if point y is > lower y
The GE Historian algorithm checks if the line between the end points intersects the tolerance bar:
1) Calculate slope of this line
2) Calculate y for this x
3) Calculate difference
4) Check if ABS difference < CompDev
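
The two checks can be written side by side as a rough sketch. Both function names are hypothetical, points are assumed to be `(x, y)` tuples, and the parallelogram is assumed to be the end-point line shifted up and down by CompDev; neither function is taken from the PI or Proficy products.

```python
def inside_parallelogram(archived, new, point, comp_dev):
    """Six-step check: is `point` between the upper and lower edges?"""
    xa, ya = archived
    xn, yn = new
    xp, yp = point
    upper_slope = ((yn + comp_dev) - (ya + comp_dev)) / (xn - xa)
    upper_y = (ya + comp_dev) + upper_slope * (xp - xa)
    lower_slope = ((yn - comp_dev) - (ya - comp_dev)) / (xn - xa)
    lower_y = (ya - comp_dev) + lower_slope * (xp - xa)
    return lower_y < yp < upper_y


def line_hits_tolerance_bar(archived, new, point, comp_dev):
    """Four-step check: does the end-point line pass within comp_dev of `point`?"""
    xa, ya = archived
    xn, yn = new
    xp, yp = point
    slope = (yn - ya) / (xn - xa)
    line_y = ya + slope * (xp - xa)
    difference = line_y - yp
    return abs(difference) < comp_dev


archived, new = (0, 0.0), (4, 4.0)
near = (2, 2.3)   # 0.3 away from the line y = x
far = (2, 3.0)    # 1.0 away from the line y = x
results = [
    (inside_parallelogram(archived, new, p, 0.5),
     line_hits_tolerance_bar(archived, new, p, 0.5))
    for p in (near, far)
]
```

Under these assumptions the two tests agree on every point; the four-step version simply needs one slope and one comparison instead of two of each.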

12
GE Archive Compression vs. PI
Instead of checking if each point is inside the parallelogram (the swinging door method), the GE Proficy Historian checks if the line intersects the dead band of each point.
(Figure labels: Archived Point, New Point.)

13
GE Archive Compression Example
As an additional benefit, there is no need to buffer all points between the last archived point and the newest point. Here's an example of how it works. The key points to understand:
- An archived point is one that is stored.
- A held point is the last good value that arrived. We don't know if it will be stored until the next value arrives to tell us if the slope has changed sufficiently.
- After a point is archived, the next point becomes the held point.
(Figure labels: Archived Point, Held Point.)

14
GE Archive Compression Example
Construct error bands around the held point, extending E above and E below it.
PI: E = CompDev
GE: E = deadband / 2
(Figure labels: Archived Point, Held Point, E, E.)

15
GE Archive Compression Example
Step 1: Calculate the slopes of the two lines, U and L, connecting the archived point with the upper and lower ends of the error bands (dead band) associated with the held point.
(Figure labels: Archived Point, Held Point, U, L.)
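
As a small worked sketch of Step 1, the two slopes can be computed as below. The function name `aperture_slopes` and the `(time, value)` tuples are assumptions made for illustration, and E is taken as deadband / 2, following the GE convention on the previous slide.

```python
def aperture_slopes(archived, held, deadband):
    """Return (U, L): slopes from the archived point to the top and
    bottom of the error band around the held point."""
    error = deadband / 2.0            # GE convention: E = deadband / 2
    ta, ya = archived
    th, yh = held
    dt = th - ta                      # assumes the held point is later in time
    upper = (yh + error - ya) / dt    # slope of line U
    lower = (yh - error - ya) / dt    # slope of line L
    return upper, lower


# Archived point at (0, 10.0), held point at (2, 11.0), dead band of 1.0:
# the band runs from 10.5 to 11.5, so U = 1.5 / 2 and L = 0.5 / 2.
slopes = aperture_slopes((0, 10.0), (2, 11.0), 1.0)
```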

16
GE Archive Compression Example
The upper and lower slopes define a critical aperture window.
(Figure labels: Archived Point, Held Point, U, L, Critical Aperture Window.)

17
GE Archive Compression Example
If the slope of the line N, connecting the archived point with the new point, is between the upper and lower slopes, it intersects the dead band of the held point.
(Figure labels: Archived Point, Held Point, New Point, U, L, N.)
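
The reason the slope test and the intersection test agree can be written out. The symbols are assumptions introduced here, not notation from the slides: the archived, held, and new points are (t_A, y_A), (t_H, y_H), and (t_N, y_N) with t_A < t_H, and s_U, s_L, s_N are the slopes of lines U, L, and N. Treating the boundaries as inclusive is also an assumption.

```latex
\[
s_U = \frac{(y_H + E) - y_A}{t_H - t_A}, \qquad
s_L = \frac{(y_H - E) - y_A}{t_H - t_A}, \qquad
s_N = \frac{y_N - y_A}{t_N - t_A}
\]
\[
s_L \le s_N \le s_U
\;\Longleftrightarrow\;
\bigl|\, y_A + s_N (t_H - t_A) - y_H \,\bigr| \le E
\]
```

Multiplying the left-hand inequality by the positive quantity t_H - t_A and subtracting y_H - y_A from each part gives the right-hand form: the value of line N at the held point's time lies within E of the held value.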

18
GE Archive Compression Example
As new points are added, the previous new point becomes the current held point, and the same process is repeated. The critical aperture window is always constructed from the lowest upper slope and the highest lower slope, to ensure that the conditions necessary to compress all previous points are preserved. If the slope of the new point is within the critical aperture window, the previous held point may be discarded.
(Figure callouts: "You can forget about this point now." "Remember the lowest upper slope and the highest lower slope." "Forget the slope of this line." Labels: Held Point, New Point.)
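
Keeping only the lowest upper slope and the highest lower slope is what removes the need to buffer earlier points. A minimal sketch, with hypothetical names and windows represented as `(upper, lower)` tuples:

```python
def narrow_window(window, candidate):
    """Combine the current window with the slopes for the newest held
    point: keep the lowest upper slope and the highest lower slope."""
    upper, lower = window
    cand_upper, cand_lower = candidate
    return min(upper, cand_upper), max(lower, cand_lower)


def is_closed(window):
    """A window whose lower slope exceeds its upper slope admits no line."""
    upper, lower = window
    return lower > upper


window = (0.75, 0.25)                          # from the first held point
window = narrow_window(window, (0.60, 0.10))   # tighter top, looser bottom
```

Only two numbers survive from one point to the next, regardless of how many samples have been discarded since the last archived point.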

19
GE Archive Compression Example
With each new point the process is continued, narrowing the aperture and discarding unnecessary points as you go.
(Figure labels: New Point, Held Point, Keep, Forget.)

20
GE Archive Compression Example
(Figure labels: New Point, Held Point, Keep, Forget.)

21
GE Archive Compression Example
If this continues long enough, the critical aperture window will close, converging on the slope of the trend for this segment.
(Figure labels: New Point, Held Point, Keep, Forget.)

22
GE Archive Compression Example
When the slope of the new point lies outside of the critical aperture window, an archive event is triggered.
(Figure callout: "Outside critical aperture window." Labels: New Point, Held Point, Keep, Forget.)

23
GE Archive Compression Example
The held point is archived, the new point becomes the held point, and the process starts anew.
(Figure callouts: "The held point is now archived." "The previous new point is now the held point." Labels: Archived Point, Held Point.)
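
Putting the steps of this example together gives a streaming loop along the following lines. It is a sketch built only from the slides, not the Proficy implementation: the name `archive_compress`, the `(time, value)` tuples, strictly increasing timestamps, inclusive window boundaries, and flushing the final held point at the end of the input are all assumptions.

```python
def archive_compress(samples, deadband):
    """Streaming sketch of the held-point procedure described above."""
    error = deadband / 2.0                       # GE convention: E = deadband / 2
    it = iter(samples)
    try:
        archived = next(it)                      # first sample is always stored
    except StopIteration:
        return []
    stored = [archived]
    held = None
    upper, lower = float("inf"), float("-inf")   # wide-open aperture
    for point in it:
        if held is not None:
            slope = (point[1] - archived[1]) / (point[0] - archived[0])
            if not (lower <= slope <= upper):
                # Archive event: the held point is stored and the
                # aperture is rebuilt from the newly archived point.
                archived = held
                stored.append(archived)
                upper, lower = float("inf"), float("-inf")
        # The new point becomes the held point; narrow the aperture with
        # the slopes to the top and bottom of its error band.
        dt = point[0] - archived[0]
        upper = min(upper, (point[1] + error - archived[1]) / dt)
        lower = max(lower, (point[1] - error - archived[1]) / dt)
        held = point
    if held is not None:
        stored.append(held)                      # assumed end-of-stream flush
    return stored


# A ramp up to t = 4 followed by a ramp down: only the end points and
# the turning point survive.
samples = [(0, 0.0), (1, 1.0), (2, 2.0), (3, 3.0), (4, 4.0),
           (5, 3.0), (6, 2.0), (7, 1.0)]
compressed = archive_compress(samples, 0.2)
```

Note that the loop never looks back at discarded samples: the archived point, the held point, and the two aperture slopes are the only state it carries.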

24
GE Archive Compression Example
The process continues: as additional data arrive, the critical aperture grows longer and thinner until a new value triggers an archive event.
(Figure label: Held Point.)

25
GE Archive Compression Example
(Figure captions: "23 out of 120 points archived" and "10 out of 120 points archived".)
This one example is very encouraging, but more statistically significant work must be done, as well as a data quality assessment comparing these approaches.

26
Questions?
Stephen Friedenthal, EVSystems
