Data integration in real time (pvmanager, graphene and all of that) Gabriele Carcassi.



The problem
Different domains (v3, v4, DISCS, CS-Studio, …) produce and consume data
How do we integrate that data?
– How does CS-Studio get CA and PVA data?
– How do we integrate archive data with live data?
– How do web services get CA data?
– How does v4 get web service data?
– How does MATLAB get PVA and web service data?
– How does CS-Studio get results from a script?
– …

Objective of the talk Discuss strategically how to reshape pvmanager/epics-util/graphene to achieve that goal Discuss tactically what our priorities are – Efficient use of the resources we have Please, interrupt!

HIGH LEVEL VIEW

[Diagram, built up over three slides. Labels:]
– Pluggable datasources (publish/subscribe)
– Pluggable services (command/response), service registry
– Formula language
– Source rate, desired rate, client
– Rate throttling: <= request, <= the client can handle
– Caching or queuing
– Graphene
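The "caching or queuing" choice in the diagram can be illustrated with a minimal Java sketch. It is not the pvmanager API: `Collector`, `LatestValueCache` and `BoundedQueue` are hypothetical names, and the drop-oldest overflow policy is an assumption. The source writes at its own rate, and the reader drains at the slower desired rate.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not the pvmanager API.
final class RateDecoupling {

    /** Sits between a fast source and a slower reader. */
    interface Collector<T> {
        void write(T value);   // called at the source rate
        List<T> drain();       // called at the desired (reader) rate
    }

    /** Caching: keep only the latest value; intermediate updates are lost. */
    static final class LatestValueCache<T> implements Collector<T> {
        private T latest;
        private boolean fresh;

        public synchronized void write(T value) {
            latest = value;
            fresh = true;
        }

        public synchronized List<T> drain() {
            List<T> out = new ArrayList<>();
            if (fresh) {
                out.add(latest);
                fresh = false;
            }
            return out;
        }
    }

    /** Queuing: keep up to maxSize values; the oldest is dropped on overflow (assumed policy). */
    static final class BoundedQueue<T> implements Collector<T> {
        private final ArrayDeque<T> queue = new ArrayDeque<>();
        private final int maxSize;

        BoundedQueue(int maxSize) {
            this.maxSize = maxSize;
        }

        public synchronized void write(T value) {
            if (queue.size() == maxSize) {
                queue.removeFirst();
            }
            queue.addLast(value);
        }

        public synchronized List<T> drain() {
            List<T> out = new ArrayList<>(queue);
            queue.clear();
            return out;
        }
    }

    public static void main(String[] args) {
        Collector<Integer> cache = new LatestValueCache<>();
        Collector<Integer> queue = new BoundedQueue<>(3);
        // Five source updates arrive between two reads.
        for (int i = 1; i <= 5; i++) {
            cache.write(i);
            queue.write(i);
        }
        System.out.println(cache.drain()); // [5]
        System.out.println(queue.drain()); // [3, 4, 5]
    }
}
```

Caching suits displays that only need the current value. Queuing suits consumers such as plots or logs that want every update up to a bound.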

Status
The framework around pvmanager has been growing in functionality
– Originally started as a small layer to handle caching/queueing
– Now includes access to services (jdbc, pva, exec, web, …), an extensible data manipulation language, graphs (graphene), …
– Used outside CS-Studio (embedded in Vasu’s Proteus, Xihui’s WebPDA)
Increasing contributions
– CosyLab contributing pva support, 3 UMich students working on graphene, ITER planning to contribute archived data integration, FRIB contributing array manipulation functions

Current problems
Focus dominated by CS-Studio
– Other use cases always left on the back burner (e.g. rebroadcast through pva, support for other languages, MATLAB)
– No active effort into an overall strategy
Where does one ask for support?
– Unclear mailing list: pvmanager and graphene mailing lists exist but they are not used. Updates on pvmanager were posted on the cs-studio mailing list. Some reports are in CS-Studio github issues. Support for everything is private.
Process of contribution not well thought out
– Contributors often have to redo the CS-Studio integration themselves just to test their changes
– Split into 3 repositories causes problems
Trying to step back and reassess

Possible way forward
Time to start to focus on this layer on its own merits!
– Focus more equally on CS-Studio, v4, DISCS, web access and other uses; facilitate independent integration and contributions
Create a Data Integration In Real Time (diirt) top project
– pvmanager is not a good name for something that does services and other data sources
– Single mailing list for all pvmanager/vtype/epics-util/graphene development and support
– Single github issue tracker, will move all repositories to github
More transparent release planning for diirt
– Gather requirements from CS-Studio, v4 and DISCS more equally. A contact from each? Eric/Greg/Vasu
Increase effectiveness of contributions
– Branch of CS-Studio with development version: a contributor can push changes to the pvmanager repository, and get the CS-Studio update for free
– Independent framework for load/integration/normalization tests: a contributor focuses testing only up to diirt integration

Links to subprojects Summary of current issues Build status Links to github

Automatically committed from main pvmanager repo Download to latest dev build

For example
– Matej updates pva services, Kunal adds a formula function
– diirt automatic integration
– Greg and Eric get the cs-studio dev version and test

Future goal diirt automatic integration Matej, Kunal, UMICH students, ITER, … cs-studio dev web gateway dev pva gateway dev Contribute to the data integration framework: automatically contribute to a varied set of applications!

TESTING

Integration test framework We need a way to run integration tests – test IOCs -> CAJ -> pvmanager – Including disconnects due to power cycle and network downtime – Corner cases (e.g. different type at reconnect) – Ability to check server state (e.g. number of monitors open) – Ability to drop in and run the tests in production environment (to check specific versions of EPICS and network configurations) Start a script on the server side, start a script on the client side, come back in 15 minutes

public final void run() throws Exception {
    // Start first IOC
    init("typeChange1");
    // Connect pv
    addReader(PVManager.read(channel("double-to-i32")), TimeDuration.ofHertz(50));
    pause(1000);
    // Stop current IOC and start the second
    restart("typeChange2");
    pause(2000);
}

public final void verify(Log log) {
    // Expect 3 connection notifications: connect/disconnect/connect
    log.matchConnections("double-to-i32", true, false, true);
    // Expect 3 value notifications. Match value/alarm, but not the timestamp
    log.matchValues("double-to-i32", ALL_EXCEPT_TIME,
            newVDouble(0.0, newAlarm(AlarmSeverity.INVALID, "UDF_ALARM"),
                    newTime(Timestamp.of( , 0), null, false), displayNone()),
            newVDouble(0.0, newAlarm(AlarmSeverity.UNDEFINED, "Disconnected"),
                    newTime(Timestamp.of( , 0), null, false), displayNone()),
            newVInt(0, newAlarm(AlarmSeverity.INVALID, "UDF_ALARM"),
                    newTime(Timestamp.of( , 0), null, false), displayNone()));
}

Integration test framework
Covered
– Simple reboot: connect pv, ioc down, ioc up, only 1 monitor open
– Simple network outage: connect, network down, network up, only 1 monitor open
– Multiple reboots: connect pv, ioc cycle 10 times
– Type change: connect double pv, ioc cycle, pv becomes integer
– Constant pv: connect to double/int/string/enum that do not change
– Slow changing pv: connect to double updating at 1 Hz (same rate received)
– Fast changing pv: connect to double updating at 100 Hz (reduced rate received)
– Alarm changing pv: connect to double updating at 1 Hz for alarm only
– Write pv: change value for double/int
Not yet covered
– Add all remaining types for disconnection test
– Add all types for type change
– Add all types for slow changing pvs
– Add all types for fast changing pvs
– Add all types for alarm changing pvs
– Add all types for write pvs
– Add metadata changes
– Add access control changes
– Add multiple readers on a single pv (only 1 monitor open)
– Add nanosec out of range for time
– Old RTYP handling

Integration test framework
Martin Konrad took what I had and modified it for his own purposes
– Testing archiver, ca gateway, …
– We diverged a little bit: need to re-align
Andrej Babic is tasked to improve testing for CS-Studio
My hope is that these efforts can be aligned into one
– FLINT: framework for load/integration/normalization tests
  Load: large number of pvs, large data size
  Integration: different IOC versions, different OS, different network configurations
  Normalization: data comes out looking the same, predictable event count and order
– Work with multiple protocols (pva and ca at least)

Integration test framework
Still not enough: testing on my machine does not guarantee anything
– We have found that the performance profile changes significantly from machine to machine (even with the same OS): active scanning does not produce noticeable load on my machine on any OS (Win/SL/Debian) while it does on others
– We have found that some Java network libraries do not behave to spec on some machines: currently, I have to disable my wireless every time I run tests with ca or pva
We need:
– A set of automated tools that run tests and gather performance data
– A set of people that actually run those tests pre-emptively on their HW/OS combination

OVERALL ARCHITECTURE

Data integration in real time (diirt) architecture
[Diagram. Labels recoverable from the slide:]
– Client tools and servers: CS-Studio, pva gateway, web gateway, MATLAB, Python
– Data Definition: vTypes
– Producer/Consumer: pvmanager core
– Processing: Aggregation, formula
– Visualization: graphene
– Datasource support: pva, file, ca, …
– Services support (command/response): http, pva, exec, jdbc, cf, …
– Archiver support (historic data): arch-app, ch-arch, …
– Data serialization/conversion: csv, png/jpg, NTTypes, HDF5

DIIRT CONSUMERS

CS-Studio pvmanager now serves data to most applications in CS-Studio – Allows formulas in BOY – Allows access to services in BOY Set of common features for pvmanager configuration in eclipse product Graphene widget continued improvements

CS-Studio Areas of possible improvements in CS-Studio – Better integration in BOY (e.g. allow color widget to be a formula) – Take advantage of pause/resume – Single point to configure color/fonts/time format/… for all applications Areas of possible improvements in diirt – Performance optimization (passive scanning) – Improve graphene widgets

python
pvmanager and other diirt components are currently implemented in Java and there is a desire to access the functionality in python
Different options:
– Jython
– Re-implement the whole stack in Python
– Re-implement the whole stack in C++
– Implement a thin library in python that uses an embedded JVM
– Others?

python – Jython
Non-starter: no access to all the libraries that make people want to use python in the first place

python – python implementation Benefits – Have all the functionality in pure python – Portability across python platforms Costs/challenges – Lack of well-spec’d memory and threading model? – Keeping Java and python implementations in line – Solves python, does not solve anything else

python – C++ implementation Benefits – Have all the functionality in C++, which makes it possible to support other languages (such as python) Costs/challenges – Threading and memory support OS dependent? C++11 may solve this – Probably can’t rely on immutable objects as in Java implementation – Significantly harder to write and support: very few can do this – Keeping Java and C++ implementations in line

python – embedded JVM
Benefits
– Leverages all the work already done
– Versions keep in sync
– Probably the cheapest
Costs/challenges
– Is it actually doable, with reasonable performance? Different techniques to investigate (JNI, Py4J, javacpp, …)
– Can we keep the interface reasonable? Maybe we can start with this, and re-implement later?
– Contributions to the core are limited to Java (but maybe this is not so bad)
[Diagram labels: Python apps, Py4J, diirt, CA, PVA]

MATLAB
I’d be interested to at least write a simple example of use in MATLAB
– At least to know if it’s possible and how easy it is
Is this something v4 or others are interested in?
My current showstoppers:
– I don’t have MATLAB and it costs a few k$
– I don’t really know how to use MATLAB
– I don’t really know what to do with it (need use cases)
[Diagram labels: MATLAB, diirt, CA, PVA]

Web gateway using web sockets
Desire to have web tools on top of CA
Web sockets now allow updates from servers
A couple of projects are already starting to use pvmanager within a web server
Should we have a general purpose web gateway, powered by diirt and websockets?
– Standardize JSON serialization of vTypes? XML (InfoSet, Fast InfoSet, EXI)?
– Maybe a thin javascript client library as well? Re-implement WebCA? Binding for widgets (d3js)?
– Should this support services as well? It would give web clients access to pva services
[Diagram labels: Web apps, Mobile apps, Web server, diirt, CA, PVA]

Pva gateway on top of pvmanager
We have talked about this forever!
Create a pva server that re-broadcasts what it gets from pvmanager
– Protocol translation (ca -> pva)
– Data aggregation and processing
– Could also work for services (web services -> pva)
ChannelFinder could be a good candidate
[Diagram labels: pva clients, pva server, diirt, CA, Web services, pva services]

DIIRT PRODUCERS AND INTEGRATION

Historic data
Desire to integrate archived data and live data to provide a single client-resident cache for real-time processing
– Graphs in time
– Computation (histograms, correlation, fits)
– Data export (to csv, HDF5, …)
ITER is looking into contributing in this space
[Diagram labels: CS-Studio, pva gateway, diirt, CA, Archived data]
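One way to picture a single client-resident cache is a time-ordered map fed from both sides. The Java sketch below is only an illustration under stated assumptions: `HistoricLiveCache` and its methods are hypothetical names, samples are plain doubles keyed by millisecond timestamps, and a live sample is assumed to win over an archived one with the same timestamp.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Hypothetical sketch: one time-ordered cache fed by archived and live samples.
final class HistoricLiveCache {
    /** Samples keyed by timestamp in milliseconds, kept in time order. */
    private final TreeMap<Long, Double> samples = new TreeMap<>();

    /** Adds a block of archived samples; an existing sample at the same timestamp is kept (assumed policy). */
    synchronized void addArchived(long[] timesMs, double[] values) {
        for (int i = 0; i < timesMs.length; i++) {
            samples.putIfAbsent(timesMs[i], values[i]);
        }
    }

    /** Adds one live sample, replacing any sample with the same timestamp. */
    synchronized void addLive(long timeMs, double value) {
        samples.put(timeMs, value);
    }

    /** Returns the values with fromMs <= t < toMs, in time order. */
    synchronized List<Double> window(long fromMs, long toMs) {
        return new ArrayList<>(samples.subMap(fromMs, toMs).values());
    }

    public static void main(String[] args) {
        HistoricLiveCache cache = new HistoricLiveCache();
        // Live updates arrive first...
        cache.addLive(3000, 3.0);
        cache.addLive(4000, 4.0);
        // ...then the archive answers with an overlapping block.
        cache.addArchived(new long[] {1000, 2000, 3000}, new double[] {1.0, 2.0, 2.9});
        System.out.println(cache.window(0, 5000)); // [1.0, 2.0, 3.0, 4.0]
    }
}
```

A graph, a histogram or a CSV export can then read one window from the cache without knowing which samples came from the archive.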

Functions
Better support for disconnect/alarm/time
Adding element-wise array operations
– arraySum(VNumberArray, VNumber)
– arrayDiv(VNumberArray, VNumberArray)
Adding experimental DFT
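The two signatures above suggest an array-with-scalar form and an array-with-array form. The sketch below shows those element-wise semantics on plain `double[]` values. This is an assumption made for illustration: the real functions operate on VNumberArray and VNumber, and alarm/time handling is left out.

```java
import java.util.Arrays;

// Hypothetical sketch of element-wise semantics on plain arrays (not the diirt implementation).
final class ArrayFunctions {

    /** Adds the scalar to every element of the array. */
    static double[] arraySum(double[] array, double scalar) {
        double[] result = new double[array.length];
        for (int i = 0; i < array.length; i++) {
            result[i] = array[i] + scalar;
        }
        return result;
    }

    /** Divides the two arrays element by element; lengths must match. */
    static double[] arrayDiv(double[] numerator, double[] denominator) {
        if (numerator.length != denominator.length) {
            throw new IllegalArgumentException("Arrays must have the same length");
        }
        double[] result = new double[numerator.length];
        for (int i = 0; i < numerator.length; i++) {
            result[i] = numerator[i] / denominator[i];
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(arraySum(new double[] {1, 2, 3}, 10)));
        // [11.0, 12.0, 13.0]
        System.out.println(Arrays.toString(arrayDiv(new double[] {1, 4, 9}, new double[] {1, 2, 3})));
        // [1.0, 2.0, 3.0]
    }
}
```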

Support for nd-arrays VNumberArray is already n-dimensional Was extended to include cell boundaries Boundaries used by histogram plot Formula function that constructs the full object from a histogram record

Support for nd-arrays Students working on intensity graph Hopefully will be ready by end of May – Support for non-homogenous cells as well

v4 integration
Cosylab has been contributing pva support for
– Datasources (already integrated)
– Services (in late-March release)
Other things we already stated we need
– Extract the VTypes <-> NTTypes conversion/wrapping library

testChannel uri:ev4:nt/2012/pwd:NTXY v4 service example

testChannel uri:ev4:nt/2012/pwd:NTXY v4 service example Best part: I have no idea how it works! Cosylab supports it and CS-Studio picks it up.

DISCS integration Proteus is already embedding pvmanager for ca data access – Could benefit from standard web gateway We should more actively try DISCS service integration through diirt (and CS-Studio) – Maybe standard web service bindings to JSON, so we can (like JDBC and PVA) only require to write an XML file?

diirt service interface
public abstract class ServiceMethod {
    public String getName();
    public String getDescription();
    public Map<String, Class<?>> getArgumentTypes();
    public Map<String, String> getArgumentDescriptions();
    public final Map<String, Class<?>> getResultTypes();
    public Map<String, String> getResultDescriptions();
    public abstract void executeMethod(
            Map<String, Object> arguments,
            WriteFunction<Map<String, Object>> callback,
            WriteFunction<Exception> errorCallback);
}

Goals for diirt services Not general purpose! Some services in some cases – Have parameters that can be expressed by vTypes (e.g. VString and VNumbers) – Can return data that can be expressed in a vTable, vNumericArray, … – These can be digested by general purpose clients For example, channel finder – Will need a specific UI to manage tags and properties – Can export into a table the result of a query For the general purpose only we use diirt services
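The command/response shape of a service method can be sketched in a self-contained way. The code below is a simplified stand-in, not the diirt `ServiceMethod` class: `ServiceSketch`, `Method` and `Add` are hypothetical names, and `java.util.function.Consumer` replaces the callback types. It shows why declared argument types matter: a general purpose client can validate and call a method it knows nothing else about.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical, simplified stand-in for a service method (not the diirt class).
final class ServiceSketch {

    abstract static class Method {
        final String name;
        final Map<String, Class<?>> argumentTypes;

        Method(String name, Map<String, Class<?>> argumentTypes) {
            this.name = name;
            this.argumentTypes = argumentTypes;
        }

        /** Validates arguments against the declared types, then reports a result or an error. */
        final void execute(Map<String, Object> arguments,
                           Consumer<Map<String, Object>> onResult,
                           Consumer<Exception> onError) {
            for (Map.Entry<String, Class<?>> entry : argumentTypes.entrySet()) {
                Object value = arguments.get(entry.getKey());
                if (!entry.getValue().isInstance(value)) {
                    onError.accept(new IllegalArgumentException(
                            "Bad or missing argument: " + entry.getKey()));
                    return;
                }
            }
            Map<String, Object> result;
            try {
                result = call(arguments);
            } catch (Exception ex) {
                onError.accept(ex);
                return;
            }
            onResult.accept(result);
        }

        abstract Map<String, Object> call(Map<String, Object> arguments) throws Exception;
    }

    /** Example method: adds two numbers and returns the result under the key "sum". */
    static final class Add extends Method {
        Add() {
            super("add", types());
        }

        private static Map<String, Class<?>> types() {
            Map<String, Class<?>> types = new LinkedHashMap<>();
            types.put("a", Number.class);
            types.put("b", Number.class);
            return types;
        }

        @Override
        Map<String, Object> call(Map<String, Object> arguments) {
            double a = ((Number) arguments.get("a")).doubleValue();
            double b = ((Number) arguments.get("b")).doubleValue();
            Map<String, Object> result = new HashMap<>();
            result.put("sum", a + b);
            return result;
        }
    }

    public static void main(String[] args) {
        Map<String, Object> arguments = new HashMap<>();
        arguments.put("a", 1);
        arguments.put("b", 2.5);
        new Add().execute(arguments,
                result -> System.out.println(result),            // {sum=3.5}
                error -> System.out.println(error.getMessage()));

        arguments.remove("b");
        new Add().execute(arguments,
                result -> System.out.println(result),
                error -> System.out.println(error.getMessage())); // Bad or missing argument: b
    }
}
```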

SCOPE

Where to draw the line? What do I use for creating a Java UI client? diirt! What do I use to access services? diirt! What do I use to do protocol conversion? diirt! What should I use to toast my bread? diirt! What should I give my spouse for Christmas? diirt!

Where to draw the line? All data can be accessed in diirt != all data access should use diirt I am fully committed to the first, I am really skeptical about the second – I don’t know all possible use cases – Software design is about tradeoffs – Waste of resources

Where to draw the line? For services, I think it’s already clear: only calls that can be used by “general purpose clients” – Is the client specific to your service? (e.g. log entry for logbook, property management for channelfinder) – Are VTypes unreasonable representations for request and response? – Is the data useful only in the context of its service? Services integrated in diirt should be the partial “view” that is “general purpose”

Where to draw the line?
For datasources (CA and PVA specifically)
Some clients will never use diirt because they need direct access to the protocol API:
– Debugging tools
– IOC
Some clients should use something like diirt because they are just going to re-implement it:
– UI tools
– General purpose clients
– Web socket gateway
Some things are in between, and I think it depends on different trade-offs
– PVA gateway: do you want to faithfully forward the precise semantics of the pva protocol? Do you want to make available data from other sources and do aggregation?

Where to draw the line?
There is a problem of architecture decency:
– An interface can’t be both general purpose and give access to everything protocol specific
If you expose all the options of all the possible protocol implementations…
– Current: ca:// pva:// file:// Formulas - e.g. =`pvArray1` + columnOf('pvTable', "column2")
– Planned: jmx://
– Talked about: jms:// snmp://
I think we just end up with an unusable mess
– Each new feature needs to be somehow understood by all clients and all datasource implementations
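The scheme prefixes listed above (ca://, pva://, file://) imply a dispatcher that hides the protocol behind one narrow interface. The sketch below illustrates that trade-off. `DataSourceRouter` and `DataSource` are hypothetical names, the single `read` method is deliberately minimal, and the default scheme for unprefixed names is an assumption.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: one general purpose interface in front of several protocols.
final class DataSourceRouter {

    /** The only thing a general purpose client gets to see. */
    interface DataSource {
        String read(String channelName);
    }

    private final Map<String, DataSource> sources = new HashMap<>();
    private final String defaultScheme;

    DataSourceRouter(String defaultScheme) {
        this.defaultScheme = defaultScheme;
    }

    void register(String scheme, DataSource source) {
        sources.put(scheme, source);
    }

    /** Splits "scheme://name" and forwards to the matching datasource. */
    String read(String fullName) {
        int index = fullName.indexOf("://");
        String scheme = index < 0 ? defaultScheme : fullName.substring(0, index);
        String name = index < 0 ? fullName : fullName.substring(index + 3);
        DataSource source = sources.get(scheme);
        if (source == null) {
            throw new IllegalArgumentException("No datasource for scheme: " + scheme);
        }
        return source.read(name);
    }

    public static void main(String[] args) {
        DataSourceRouter router = new DataSourceRouter("ca");
        // Fake datasources standing in for real protocol clients.
        router.register("ca", name -> "CA value of " + name);
        router.register("file", name -> "contents of " + name);

        System.out.println(router.read("ca://motor1"));      // CA value of motor1
        System.out.println(router.read("motor1"));           // CA value of motor1
        System.out.println(router.read("file://table.csv")); // contents of table.csv
    }
}
```

Every protocol-specific option added to `DataSource` would have to be implemented, or explicitly rejected, by each registered datasource, which is the cost the slide warns about.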

Where to draw the line?
There is a problem of resource allocation:
– We already have a CA and PVA library and multiple people working on them
– Suppose we implement CA GET CALLBACK in pvmanager: what is it that we can then do that we couldn’t before? What is the cost to all clients and all data providers that now have to support it?
– Doing things through diirt by itself is not a value

Where to draw the line?
I think “Data integration in real time” identifies the scope pretty nicely: it has value if it’s a new way to get/combine/publish data
– Are you creating an application that is protocol or service specific?
– Do you need to access functionality (modes of communication and/or data types) that depends on a specific service or protocol?
– Is your application running in its own process (or with others in a framework like Eclipse)? Is anybody going to configure which datasources/services it accesses?
The more you answer yes, the more you should consider going to the original API directly

Where to draw the line? We can still “kludge” in some things – 99% is no, but there is that _one_ thing E.g. CA PUT CALLBACK is supported by adding some JSON after the channel name Could support file locks as JSON after the filename But it will never be like native protocol support We can’t sacrifice the general purpose client – Transparent access to different datasources, services, file format, … because that’s the whole point of diirt!
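The "JSON after the channel name" kludge can be sketched as follows. The exact syntax used here (`name {"putCallback":true}`) and the names `ChannelNameOptions`, `parse` and `flag` are assumptions for illustration. The flag lookup is a simple pattern match, not a real JSON parser.

```java
import java.util.regex.Pattern;

// Hypothetical sketch: split an optional trailing JSON object off a channel name.
final class ChannelNameOptions {
    final String channel;
    final String json; // raw trailing JSON text, or null if none was given

    private ChannelNameOptions(String channel, String json) {
        this.channel = channel;
        this.json = json;
    }

    /** Accepts "name" or "name {...}"; the JSON part is kept as raw text. */
    static ChannelNameOptions parse(String text) {
        String trimmed = text.trim();
        int brace = trimmed.indexOf('{');
        if (brace < 0 || !trimmed.endsWith("}")) {
            return new ChannelNameOptions(trimmed, null);
        }
        return new ChannelNameOptions(trimmed.substring(0, brace).trim(),
                trimmed.substring(brace));
    }

    /** True if the options contain "key": true (pattern match only, not full JSON parsing). */
    boolean flag(String key) {
        if (json == null) {
            return false;
        }
        Pattern pattern = Pattern.compile("\"" + Pattern.quote(key) + "\"\\s*:\\s*true");
        return pattern.matcher(json).find();
    }

    public static void main(String[] args) {
        ChannelNameOptions withOption = parse("ca://motor1 {\"putCallback\":true}");
        System.out.println(withOption.channel);             // ca://motor1
        System.out.println(withOption.flag("putCallback")); // true

        ChannelNameOptions plain = parse("ca://motor1");
        System.out.println(plain.flag("putCallback"));      // false
    }
}
```

A general purpose client never needs to understand the option: it passes the whole string through, and only the datasource that owns the scheme interprets the suffix.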