Presentation on theme: "Statistical Data Exchange Platform"— Presentation transcript:
1. Statistical Data Exchange Platform
MarketMap Analytic Platform
2. Agenda
- Introduction to MarketMap Analytic Platform
- Sample Statistical Data Exchange Platform
- Economic Data Manager
- SDMX Driven Roadmap
3. MarketMap Analytic Platform
- Forecasting, Analysis & Modeling Environment
- Fast DB for time series data storage
- Built-in analytical platform
- Time intelligence
[Architecture diagram. Labels: Users; Applications; MarketMap Analytic Language; C; Available 3rd Party Interfaces; Out-of-the-box development and user interfaces; Web Services; Data Loaders; Web; Reports; Onsite Server; SQL Access; Managed Data Services; Pathfinder Cross Symbology]
4. Key, Value Pair Time Series Data Storage
- Vector object database with coupled analytical engine
- Store, retrieve, and manipulate large numbers of rapidly accessible facts
- Apply a structured programming language geared towards manipulation of vector objects
[Diagram labels: ibm.close; sp500.TotalReturn; PCT(s sales); Btree; Data]

The MarketMap Analytic Database is specifically designed for storage of time series data and treats each time series as a separate object. These time series objects have a set of metadata attributes that describe the objects and provide unique information used in the manipulation of time series.

The database is designed for optimal performance when working with time series data, whether correlating and regressing one issue's pricing series against another or slicing through multiple issues' prices and results as of a point in time.

The database consists of two parts: the database header and the database body. The header is a B*-tree that stores the unique identifier for each object, the object name, and the metadata. The body holds the observations themselves and the attributes associated with each object, and each object's data is always stored contiguously on disk.

A coupled analytic engine supports a built-in scripting language that performs powerful analytical operations with very little code, which allows rapid development as well as advanced ad hoc analysis and prototyping.
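The header/body split described above can be pictured with a small in-memory sketch. Everything here is an illustrative assumption rather than the product's actual layout or API: `SeriesObject` and `ToyDatabase` are hypothetical names, a plain dict stands in for the B*-tree header, and a list stands in for the on-disk body.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class SeriesObject:
    """One time series held as an independent object (hypothetical layout)."""
    name: str
    attributes: dict                      # descriptive metadata, e.g. frequency
    dates: list = field(default_factory=list)
    values: list = field(default_factory=list)


class ToyDatabase:
    """Header maps object names to slots; body holds the objects themselves."""

    def __init__(self):
        self._header = {}   # stand-in for the B*-tree header
        self._body = []     # stand-in for the contiguous database body

    def store(self, obj):
        """Add a new object, or overwrite the object already stored under its name."""
        if obj.name in self._header:
            self._body[self._header[obj.name]] = obj
        else:
            self._header[obj.name] = len(self._body)
            self._body.append(obj)

    def fetch(self, name):
        """Look the name up in the header, then read the object from the body."""
        return self._body[self._header[name]]

    def names(self):
        """Return object names in sorted order, as an ordered index would."""
        return sorted(self._header)
```

A caller would store one object per series, for example an object named `ibm.close` with a `frequency` attribute, and retrieve it by name with `fetch`.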
5. Analytic layer with an embedded concept of time
- Regular frequencies: events recorded every millisecond, day, month, etc.
- Pattern frequencies: specialized, but regular (e.g. market hours)
- Aperiodic: for event-driven data capture (trades, corporate actions)
- Able to automatically convert frequencies
[Diagram labels: Global Formulas; Business; Monthly; Quarterly; Return; Automatic Frequency Conversions]

Not all data must be physically stored on disc. Some objects can be virtual formulas that are evaluated at run time based on an expression.

Whether data is stored on disc as a hard physical object or is a virtual soft object, all objects can take advantage of the built-in time series analytics in the database. Embedded in the system is an inherent notion of time that makes it easy to take data of disparate periodicities and frequencies and line them up on a consistent basis.

For example, the evaluation of the global formula PE could take prices from a business-day database and divide those prices by EPS numbers from a quarterly or annual database. This ratio, combining daily and annual data, could then be iterated over time on a monthly basis. This automatic frequency conversion comes "for free" with our ADBMS.
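The PE example can be sketched in a few lines. This is only an illustration of the idea of lining up two frequencies on a monthly basis; the conversion rules used here (take the last daily price in each month, reuse the quarterly EPS for every month in that quarter) are assumptions chosen for the sketch, and the function names are hypothetical rather than part of the platform.

```python
from datetime import date


def last_of_month(daily):
    """Convert daily observations to monthly by keeping the last value seen in each month."""
    monthly = {}
    for day in sorted(daily):
        monthly[(day.year, day.month)] = daily[day]
    return monthly


def quarter_of(year, month):
    """Map a (year, month) pair to its (year, quarter) pair."""
    return (year, (month - 1) // 3 + 1)


def pe_monthly(daily_prices, quarterly_eps):
    """Evaluate price / EPS at monthly frequency from daily prices and quarterly EPS."""
    monthly_prices = last_of_month(daily_prices)
    return {
        ym: price / quarterly_eps[quarter_of(*ym)]
        for ym, price in monthly_prices.items()
    }
```

The point of the sketch is that the caller never converts frequencies by hand: the formula takes both inputs at their natural frequency and returns a monthly result.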
6. Multiple Analytic Databases house all required data
- This MarketMap Analytic database persists historical market data objects
- This MarketMap Analytic database persists historical macroeconomic objects

The MarketMap Analytic Database can be thought of as a container of independent objects. As discussed previously, each object is in turn a b-tree vector. Some objects are very simple, such as a scalar object representing the name of an issue. Others can be fatter, for example storing all prices for an issue since it went public.

It is easy to combine, multiply, divide, correlate and regress one object against another. One can also open multiple containers on the server and let MAP locate the appropriate object based on its name, whether the US consumer price index values in an object named USA.CPI or IBM's daily trading volume values in an object named IBM.VOLUME.
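Name-based lookup across several open containers can be sketched as a simple ordered search. The search order (first open container that holds the name wins), the `locate` function, the container labels and the stored numbers are all assumptions made for illustration; only the object names USA.CPI and IBM.VOLUME come from the notes.

```python
def locate(name, containers):
    """Return (container_label, object) for the first open container holding `name`.

    `containers` is an ordered list of (label, {object_name: object}) pairs.
    """
    for label, objects in containers:
        if name in objects:
            return label, objects[name]
    raise KeyError(f"no open container holds {name!r}")


# Placeholder values only; these are not real CPI or volume figures.
open_containers = [
    ("market", {"IBM.VOLUME": [1.0, 2.0, 3.0]}),
    ("macro", {"USA.CPI": [100.0, 101.0]}),
]
```

With both containers open, `locate("USA.CPI", open_containers)` finds the object in the macro container without the caller naming that container.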
7. MarketMap Economic Data Manager
- Extract, transform and load data stored in Excel workbooks
- Dual database system
  - Time series data stored in FAME databases
  - Metadata about these series stored in a SQL container
- Web Access layer used to report and graph time series data and also display descriptive data about the series
8. Economic Data Collected in Excel Workbooks
SunGard recently worked with a central bank where economic data was stored and analyzed by regional offices throughout the country in workbooks very similar to this one, which tracks consumer price index related data for the months of December and January.

Data tended to be "locked" and "isolated" rather than centralized. In addition, it was challenging to distribute this valuable data outside of Excel and via a Web portal.

This customer also asked SunGard to design an approach for storing the entity, item, time and other dimensions of economic queries with an SDMX-inspired metadata approach.
9. Common "access point" to Time Series & Metadata
[Architecture diagram. Labels: Client Applications; Web Browser; HTTP Request; Web Service (WSDL); Service Request; MAP Web Access Server; Remote MAP Web Access Server; Business Process; Downstream Output Providers; HTML / CSV / XML; Time IQ / Result Set; MAP Database; Relational Database; Proprietary Database]

Many of the reporting features seen in this demo and in other economic reporting portals come "out of the box" with a subscription to the MarketMap Analytic Platform.

For example, the MarketMap accessPoint layer, a servlet-based middleware component, can accept incoming queries from a web browser and then assemble return results consisting of both relational data and time series data.
11. Sample statistical production process
[Process diagram. Labels: Statistics Production Environment; Collect; Compile; Disseminate; Production system (FAME); SDMX Data Model; SDW; ESCB-Net/EXDI; ESCB-net; publications; web]
12. Data structures
Statistical data can be grouped together at:
- the observation level (the measurement of some phenomenon);
- the series level (the measurement of some phenomenon over time, usually at regular intervals);
- the group level (a group of series; a well-known example is the sibling group, a set of series which are identical except that they are measured with different frequencies); and
- the dataset level (made up of several groups, to cover a specific statistical domain for instance).

Dimensions are grouped into keys, which allow the identification of a particular set of data (a series, for example).

Key values are attached at the series level and given in a fixed sequence. Conventionally, frequency is the first descriptor concept, and the other concepts are assigned an order for that particular dataset. Partial keys can be attached to groups.
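The sibling-group idea can be sketched by treating the key with its frequency position removed as a partial key. The sketch assumes dot-separated keys whose first component is the key family and whose second is the frequency, following the convention described above; the `sibling_groups` function and the sample keys in the usage note are hypothetical.

```python
from collections import defaultdict


def sibling_groups(series_keys):
    """Group series keys that are identical except for their frequency value.

    Each key is assumed to read FAMILY.FREQUENCY.<other dimensions...>.
    The partial key (family plus all non-frequency values) identifies the group.
    """
    groups = defaultdict(list)
    for key in series_keys:
        family, _frequency, *rest = key.split(".")
        groups[(family, *rest)].append(key)
    return dict(groups)
```

For instance, hypothetical keys `CPI.M.US.TOTAL` and `CPI.A.US.TOTAL` would land in the same sibling group, while `CPI.M.DE.TOTAL` would form its own.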
13. Example: Monetary aggregate M3
BSI.M.U2.N.V.M30.X.I.U2.2300.Z01.A
- BSI = Key Family
- M = Monthly series
- U2 = euro area aggregate
- N = Non-adjusted
- V = MFIs + Govt
- M30 = M3
- X = Unspecified maturity
- I = Index
- U2 = residence of counterpart is euro area
- 2300 = other residents sector
- Z01 = denominated in all currencies
- A = Annual growth/change
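Splitting such a key into its positional values is mechanical once the dimension order is fixed. The sketch below assumes the full key reads BSI.M.U2.N.V.M30.X.I.U2.2300.Z01.A, as the slide's legend implies. The dimension labels in `DIMENSIONS` are hypothetical names invented for this example and are not the official concept identifiers of the key family.

```python
# Hypothetical labels for the eleven positions that follow the key family.
DIMENSIONS = [
    "frequency",
    "reference_area",
    "adjustment",
    "reporting_sector",
    "item",
    "maturity",
    "data_type",
    "counterpart_area",
    "counterpart_sector",
    "currency",
    "series_suffix",
]


def parse_key(key):
    """Split a dot-separated series key into (key_family, {dimension: value})."""
    family, *values = key.split(".")
    if len(values) != len(DIMENSIONS):
        raise ValueError(
            f"expected {len(DIMENSIONS)} dimension values, got {len(values)}"
        )
    return family, dict(zip(DIMENSIONS, values))
```

Because key values are given in a fixed sequence, a key with a missing or extra position is rejected rather than silently misread.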
14. Use of a (SDMX-based) structured data format
- In the exchange, storage and dissemination of all statistical data and associated metadata
- In the internal system and the communication with partner institutions and the general public (web services, SDMX-ML based extractions from the web site)
- Covering most domains of economic statistics (e.g. monetary and financial statistics, balance of payments, price indices, short-term statistics, real sector, government finance statistics, securities, etc.)
15. Browsing for data series
Organize object names and categories based on the SDMX standard
16. Demo: Key family search
In addition to loading Excel workbooks and creating Web-friendly reports, this custom solution allows economists to SEARCH for objects and build ad hoc queries that also incorporate analytical functions from FAME.

Let's switch over to the search option to find some data of interest.

This particular customer asked SunGard to organize object names based on a specific standard called SDMX, which we will review later in the presentation. Essentially, it is a standard for organizing economic content to ease the sharing of data among central banks.
17. Demo: Context sensitive selection boxes
As an SDMX-inspired interface, the first task in finding data is to identify its KEY FAMILY and CHAPTER, both SDMX terms.

Note that the dropdown boxes dynamically change as you change your selections. The SDMX-inspired categories also expand based on the choices made.

Let's click the SEARCH button to find some series.
18. Demo: Report on the January 2006 data
Let's switch to the PRICES menu and click on the REVIEW navigation bar on the left. Given our working example, let's select the CPI prices category.

Of course, we should work with an appropriate date range. Let's cover a timeframe of January 2004 to January 2006.

When I click REFRESH, the Fame objects are queried and the data sets are returned as XML. The XML is then styled into a table report that is consistent with this institution's look and feel.

Note the use of SDMX-inspired dimension names to name the objects. Also note that this table can be quickly exported to Excel.
19. Demo: Detailed CPI search results
As you can see, this search found 483 CPI items.

Let's go back to our working date range and then view some details about particular consumer price index related data by clicking on the edit link.
20. Demo: Select objects of interest
Now let's add a few items to a basket so that we can begin analyzing a group of indicators. Once again, let's establish an appropriate working date range.

The basket feature is similar to the shopping cart found in many online shopping sites, like Amazon.com. It allows the user to combine results from different searches for analysis at a later time.

We can click on the VIEW button to work with these individual series.
21. View objects in monthly frequency
As you can see, the data for the three series are displayed based on our date range of January 2004 to January 2006.
22. Demo: View selected objects in annual frequency
This data is monthly. However, using Fame's built-in time intelligence, we can easily change to quarterly and annual views of data.
23. Demo: Multiple ways of aggregating data
We can also change the observed attributes, for example whether data should be summed or averaged when analyzed over time.

Using the SUMMED aggregation method yields large values, as the data within the specified frequency is summed.

Using the AVERAGED method yields smaller values, as the data is averaged within the specified frequency.

Fame's built-in time intelligence provides fast and easy-to-use manipulation of economic data, no matter what the natural frequency or format of the data.
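The difference between the two aggregation methods can be shown with a short sketch that converts monthly data to annual data. The `to_annual` function and its string arguments are hypothetical names chosen for this illustration; they mirror the SUMMED and AVERAGED options in the demo but are not the product's API.

```python
from collections import defaultdict


def to_annual(monthly, observed):
    """Aggregate {(year, month): value} to {year: value}.

    observed="summed" adds the monthly values within each year;
    observed="averaged" takes their arithmetic mean.
    """
    buckets = defaultdict(list)
    for (year, _month), value in sorted(monthly.items()):
        buckets[year].append(value)
    if observed == "summed":
        return {year: sum(vals) for year, vals in buckets.items()}
    if observed == "averaged":
        return {year: sum(vals) / len(vals) for year, vals in buckets.items()}
    raise ValueError(f"unknown observed attribute: {observed!r}")
```

Twelve monthly values of 10.0 give an annual value of 120.0 when summed and 10.0 when averaged, which is why the summed view shows much larger numbers for the same series.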
24. Demo: PCT
Let's reset the values back to monthly with the original observed attributes.

The Fame database contains an entire library of analytical functions. This particular page displays a subset of those functions, including moving average and annual percent change.

Notice that when we click on PCT, or percentage change, another column is added to the report that displays the calculation.

In addition to downloading data to Excel, the Fame Web Access Layer can also load the Fame 4GL graphing engine to help visualize data.

Let's close out of this report VIEW and GRAPH the data.
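The two calculations named on this page can be written out explicitly. The sketch below is a plain Python rendering of percent change and moving average, not Fame's implementation: the function names, the `lag` and `window` parameters, and the choice to return `None` where too little history exists are assumptions made for the example.

```python
def pct(values, lag=1):
    """Percent change versus the observation `lag` periods earlier.

    With monthly data, lag=1 gives month-over-month change and
    lag=12 gives annual percent change. Positions without enough
    history are None.
    """
    out = [None] * min(lag, len(values))
    for prev, cur in zip(values, values[lag:]):
        out.append(100.0 * (cur - prev) / prev)
    return out


def moving_average(values, window):
    """Trailing moving average over `window` observations; None until the window fills."""
    out = []
    for i in range(len(values)):
        if i + 1 < window:
            out.append(None)
        else:
            out.append(sum(values[i + 1 - window:i + 1]) / window)
    return out
```

Each function returns a list of the same length as its input, so the result can be shown as an extra column beside the original series, as the report does.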
25. Demo: Graph selected objects
The Fame 4GL graphing engine is loaded on the web server, making it possible to generate quick graphical images like this one analyzing three series over time.

This was a brief overview of a solution leveraging the Fame Solution Stack and the expertise of Fame Professional Services. The Fame Solution Stack provides a means to:
- Parse and load data stored in Excel into a central warehouse
- Perform in-database analytics on the data within the server
- Distribute and report the data via the Web
26. Demo: Administer user entitlements
In a centralized warehouse, it is important that supervisors be able to approve data before it is published. Click on the Administration tab to view maintenance of users with the Fame Web Access entitlement layer.

After a workbook is parsed, it is loaded into a holding area where appropriate supervisors can approve the publishing of the Fame-based data to the entire enterprise.

With the administration page, managers can grant access to individual users as well as individual data series. It also provides a facility to approve series that have been loaded into the system.

With data now loaded and approved for distribution, we can query the Fame database.
28. Central Bank Use Case
Using FAME since 1993 to:
- store time series;
- produce statistics, from primary to derived information;
- disseminate statistical information, internally and on the internet, in different ways (statically and dynamically) and formats; and
- exchange statistical information with national and international organizations using the SDMX standard.

Core tools
- Hundreds of functions and procedures using Fame/4GL, TimeIQ and the C-HLI
- Working with FAME 9.3 and migrating to FAME 10.1
29. High priority database improvements
- Integration of FAME vectors, formulas, functions and procedures with the SDMX information model and SDMX-ML format
  - Specific objects to manage data structure definition (DSD) information
  - Referential integrity between series names and the DSD
- Creation of a matrix object to deal with observation-level attributes
  - New second sub-index referencing the attribute, so that series values and their observation-level attributes can be stored together
- Concurrent database updating
  - The current update mechanism locks at the database level, which prevents users from updating data in the same database concurrently
30. High priority end user improvements
- Friendly menu-driven environment for end users & developers
  - Take a page from SAS, eViews and others, which provide a menu-driven experience for building statistical studies
  - Take a page from Eclipse or Visual Studio .NET, which provide an integrated development environment complete with syntax wizards, auto-completion facilities and debuggers
- Attribute searching facilities
  - Wildcard searching on the information stored in attributes that performs with the same level of efficiency as the searching facilities for series name and series alias
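What wildcard searching on attributes might look like from the user's side can be sketched with Python's standard `fnmatch` module. This is a linear scan over an in-memory catalog, so it illustrates only the requested behaviour and not the indexed efficiency the roadmap item asks for; the `search_attributes` function and the catalog contents in the usage note are hypothetical.

```python
from fnmatch import fnmatchcase


def search_attributes(catalog, attribute, pattern):
    """Return names of objects whose `attribute` value matches a wildcard pattern.

    `catalog` maps object names to attribute dictionaries. Patterns use
    shell-style wildcards: * for any run of characters, ? for one character.
    Objects lacking the attribute never match.
    """
    return sorted(
        name
        for name, attrs in catalog.items()
        if attribute in attrs and fnmatchcase(str(attrs[attribute]), pattern)
    )
```

For example, searching a `description` attribute with the pattern `*price index*` would return every object whose description contains that phrase, regardless of the object's name or alias.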
31. Statistical Data & Metadata eXchange
SDMX is not just tagging of data with XML. Its intent is to be a standard for data interrogation (SDMX Registry) and a common format for publishing (SDMX XML output).

SDMX Data: Query Response provider in Web Access
- Included a proof-of-concept (POC) SDMX registry. The metadata for a few Forex-based time series using SDMX structures was placed in this registry by hand. The Registry was implemented in MySQL.
- The SDMX Custom Service Provider (CSP) implemented the SDMX V1.0 standard.
- It included SDMX-supported query response. The idea is that you first query the MySQL database that holds the SDMX Registry to discover what data is available at that site.
- A second call then requests the data itself. That call was in SDMX XML form; the CSP received the request and consulted the SDMX Registry, which had a hard mapping to the Forex time series.
- The response was sent back in SDMX XML V1.0 format.
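The two-call pattern (discover what is available, then resolve a requested series through a hard mapping) can be sketched with an in-memory SQLite table standing in for the MySQL registry. The table layout, the SDMX-style keys and the mapped series names below are all invented for illustration; the sketch does not reproduce the proof of concept's schema or emit SDMX XML.

```python
import sqlite3


def build_registry():
    """Create a toy registry: one row per series, mapping an SDMX-style key to a stored name."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE registry (sdmx_key TEXT PRIMARY KEY, series_name TEXT NOT NULL)"
    )
    # Hypothetical hand-entered rows, as in a proof of concept.
    conn.executemany(
        "INSERT INTO registry VALUES (?, ?)",
        [("FX.D.USD.EUR", "USD.EUR.SPOT"), ("FX.D.USD.JPY", "USD.JPY.SPOT")],
    )
    return conn


def discover(conn):
    """First call: list the keys of the data available at this site."""
    rows = conn.execute("SELECT sdmx_key FROM registry ORDER BY sdmx_key")
    return [row[0] for row in rows]


def resolve(conn, sdmx_key):
    """Second call: map a requested key to the underlying time series name."""
    row = conn.execute(
        "SELECT series_name FROM registry WHERE sdmx_key = ?", (sdmx_key,)
    ).fetchone()
    if row is None:
        raise KeyError(sdmx_key)
    return row[0]
```

A client would call `discover` to see which keys exist and then `resolve` for each key of interest; the service would fetch the named series and format the response.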