All Contents © 2006 Burton Group. All rights reserved. Data Modeling is Underrated: A Bright Future Ahead in the Grand Schema Things. Peter O'Kelly, Research.

1 All Contents © 2006 Burton Group. All rights reserved. Data Modeling is Underrated: A Bright Future Ahead in the Grand Schema Things. Peter O'Kelly, Research Director. Thursday, November 30, 2006

2 Data Modeling is Underrated: Agenda. Synopsis; ~7-minute summary; Discussion; Extended-play overview (for reference); Analysis: market snapshot, market trends, market impact, recommendations.

3 Data Modeling is Underrated: Synopsis. Data modeling used to be seen primarily as part of database analysis and design, for DBMS nerds only. There is now growing appreciation for the value of logical data modeling in many domains, both technical and non-technical. Historically, most data modeling techniques and tools have been inadequate, and often focused more on implementation details than on logical analysis and design. Pervasive use of XML and broader exploitation of metadata, along with improved techniques and tools, are making data modeling more useful for all information workers (as well as data nerds). Data modeling is a critical success factor for XML, in SOA and elsewhere. Data modeling is now: a fundamental part of the back-to-basics trend in application development; key to effective exploitation of emerging applications and tools; essential to regulatory compliance (e.g., information disclosure tracking).

4 Data Modeling is Underrated: ~7-minute summary. Logical data modeling is often misunderstood and underrated. Models of real-world things (entities), attributes, relationships, and identifiers. Logical => technology-independent (not implementation models). Logical data modeling is not 1:1 with relational database design. It's as much about building contextual consensus among people as it is about capturing model design for software systems. It's also exceptionally useful for database design, however. Some of the historical issues: costly, complex, and cumbersome tools/techniques; disproportionate focus on physical database design.

5 Data Modeling is Underrated: ~7-minute summary. Logical data modeling is more relevant than ever before. Entities, attributes, relationships, and identifiers: none of these is optional if you seek to respect and accommodate real-world complexity and to establish robust, shared context with other people. Revenge of the DBMS nerds: not just for normalized number-crunching anymore… Native DBMS XML data model management => fundamental changes. XQuery: relational calculus for XML. SQL and XQuery have very strong synergy. All of the capabilities that made DBMSs useful in the first place apply to XML as well as to traditional database models. DBMS price/performance and other equations have radically improved. Logical modeling tools/techniques are more powerful and intuitive, and less expensive.

6 Data Modeling is Underrated: ~7-minute summary. XML-based models are useful but insufficient. Document-centric meta-meta-models are not substitutes for techniques based on entities, attributes, relationships, and identifiers. Some XML-centric techniques have a lot in common with pre-relational data model types (hierarchical and network navigation) or mutant object database models. XML also unfortunately has ambiguous aspects, like the Entity-Relationship (E-R) model. Logical data modeling is not ideal for document-oriented scenarios (involving narrative, hierarchy, and sequence; optimized for human comprehension). But a very large percentage of XML today is data-centric rather than document-centric. And increasingly pervasive beyond-the-basics hypertext (with compound and interactive document models) is often more data-centric than document-centric.

7 Data Modeling is Underrated: ~7-minute summary. Ontology is necessary but insufficient. Categorization is obviously a useful organizing construct. Folksonomies are also often very effective. But… categorization is just one facet of modeling. Many related techniques are conducive to insufficient model detail, creating ambiguity and unnecessary complexity, e.g., for model mapping. So… we're now seeing microformats and other new words … that are fundamentally focused on logical data model concepts. It'd be a lot simpler and more effective to start with logical data models in the first place.

8 Data Modeling is Underrated: Discussion

9 [Extended-play version] Analysis: Market snapshot. Data modeling concepts; data modeling benefits; data modeling in the broader analysis/design landscape; why data modeling hasn't been used more pervasively.

10 Market Snapshot: Data modeling concepts: the joy of sets. Core concepts. Entity: a type of real-world thing of interest; anything about which we wish to capture descriptions. More precisely, an entity is an arbitrarily defined but mutually agreed upon classification of things in the real world. Examples: customer, report, reservation, purchase. Attribute: a descriptor (characteristic) of an entity. A customer entity, for example, is likely to have attributes including customer name, address, … Relationship: a bidirectional connection between two entities, composed of two links, each with a link label/descriptor. Example: customer has dialogue; dialogue of customer. Identifier: one or more descriptors (attributes and/or relationship links) that together uniquely identify entity instances, e.g., CustomerID.
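
The four core concepts can be sketched in code. This is only an illustration, assuming Python dataclasses as the notation; the class, field, and function names are hypothetical and do not come from the slides, and the sample values are borrowed loosely from the later instance slide.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Tuple


@dataclass(frozen=True)
class Customer:
    """Entity: a type of real-world thing we want to describe."""
    customer_id: str   # identifier: uniquely identifies each customer
    industry: str      # attribute: a descriptor of the entity


@dataclass(frozen=True)
class Dialogue:
    """Entity identified by a relationship link plus an attribute."""
    customer_id: str      # relationship link: "dialogue of customer"
    dialogue_date: date   # attribute that also takes part in the identifier
    topic: str            # plain attribute

    @property
    def identifier(self) -> Tuple[str, date]:
        # The identifier combines a relationship link and an attribute.
        return (self.customer_id, self.dialogue_date)


def dialogues_of(customer: Customer, dialogues: List[Dialogue]) -> List[Dialogue]:
    """Follow the 'customer has dialogue' link from the customer side."""
    return [d for d in dialogues if d.customer_id == customer.customer_id]


# Illustrative instances (hypothetical pairing of values).
acme = Customer("017823", "Manufacturing")
talks = [
    Dialogue("017823", date(2004, 10, 14), "Portal"),
    Dialogue("75912", date(2005, 6, 18), "Data architecture"),
]
acme_talks = dialogues_of(acme, talks)
```

The relationship is bidirectional in the model: `dialogues_of` walks it from the customer side, while the `customer_id` field on `Dialogue` carries the link back to the customer.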

11 Market Snapshot: Data modeling concepts: example data model fragment diagram. Following Carlis/Maguire (from their data modeling book): About each customer, we can remember its name, industry, address, renewal date, and ID. Each customer is identified by its ID. About each dialogue, we can remember its customer and its date, topic, and analyst. Each dialogue is identified by its customer and its date. [Diagram callouts: entities, attributes, relationship, identifiers.] [Note: this model fragment is an example and is not very well-formed]

12 Market Snapshot: Data modeling concepts: example data model instance.
Customer table (values listed column by column, in row order):
CustomerID (PK1): 017823, 75912, 91641
CustomerName: Acme Widgets Degrees 4U
CustomerIndustry: Manufacturing, Financial services, Education
CustomerAddress: 123 Main Street…, 456 Central…, P.O. Box 1642…
CustomerRenewalDate: 2005/10/14, 2006/05/28, 2004/12/31
Dialogue table (values listed column by column, in row order):
CustomerID (PK1, FK1): 75912, 91641, 017823
DialogueDate (PK1): 2005/06/18, 2003/12/13, 2004/10/14
DialogueTopic: Data architecture, SIP/SIMPLE, Portal
DialogueAnalyst: Peter O'Kelly, Mike Gotta, Craig Roth
PKn: participates in primary key. FKn: participates in foreign key.
Bonus: it's very simple to create instance models (and thus relational database designs) from well-formed logical data models.
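
As a rough sketch of that last point, the Customer/Dialogue fragment can be turned into a relational design with Python's built-in `sqlite3` module. The column subset, the `TEXT` types, and the pairing of values into rows (taken in the order they appear on the slide) are assumptions made for illustration; the `rejected` helper is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when asked
conn.executescript("""
CREATE TABLE Customer (
    CustomerID       TEXT PRIMARY KEY,           -- identifier
    CustomerIndustry TEXT NOT NULL               -- attribute
);
CREATE TABLE Dialogue (
    CustomerID      TEXT NOT NULL REFERENCES Customer(CustomerID),
    DialogueDate    TEXT NOT NULL,
    DialogueTopic   TEXT,
    DialogueAnalyst TEXT,
    PRIMARY KEY (CustomerID, DialogueDate)       -- composite identifier
);
""")

conn.executemany("INSERT INTO Customer VALUES (?, ?)", [
    ("017823", "Manufacturing"),
    ("75912", "Financial services"),
    ("91641", "Education"),
])
conn.executemany("INSERT INTO Dialogue VALUES (?, ?, ?, ?)", [
    ("75912", "2005/06/18", "Data architecture", "Peter O'Kelly"),
    ("91641", "2003/12/13", "SIP/SIMPLE", "Mike Gotta"),
    ("017823", "2004/10/14", "Portal", "Craig Roth"),
])


def rejected(params):
    """Return True if the DBMS refuses to insert this Dialogue row."""
    try:
        conn.execute("INSERT INTO Dialogue VALUES (?, ?, ?, ?)", params)
    except sqlite3.IntegrityError:
        return True
    return False


# Same customer and date as an existing row: violates the identifier.
duplicate_rejected = rejected(("75912", "2005/06/18", "Repeat", "Someone"))
# Dialogue for a customer that does not exist: violates the relationship.
orphan_rejected = rejected(("00000", "2006/01/01", "Orphan", "Someone"))

rows = conn.execute("""
    SELECT c.CustomerIndustry, d.DialogueTopic
    FROM Customer c JOIN Dialogue d USING (CustomerID)
    ORDER BY d.DialogueDate
""").fetchall()
```

The identifier and relationship rules from the logical model end up enforced by the DBMS itself rather than by application code.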

13 Market Snapshot: Data modeling benefits. Precision and consistency. High-fidelity models, which are easier to maintain in order to reflect real-world changes. Improved: ability to analyze, visualize, communicate, collaborate, and build consensus; potential for data reuse (a fundamental DBMS goal; easier to recognize common shapes and patterns); impact analysis (e.g., what-if assessments for proposed changes). Exploitation of tools, servers, and services: DBMSs and modern design tools/services assume well-formed data models. Being normal is not enough… SOA, defined in terms of schemas, requires data model consensus.

14 Market Snapshot: Data modeling in the broader analysis/design landscape. Four dimensions to consider: data, process, and events; roles/concerns/views (strategic, operational, and technology); logical and physical; current/as-is and goal/to-be states.

15 Market Snapshot: Data, process, and events. Think of nouns, verbs, and state transitions. Data: describes structure and state at a given point in time. Process: algorithm for accomplishing work and state changes. Event: trigger for data change and/or other action execution. Integrated models are critically important. Data modeling, for example, is guided by process and event analyses; otherwise scope creep is likely. There is no clear right/wrong in data modeling: scope and detail are determined by the processes and events you wish to support, and they often change over time.

16 Market Snapshot

17 Market Snapshot: Roles/concerns/views. Three key dimensions. Strategic: organization mission, vision, goals, and strategy. Operational: data, process, and event details to support the strategic view. Technology: systems (applications, databases, and services) to execute operations. Again, it is pivotal to maintain integrated models. Data modeling that's not guided by higher-level goal modeling can suffer from scope creep and become an academic exercise.

18 Market Snapshot: Logical and physical. Another take on operational/technology. Logical: technology-independent data, process, and event models. Examples: Entity-Relationship (ER) diagram; data flow diagram (process model). Physical: logical models defined in software (doesn't imply illogical…). Examples: data definition language statements for database definition, including details such as indexing and table space management for performance and fault tolerance; class and program modules in a specific programming language. Integration and alignment between logical and physical are key, but are often far from ideal in practice today.

19 Market Snapshot: Current/as-is and goal/to-be states. Combining as-is/to-be states and logical/physical. Current/as-is, logical: technology-independent view of current systems. Current/as-is, physical: systems already in place; the stuff we need to live with… Goal state/to-be, logical: real-world model unconstrained by current systems. Goal state/to-be, physical: new system view with high-fidelity mapping to logical goal state.

20 Market Snapshot: Why data modeling hasn't been used more pervasively. So, why isn't everybody doing this? Data modeling is hard work. Historically: disproportionate focus on physical modeling; inadequate techniques and tools; suboptimal burden of knowledge distribution. Reduced green-field application development. Data modeling has a mixed reputation.

21 Market Snapshot: Data modeling is hard work. It's straightforward to read well-formed data models, but it's often very difficult to create them. Key challenges: capturing and accommodating real-world complexity; dealing with existing applications and systems; organizational issues (collaboration and consensus-building; role definitions and incentive systems that discourage designing for reuse and working with other project teams; politics).

22 Market Snapshot: Historically disproportionate focus on physical modeling. Radical IT economic model shifts during recent years. Design used to be optimized for scarce computing resources, including MIPS, disk space, and network bandwidth. The Y2K crisis is a classic example of the consequences of placing too much emphasis on physical modeling-related constraints. Relatively stand-alone systems discouraged designing for reuse. Now: applications are increasingly integrated, e.g., SOA; hardware and networking resources are abundant and inexpensive; the ability to flexibly accommodate real-world changes is mission-critical. Logical modeling is more important than ever before.

23 Market Snapshot: Historically inadequate techniques and tools. Tendency to focus on physical, often product-specific (e.g., PeopleSoft or SAP) models. Lack of robust repository offerings, making it very difficult to discover, explore, and share/reuse models. Entity-Relationship (ER) model: more of an ambiguous and incomplete diagramming technique, but still the de facto standard for data modeling.

24 Market Snapshot: Tangent: ER, what's the matter? Entity-Relationship deficiencies, per E. F. Codd [1990]: Only the structural aspects were described; neither the operators upon those structures nor the integrity constraints were discussed. Therefore, it was not a data model. The distinction between entities and relationships was not, and is still not, precisely defined. Consequently, one person's entity is another person's relationship. Even if this distinction had been precisely defined, it would have added complexity without adding power. Source: Codd, The Relational Model for Database Management, Version 2.

25 Market Snapshot: Tangent: ER, what's the matter? Many vendors have addressed some original ER limitations, but the fact that ER is ambiguous and incomplete has led to considerable problems. The Logical Data Structure (LDS) technique is much more consistent and concise, but it's only supported by one tool vendor (Grandite). It's possible to use the ER-based features in many tools in an LDS-centric approach, however. Ultimately, diagramming techniques are simply views atop an underlying meta-meta model. The most useful tools now include: well-designed and integrated meta-meta models; options for multiple view types, including data, process, and event logical views, as well as assorted physical views.

26 Market Snapshot: Historically inadequate techniques and tools. Unfortunate detours such as overzealous object-oriented analysis and design. Class modeling is not a substitute for data modeling. "Everything is an object" and system-assigned identifiers often mean insufficient specificity and endless refactoring. It is fine to capture entity behaviors and to highlight generalization, but you still need to be rigorous about entities, attributes, relationships, and identifiers. No "Dummies Guide to Logical Data Modeling". E.g., normalization: a useful set of heuristics for assessing and fixing poorly-formed data models. But there has been a shortage of useful resources for people who seek to develop data modeling skills in order to create well-formed data models in the first place. Result: often intimidating levels of complexity…
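
To make the normalization remark concrete, here is a minimal sketch of one such heuristic: checking whether a claimed functional dependency actually holds in some rows. The function name `fd_violations` and the flat, denormalized sample rows are hypothetical and exist only for illustration.

```python
def fd_violations(rows, determinant, dependent):
    """Return determinant values that map to more than one dependent value.

    An empty result means the dependency determinant -> dependent holds
    for these rows; a non-empty result points at inconsistent data.
    """
    seen = {}
    for row in rows:
        key = tuple(row[col] for col in determinant)
        seen.setdefault(key, set()).add(row[dependent])
    return {key: values for key, values in seen.items() if len(values) > 1}


# Hypothetical flat table that repeats the customer's industry on every
# dialogue row, so the same fact can drift out of sync.
flat_dialogues = [
    {"customer_id": "75912", "dialogue_date": "2005/06/18",
     "industry": "Financial services"},
    {"customer_id": "75912", "dialogue_date": "2006/02/01",
     "industry": "Banking"},
    {"customer_id": "91641", "dialogue_date": "2003/12/13",
     "industry": "Education"},
]

# Industry should depend on the customer alone, but the rows disagree.
by_customer = fd_violations(flat_dialogues, ["customer_id"], "industry")
# Against the full dialogue identifier the rows look consistent.
by_identifier = fd_violations(
    flat_dialogues, ["customer_id", "dialogue_date"], "industry"
)
```

The mismatch reported in `by_customer` is the kind of update anomaly that moving industry into a separate customer entity would rule out.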

27 Market Snapshot: Historically inadequate tools and techniques. An Object Role Modeling (ORM) example. Consistent and concise, but also overwhelming. Doesn't scale well for more complex modeling domains. Useful for some designers, but not as useful for collaborative modeling with subject matter experts who don't seek to master the technique. Source:

28 Market Snapshot: Historically suboptimal burden of knowledge distribution. Following Carlis: knowledge is generally captured in three places: resource managers/systems such as DBMSs; applications/programs; people's heads. Universally-applicable data, process, and event details are ideally captured in DBMSs. Applications can be circumvented and are often cruelly complex. People come and go (and take their knowledge with them). But in recent years, DBMSs have been relegated to reduced roles. This is suboptimal in many data modeling-related respects, and has often meant inappropriate distribution of the burden of knowledge. DBMSs (and thus data modeling) are now resurgent, however.

29 Market Snapshot: Reduced green-field application development. Following the enterprise shift toward purchased-and-customized applications such as ERP and CRM: start with models supplied by the vendor, usually with major penalties for extensive customization. So we often see enterprises changing their operations to match purchased applications instead of the other way around. In many cases, packaged applications: follow least-common-denominator approaches in order to support multiple DBMS types; capture universally-applicable data/process/event model facets at the application tier instead of in DBMSs (a far from ideal distribution of the burden of knowledge); trade off increased complexity for increased generality (good for application vendors; not always so good for customer organizations). Overall, this has often resulted in: reduced incentives and utility for data modeling; many organizations deferring to application suppliers for data models, often with undesirable results such as lock-in and endless consulting.

30 Market Snapshot: Recap: data modeling has a mixed reputation. Because of the historical challenges, the return on data modeling time investment has been far from ideal, owing to a lack of best practices, techniques, and tools, and to environmental dimensions that reduced the utility of data modeling. Many enterprise data modeling projects became IT full-employment acts, with endless scope creep and unclear milestones, completion criteria, and return on investment. As a result, enterprise data modeling endeavors have become scarcer during recent years, with the relentless IT focus on ROI and TCO. Obviously an untenable situation: both IT people and information workers are increasingly making decisions when they literally don't know what they're talking about, due to the lack of high-quality, high-fidelity data models.

31 Analysis: Market trends. Back to data basics; broader and deeper data modeling applicability; availability of more and better data models; simpler and more effective techniques and tools; increasing data modeling utility, requirements, and risks.

32 Market Trends: Back to data basics. Growing appreciation for: the reality that all bets are off if you're not confident you have established consensus about goals, nouns, verbs, and events; software development life cycle economic realities. It's much more disruptive and expensive to correct models the further you progress through the analysis, design, implementation, and maintenance phases. Less expensive hardware and networking means the return on time investment for logical modeling is increasing while the return for physical modeling is decreasing. Indeed, emerging model-driven tools increasingly make it possible for the logical model to serve as the application specification, with penalties for developers who insist on endlessly tweaking the generated physical models (code).

33 Market Trends: Broader and deeper data modeling applicability. SOA is one of the most significant data modeling-related developments of recent years. All about services, but with a deep data model prerequisite. Don Box: services share schemas and contract, not class. From a DBMS-centric world view, web services => pragmatic XML evolution: parameterized queries, as in DBMS stored procedures; structured and grouped query results. SOA has also driven the need for web services repository (WSR) products. Increasingly powerful tools for information workers have also expanded the applicability of data modeling. An early example: Business Objects, focused on making data useful for more people through data model abstractions. Similar capabilities are now available throughout products such as Microsoft Office. Recent developments such as XQuery will dramatically advance the scope and power of applied set theory.

34 Market Trends: Availability of more and better models. Resources such as books focused on the topic area, e.g., Carlis/Maguire and David Hay's Data Model Patterns. Products that include expansive data models, ranging from ERP to recent data model-focused offerings such as NCR Teradata's logical data model-based solutions. Universal model resources from enterprise architecture tool vendors such as Visible Systems, based on decades of in-market enterprise modeling experience.

35 Market Trends: Availability of more and better models. Standards groups and initiatives, such as ACCORD, Open Applications Group, and OASIS Universal Business Language. Models developed by enterprises and government agencies, e.g., Canada's Integrated Justice Information (IJI) initiative, which provides a data model and context framework for all aspects of law enforcement. No magic: a multi-year effort with pragmatic hard work and governance. Similar initiatives are now under way in the United States and other countries.

36 Market Trends: Simpler and more effective techniques and tools. Most now include: cleaner separation of concerns and more intuitive user experiences. For data modeling: ER subsets/refinements that reduce ambiguity and notational complexity, and that support view preferences with variable levels of detail. Integrated meta-meta models and unified repositories, supporting enterprise architecture models such as the Zachman Framework as navigational guides. There's still a perplexing lack of repository-related standards, however.

37 Market Trends: Data modeling in the enterprise architecture landscape. Relative to the Zachman Framework. Source:

38 Market Trends: Simpler and more effective techniques and tools. Most now include (continued): model-driven analysis and design tools, building on virtualization and application frameworks with declarative services for transactions, security, and more. Even more incentive to focus more on logical models and less on physical models. More powerful and robust forward- and reverse-engineering capabilities, to transform physical => logical as well as logical => physical. Many are also available at much lower cost, and some open source modeling tools have emerged.
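
The forward- and reverse-engineering idea can be sketched in a few lines. This is a toy illustration, assuming a dictionary as the logical model and SQLite (via Python's built-in `sqlite3`) as the physical target; the `to_ddl` and `from_db` helpers and the model layout are hypothetical, not how any particular modeling tool works.

```python
import sqlite3

# Hypothetical logical model: attributes with types, plus the identifier.
logical = {
    "Dialogue": {
        "attributes": {
            "CustomerID": "TEXT",
            "DialogueDate": "TEXT",
            "DialogueTopic": "TEXT",
        },
        "identifier": ["CustomerID", "DialogueDate"],
    }
}


def to_ddl(name, entity):
    """Forward-engineer: logical entity => CREATE TABLE statement."""
    parts = [f"{col} {typ}" for col, typ in entity["attributes"].items()]
    parts.append(f"PRIMARY KEY ({', '.join(entity['identifier'])})")
    return f"CREATE TABLE {name} ({', '.join(parts)})"


def from_db(conn, name):
    """Reverse-engineer: read the table definition back from the database."""
    # table_info rows are (cid, name, type, notnull, dflt_value, pk),
    # where pk is the 1-based position in the primary key, or 0.
    info = conn.execute(f"PRAGMA table_info({name})").fetchall()
    keyed = sorted((row[5], row[1]) for row in info if row[5] > 0)
    return {
        "attributes": {row[1]: row[2] for row in info},
        "identifier": [col for _, col in keyed],
    }


conn = sqlite3.connect(":memory:")
conn.execute(to_ddl("Dialogue", logical["Dialogue"]))
recovered = from_db(conn, "Dialogue")
```

Round-tripping works here only because the toy model is tiny; real tools also have to recover relationships, constraints, and naming conventions that a physical schema may not record.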

39 Market Trends: Increasing data modeling utility, requirements, and risks. To recap: much more utility from effective data modeling. Related trends and risks. Regulatory compliance requirements, especially concerning information disclosure: it is impossible to track what's been disclosed (both by and to whom) if you don't know what you're managing and who has access to it. Increasing demand for reverse-engineering tools in order to better understand existing systems and interactions. Cognitive overreach: the potential for information workers to create nonsensical queries based on poorly-designed data models. The queries will often execute and return arbitrary results, with which people will make equally arbitrary business decisions.
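
A small, hypothetical example of such a query, using Python's built-in `sqlite3`: the `Invoice` table and its amounts are invented for illustration. Joining two independent child tables through a shared customer key runs without error but silently multiplies rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Dialogue (CustomerID TEXT, DialogueDate TEXT);
CREATE TABLE Invoice  (CustomerID TEXT, Amount INTEGER);
INSERT INTO Dialogue VALUES ('75912', '2005/06/18'), ('75912', '2006/02/01');
INSERT INTO Invoice  VALUES ('75912', 100), ('75912', 100);
""")

# Plausible-looking but meaningless: each invoice is paired with each
# dialogue of the same customer, so every amount is counted twice.
naive_total = conn.execute("""
    SELECT SUM(i.Amount)
    FROM Dialogue d JOIN Invoice i ON i.CustomerID = d.CustomerID
""").fetchone()[0]

# The figure the user presumably wanted.
correct_total = conn.execute("SELECT SUM(Amount) FROM Invoice").fetchone()[0]
```

Nothing in the DBMS flags the first query as wrong, because dialogues and invoices have no direct relationship in the model; only someone who understands the model can see that.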

40 Analysis: Market impact. Pervasive data modeling and model-driven analysis/design; vendor consolidation and superplatform alignment; potentially disruptive market dynamics.

41 Market Impact: Pervasive data modeling and model-driven analysis/design. No longer optional (never really was). Most of today's software products assume effective data modeling. Using a DBMS or an abstraction layer such as Microsoft's ADO.NET with poorly-designed data models results in significant penalties. Often implicit, e.g., in: information worker-oriented tools such as the query and data manipulation tools included in Microsoft Office (not a recent development, e.g., consider the > $1B annual market for products such as Apple FileMaker Pro and Microsoft Access, but rapidly expanding); future offerings such as Microsoft Vista and Microsoft Office 2007, which are deeply data model- and schema-based, for documents, messages, calendar entries, and more, all with extensible schemas and tools for direct information worker metadata manipulation actions.

42 Market Impact: Vendor consolidation and superplatform alignment. A familiar pattern (commoditization, standardization, and consolidation), resulting in: significant merger/acquisition activity; shifting product categories, in this context including: specialized/focused modeling tools, including widely-used products such as Microsoft Visio; enterprise architecture/application lifecycle management tool suites (essentially CASE++, with more and better integrated tools, deeper standards support, and often with support for strategic views; examples: Borland, Embarcadero, Grandite, Telelogic, Visible); superplatform-aligned tool suites (IBM, Microsoft, and Oracle, for example, all either now offer or plan to soon offer end-to-end model-driven tool suites; IBM currently has a significant market lead, through its Rational acquisition). Broader support for interoperability-focused standards initiatives such as XMI (OMG's XML Metadata Interchange specification).

43 Market Impact: Vendor consolidation and superplatform alignment. Some CASE and modeling tool vendor merger/acquisition activity: SDP S-Designor => PowerSoft => Sybase PowerDesigner; Popkin => Telelogic; TogetherSoft => Borland; Rational => IBM; Visio => Microsoft.

44 Market Impact: Potentially disruptive market dynamics. Opportunities for new or refocused entrants, e.g.: Adobe: a potential leader in WSR following its acquisition of Yellow Dragon Software. Adobe doesn't offer data modeling tools, but it has a broad suite of tools that exploit XML and data models. The urgent need for WSR products could result in SOA-centric repository offerings expanding to encompass more traditional repository needs as well. Altova: expanding into UML modeling from its XML mapping/modeling franchise. Microsoft: Visual Studio Team System (VSTS) is Microsoft's first direct foray into modeling tools. It used to offer Visual Modeler, a little-used OEM'd version of Rational Rose. VSTS won't initially include data modeling tools, but they are part of the plan for future releases. MySQL AB: acquired an open source data modeling tool (DBDesigner 4) and is preparing to reintroduce an expanded version (which will remain open source).

45 Market Impact: Potentially disruptive market dynamics. New challenges for UML, with significant implications. UML is the most widely-used set of diagramming techniques today, but it's not particularly useful for data modeling, and it has some ambiguities and limitations. Microsoft and some other vendors believe domain-specific languages (DSLs) are more effective than UML for many needs. If UML falters, vendors that have placed strategic bets on UML (such as Borland, IBM, and Oracle) will face major challenges. Open source modeling initiatives. Some examples: ArgoUML; MySQL's future Workbench tool set; MyEclipse ($29.95 annual subscription for multifaceted tools with modeling). These initiatives will accelerate modeling tool commoditization and standardization.

46 Market Impact: The U in UML stands for unified, not universal. UML is in some ways ambiguous and is not a substitute for data modeling. Some tools include UML profiles for data modeling, however. UML profiles are similar to domain-specific languages in many respects. It's not clear that UML is ideal for meta-meta-meta models. UML represents unification of three leading diagramming techniques, but it's not universally applicable. UML is much better than not using any modeling/diagramming tools, but it's not a panacea, although it's getting more expressive and consistent with UML v2.

47 Analysis: Recommendations. Think and work in models; build and use model repositories; create high-fidelity modeling abstractions for SOA; revisit modeling tool vendor assumptions and alternatives; respect and accommodate inherent complexity.

48 Recommendations: Think and work in models. Develop skills and experience in: thinking at the type level of abstraction; using set-oriented query tools/services. Data modeling utility now extends far beyond database analysis and design. Information workers who have effective data modeling skills will be much more productive. Use data modeling to analyze, visualize, communicate, and collaborate. Provide guidance in: data modeling training and tools; selecting appropriate tools (don't use ambiguous or incomplete diagramming techniques); making resources available in models.

49 Recommendations: Build and use model repositories. Do not: needlessly recreate/reinvent models; default to exclusively extrapolating models from existing XML schemas or query results. Reality check: that's how most XML-oriented modeling is done today, but it often propagates suboptimal designs and limits reuse. This may seem familiar: it repeats an early DBMS pattern, when many developers simply moved earlier file designs into DBMSs rather than checking design assumptions/goals. Ensure policies and incentive systems are in place to encourage and reward model sharing via repositories. Add to data governance strategy.

50 Recommendations: Create high-fidelity modeling abstractions for SOA. SOA is rapidly becoming a primary means of facilitating inter-application integration. Robust SOA schema design entails abstraction layers. Exposing public interfaces to private systems otherwise often means propagating suboptimal data model design decisions. Sharing services with users whom you may never actually meet makes unambiguous and robust models more important than ever. WSR is likely to become a key part of enterprise model repository strategy, encompassing contexts and models that aren't exclusively SOA-focused.
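
One way to picture such an abstraction layer, as a sketch only: the private column names, the industry codes, and the `PublicCustomer` and `to_public` names below are all hypothetical. The service publishes a schema expressed in logical-model terms and maps the private, physical layout onto it, so consumers never see the internal naming or encoding decisions.

```python
from dataclasses import asdict, dataclass

# Hypothetical private row, shaped by historical physical design choices.
PRIVATE_ROW = {"CUST_NO": "91641", "CUST_NM_TX": "  Example Co ", "IND_CD": "EDU"}

# Hypothetical code table that only the private system knows about.
INDUSTRY_NAMES = {
    "EDU": "Education",
    "MFG": "Manufacturing",
    "FIN": "Financial services",
}


@dataclass(frozen=True)
class PublicCustomer:
    """Public schema: named after the logical model, not the storage."""
    customer_id: str
    name: str
    industry: str


def to_public(row):
    """Translate a private row into the shared, published schema."""
    return PublicCustomer(
        customer_id=row["CUST_NO"],
        name=row["CUST_NM_TX"].strip(),         # hide storage padding
        industry=INDUSTRY_NAMES[row["IND_CD"]],  # hide internal codes
    )


payload = asdict(to_public(PRIVATE_ROW))
```

If the private tables are later redesigned, only `to_public` needs to change; the published schema, and every consumer built against it, stays the same.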

51 Recommendations: Revisit modeling tool vendor assumptions and alternatives. Think form-follows-function. Whiteboard and pencil-and-paper often suffice for information worker contexts, and are generally more conducive to productive modeling sessions. Enterprise architecture-related modeling, in contrast, should be done with integrated and repository-based tool suites. Align with superplatform commitments, e.g.: if IBM-focused, Rational is an obvious candidate; Microsoft-focused customers need tactical plans until Microsoft delivers a more comprehensive VSTS; Oracle customers should revisit Oracle Developer Suite 10g, which includes Oracle Designer. Organizations using a mix of DBMSs can benefit from using tools from specialists such as Embarcadero, Telelogic, and Visible Systems. Explore open source-related modeling initiatives, and expect very rapid open source modeling initiative expansion/evolution.

52 Recommendations: Respect and accommodate inherent complexity. Modeling is and will remain hard work. Modeling is simpler and more effective when people can work with common techniques, tools, repositories, and collections of high-fidelity data models. But the real world is increasingly complex and dynamic, and effective models must reflect those realities. Politics and other inter-personal communication challenges are also not going away, especially in virtual organizations. Neither over-simplify nor over-reach. Suboptimal modeling and design decisions can cause much more damage in today's SOA-centric world; they mean sub-optimally shifting the burden of knowledge. Information worker-oriented power tools mean the potential for cognitive overreach is rapidly rising for people who (directly or indirectly) work with ambiguous or otherwise poorly-designed models.

53 Conclusion: Data modeling is not just for databases anymore. Data modeling is pivotal for analysis, visualization, communication, and collaboration. Organizations that do incomplete or otherwise inadequate data modeling: will fail to fully exploit today's leading tools, servers, and services; will not be able to comply with regulatory requirements, especially for information disclosure. Data modeling is not easy, but it has a very strong return on time investment. It's not optional, so enterprises need to do it well. The timing and tools have never been better.

54 Resources. Burton Group Content: Business Process Modeling: Adding Value or Overhead?; Data Modeling: Not Just for Databases Anymore; XML Modeling and Mapping: Tumultuous Transformation in the Grand Schema Things; Model-Driven Development: Rethinking the Development Process. Related Resources: John Carlis, Joseph Maguire. Mastering Data Modeling: A User-Driven Approach. Addison-Wesley, 2001. Jack Greenfield, Keith Short. Software Factories: Assembling Applications with Patterns, Models, Frameworks, and Tools. Wiley, 2004. David C. Hay. Data Model Patterns: Conventions of Thought. Dorset House Publishing, 1995. Martin Fowler. UML Distilled (3rd ed.). Addison-Wesley, 2004.

55 Data Model Examples: Basic wiki model

56 Data Model Examples: Socialtext wiki model
