Migration of a 4GL and Relational Database to Unicode
Tex Texin, International Product Manager
© 1998, Progress Software Corporation

Presentation transcript:

© 1998, Progress Software Corporation 1 Migration of a 4GL and Relational Database to Unicode
Tex Texin, International Product Manager

© 1998, Progress Software Corporation 2 Presentation Goals
Outline Migration Steps
Describe Design Considerations
Leverage Existing Double-byte Implementation
Describe Impact on 4GL and Report Formats

© 1998, Progress Software Corporation 3 PROGRESS Application Development Suite
Powerful tools for the rapid creation of distributed business applications
Creates character, GUI, or web-based clients with common source
Host-based, client-server, or n-tier distribution on a variety of platforms
Scalable, robust RDBMS and open
International, double-byte enabled

© 1998, Progress Software Corporation 4 Possible Configuration Options
[Diagram. Labels: Optional n-tier Application Server; Database Server; Progress Database; Host-based; Character Client; GUI Client; Client-Server; Web-based Client; Other Database]

© 1998, Progress Software Corporation 5 Why do our customers need Unicode?
Many do not...
However,
Multinationals deploy across regions with incompatible character sets, yet they must share data between them.
Programs are distributed worldwide with one container of text in many languages.
Certain applications require multilingual databases, e.g. translation systems and web-based applications.

© 1998, Progress Software Corporation 6 The Existing Architecture
1.5M lines of C code
0.3M lines of 4GL code
Double-byte enabled
– CJK, 9 double-byte charsets supported
– 2-byte only, no 3- or 4-byte
– No shift-sequenced charsets
– DBE changes earmarked, easy to find
– 4 years, 3 developers, 2 QA

© 1998, Progress Software Corporation 7 Estimated cost of implementing UCS-2 was very big!
Changing to 16-bit text units affects almost every source module
– Largest cost is separating char variables based on usage for text or binary data
– Use 16-bit null terminators, ignore 8-bit nulls: A is 0041, 0000; Ā is 0100, 0000
– Pointer arithmetic (advance 2 bytes)
– Sizing (bytes or characters)
– New API to use new WIDE TEXT datatype
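
The null-terminator point can be made concrete with a small sketch. It is written in Python rather than the product's C, and the helper names `c_strlen` and `wide_strlen` are hypothetical, chosen only for this illustration. It assumes big-endian 16-bit units and shows that a byte-oriented scan stops at the zero byte inside the units 0041 and 0100, while a scan for a full 16-bit 0000 unit finds the real terminator.

```python
def c_strlen(buf: bytes) -> int:
    """Length as an 8-bit scan sees it: bytes before the first 0x00 byte."""
    return buf.index(0)


def wide_strlen(buf: bytes) -> int:
    """Length in 16-bit units: units before the first 0x0000 unit."""
    for i in range(0, len(buf) - 1, 2):
        if buf[i] == 0 and buf[i + 1] == 0:
            return i // 2
    raise ValueError("no 16-bit terminator found")


# One character followed by a 16-bit terminator, big-endian byte order (assumed).
text_a = "A".encode("utf-16-be") + b"\x00\x00"          # bytes 00 41 00 00
text_0100 = "\u0100".encode("utf-16-be") + b"\x00\x00"  # bytes 01 00 00 00

print(c_strlen(text_a), wide_strlen(text_a))        # byte scan stops too early
print(c_strlen(text_0100), wide_strlen(text_0100))  # byte scan cuts the unit in half
```

With little-endian units the two cases swap roles, but one of them still contains a leading zero byte, which is why every byte-oriented string routine would need to change.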

© 1998, Progress Software Corporation 8 Product requirements for a multilingual version
Minimize cost for application migration
Minimize cost for application upgrade
Minimize support cost
– One executable!
Maintain user-definable character sets
Add UTF-8 as just another character set
– UTF-8 algorithms are compatible with other charsets

© 1998, Progress Software Corporation 9 Scaled down multilingual proposal: UTF-8 implementation
Implement UTF-8 as 3-byte character set
– Leverage & extend double-byte enabling
– Places to change are already earmarked
– Restrict to composed characters for now
– Restrict to no surrogates
Supports all the markets we are in
UTF-8-enable 4GL and RDBMS first
– Provides multilingual logic and storage
– Java+other client technologies coming
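
Why excluding surrogates makes UTF-8 a character set of at most 3 bytes can be sketched as follows. This is an illustration in Python, not the product's code, and the function name `encode_bmp_utf8` is hypothetical. Every 16-bit value outside the surrogate range D800 to DFFF encodes to 1, 2, or 3 bytes; only characters beyond the 16-bit range would need a fourth byte.

```python
def encode_bmp_utf8(cp: int) -> bytes:
    """Encode one 16-bit, non-surrogate code point as 1 to 3 UTF-8 bytes."""
    if not 0 <= cp <= 0xFFFF:
        raise ValueError("outside the 16-bit range; would need 4 bytes")
    if 0xD800 <= cp <= 0xDFFF:
        raise ValueError("surrogates are excluded from this subset")
    if cp < 0x80:
        # 0xxxxxxx: ASCII is unchanged
        return bytes([cp])
    if cp < 0x800:
        # 110xxxxx 10xxxxxx
        return bytes([0xC0 | (cp >> 6), 0x80 | (cp & 0x3F)])
    # 1110xxxx 10xxxxxx 10xxxxxx
    return bytes([0xE0 | (cp >> 12),
                  0x80 | ((cp >> 6) & 0x3F),
                  0x80 | (cp & 0x3F)])


for cp in (0x41, 0xE9, 0x65E5):
    print(f"U+{cp:04X} -> {encode_bmp_utf8(cp).hex(' ')}")
```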

© 1998, Progress Software Corporation 10 Architecture changes: UTF-8-enabling the string library
N-byte enable character+string functions
– GetNextChar, GetPreviousChar
– GetCharacterSize (table-based)
– Modified IsFirstByte
New GetColumnLength
New datatype normalized BIG char
Minor algorithm changes for efficiency
– Find Character
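
A minimal sketch of what table-based N-byte string primitives might look like, in Python rather than C. The function names echo the ones on the slide, but the signatures and the table layout are assumptions made for illustration, not the actual Progress library. The table maps each possible first byte to the length of the character it starts, for the 1 to 3 byte subset of UTF-8.

```python
# 256-entry table: character size implied by each possible first byte.
# 0 marks bytes that cannot start a character in this subset (trail bytes,
# the invalid lead bytes C0/C1, and lead bytes of 4-byte sequences).
CHAR_SIZE = [0] * 256
for b in range(0x00, 0x80):
    CHAR_SIZE[b] = 1
for b in range(0xC2, 0xE0):
    CHAR_SIZE[b] = 2
for b in range(0xE0, 0xF0):
    CHAR_SIZE[b] = 3


def get_character_size(buf: bytes, pos: int) -> int:
    """Size in bytes of the character starting at pos (0 if pos is not a start)."""
    return CHAR_SIZE[buf[pos]]


def is_first_byte(buf: bytes, pos: int) -> bool:
    """True if the byte at pos starts a character."""
    return CHAR_SIZE[buf[pos]] != 0


def get_next_char(buf: bytes, pos: int) -> int:
    """Offset of the character after the one starting at pos."""
    size = CHAR_SIZE[buf[pos]]
    if size == 0:
        raise ValueError("pos is not at the start of a character")
    return pos + size


def get_previous_char(buf: bytes, pos: int) -> int:
    """Offset of the character before pos, found by backing over trail bytes."""
    pos -= 1
    while pos > 0 and not is_first_byte(buf, pos):
        pos -= 1
    return pos


buf = "a\u00e9\u65e5".encode("utf-8")  # 1-byte, 2-byte, and 3-byte characters
pos, starts = 0, []
while pos < len(buf):
    starts.append(pos)
    pos = get_next_char(buf, pos)
print(starts)
```

In UTF-8 the trail-byte range 80 to BF never overlaps the lead-byte ranges, so `is_first_byte` can decide from the byte alone. In a double-byte set such as Shift-JIS some byte values can be either a lead or a trail byte, so the same question needs context from earlier in the string.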

© 1998, Progress Software Corporation 11 Architecture changes: UTF-8-enabling character tables
String libraries use character tables
– Alphanumeric, Lead-byte, Tail-byte
– Upper, lower case (700+ characters)
New property ColumnCount
New table formats
– Old architecture presumed 256 byte table
– Now organized by range lists and trie
Update table compiler & allow hex entry
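
The range-list half of the new table format can be sketched as a sorted list of code ranges searched by bisection (the trie is not shown). The ranges below are a small illustrative subset chosen for this example, not the product's actual tables, and `column_count` and `get_column_length` are hypothetical names that echo the ColumnCount property and GetColumnLength function mentioned in the slides.

```python
from bisect import bisect_right

# (first, last, columns), sorted by first. Illustrative and incomplete:
# any code point not covered by a range is assumed to occupy one column.
COLUMN_RANGES = [
    (0x1100, 0x115F, 2),  # Hangul Jamo leading consonants
    (0x3041, 0x30FF, 2),  # Hiragana and Katakana
    (0x4E00, 0x9FFF, 2),  # CJK Unified Ideographs
    (0xAC00, 0xD7A3, 2),  # Hangul syllables
    (0xFF01, 0xFF60, 2),  # Fullwidth forms
]
RANGE_STARTS = [first for first, _, _ in COLUMN_RANGES]


def column_count(cp: int) -> int:
    """Display columns for one code point, found by binary search."""
    i = bisect_right(RANGE_STARTS, cp) - 1
    if i >= 0:
        _, last, cols = COLUMN_RANGES[i]
        if cp <= last:
            return cols
    return 1


def get_column_length(text: str) -> int:
    """Total display columns for a string."""
    return sum(column_count(ord(ch)) for ch in text)


print(get_column_length("\u65e5\u672c\u8a9e ABC"))  # three ideographs, a space, ABC
```

A range list keeps the table small when a property is constant over long runs of code points, which a flat 256-entry table cannot express once characters exceed one byte.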

© 1998, Progress Software Corporation 12 Architecture changes: UTF-8-enabling sorting
How to sort multilingual data?
Binary sort used for double-byte data
With UTF-8, Europe is 2-byte, CJK 3-byte
Solution
– Binary sort on server
– Client uses native sort
Bump key length limit for UTF-8
Next phase will be enhanced sort
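
A short sketch of what a binary sort on UTF-8 keys gives you, using sample words chosen only for illustration. Sorting by raw UTF-8 bytes yields the same order as sorting by code point, which is stable and cheap for a server index but is not a linguistic order: uppercase sorts before lowercase and accented letters sort after "z". That gap is what a client-side native sort would cover.

```python
words = [
    "zebra",
    "Z\u00fcrich",    # Zürich
    "\u00e9clair",    # éclair
    "apple",
    "\u65e5\u672c",   # Japanese sample word (two ideographs)
    "\ud55c\uad6d",   # Korean sample word (two Hangul syllables)
    "\u00c4rger",     # Ärger
]

# Binary sort: compare the encoded bytes, as an index on UTF-8 keys would.
by_bytes = sorted(words, key=lambda w: w.encode("utf-8"))

# Python compares str values by code point, so this is code point order.
by_code_point = sorted(words)

print(by_bytes == by_code_point)
print(by_bytes)

# Encoded sizes behind "Europe is 2-byte, CJK 3-byte" and the longer keys.
print(len("\u00e9".encode("utf-8")), len("\u65e5".encode("utf-8")))
```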

© 1998, Progress Software Corporation 13 Architecture changes: Character conversion algorithms
Existing, user-definable, conversions
– Single-byte character set table maps
– Double-byte Shift-JIS - EUCJIS algorithm
New table-driven automated conversions
– Single-byte to UTF-8, and back
– Double-byte to UCS-2 and back
– UTF-8 - UCS-2
– Trie for speed and memory optimization
Requires significant QA for data integrity
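
A table-driven single-byte to UTF-8 conversion, and the round-trip check that the QA point implies, might be sketched like this. Python's built-in codecs are used only to populate the tables; the table layout and all function names are assumptions for illustration, and a plain dictionary stands in for the trie mentioned above.

```python
def build_tables(charset: str):
    """Build byte -> UTF-8 and UTF-8 -> byte maps for a single-byte charset."""
    to_utf8, from_utf8 = {}, {}
    for byte in range(256):
        try:
            ch = bytes([byte]).decode(charset)
        except UnicodeDecodeError:
            continue  # byte value is unassigned in this charset
        encoded = ch.encode("utf-8")
        to_utf8[byte] = encoded
        from_utf8[encoded] = byte
    return to_utf8, from_utf8


def single_to_utf8(data: bytes, to_utf8: dict) -> bytes:
    """Convert single-byte text to UTF-8 by table lookup."""
    return b"".join(to_utf8[b] for b in data)


def utf8_to_single(data: bytes, from_utf8: dict) -> bytes:
    """Convert 1 to 3 byte UTF-8 back; raises KeyError if a character has no mapping."""
    out = bytearray()
    pos = 0
    while pos < len(data):
        lead = data[pos]
        size = 1 if lead < 0x80 else 2 if lead < 0xE0 else 3
        out.append(from_utf8[data[pos:pos + size]])
        pos += size
    return bytes(out)


def round_trip_ok(charset: str) -> bool:
    """Data-integrity check: every assigned byte must survive both conversions."""
    to_utf8, from_utf8 = build_tables(charset)
    assigned = bytes(sorted(to_utf8))
    return utf8_to_single(single_to_utf8(assigned, to_utf8), from_utf8) == assigned


to_utf8, from_utf8 = build_tables("iso8859-1")
print(single_to_utf8(b"caf\xe9", to_utf8))
print(round_trip_ok("iso8859-1"), round_trip_ok("cp1252"))
```

Running the round-trip check over every assigned value of every supported character set is one inexpensive way to catch a bad table entry before it corrupts stored data.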

© 1998, Progress Software Corporation 14 Architecture changes: Impact on the 4GL user
4GL is character set independent
Almost all functions are character-based
3 functions require optional byte-basing
– Length, Substring, Overlay
– Options: Byte, Character
Add new option: Column
Format (Picture) Phrase
– XXXX has different meaning for UTF-8
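
The difference between the Byte, Character, and Column options can be sketched with a single length function. This is a Python illustration, not 4GL syntax: the function name `length` and the unit strings are hypothetical, and the column rule (two columns for characters whose East Asian width is Wide or Fullwidth) is an assumption about how a column count could be derived.

```python
import unicodedata


def length(text: str, unit: str = "character") -> int:
    """Length of text in characters, UTF-8 bytes, or display columns."""
    if unit == "character":
        return len(text)
    if unit == "byte":
        return len(text.encode("utf-8"))
    if unit == "column":
        # Assumed rule: Wide and Fullwidth characters take two columns.
        return sum(
            2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
            for ch in text
        )
    raise ValueError(f"unknown unit: {unit}")


sample = "\u65e5\u672c\u8a9eab"  # three ideographs followed by "ab"
for unit in ("character", "byte", "column"):
    print(unit, length(sample, unit))
```

The three answers differ for the same string, which is why a format such as XXXX becomes ambiguous under UTF-8: four characters can occupy anywhere from 4 to 12 bytes of storage and 4 to 8 columns on a report.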

© 1998, Progress Software Corporation 15 Status
Functioning Well
Going to second beta
Implemented with very low cost
Performance is OK
– Metrics not yet available
Testing is most significant cost
– Reviewing all character set properties
– Evaluating all conversions

© 1998, Progress Software Corporation 16 Pièce de Résistance

© 1998, Progress Software Corporation 17 Futures
For the Progress International Team
– Multilingual Clients
– Enhanced Character Folding
– Enhanced Sorting
For Progress Customers
– Deployment of multilingual databases
– Worldwide access to these databases
– Worldwide deployment of multi-language applications

© 1998, Progress Software Corporation 18 Conclusions
Migration can be achieved in phases
Migration thru UTF-8 can be low cost
Double-byte applications can migrate easily to UTF-8
Asian users can integrate with other languages now
Non-English users can integrate with Asian languages now

© 1998, Progress Software Corporation 19 Any questions?