Published by Sara Crawford
Migration of a 4GL and Relational Database to Unicode
Tex Texin, International Product Manager
© 1998, Progress Software Corporation
Slide 2: Presentation Goals
Outline Migration Steps
Describe Design Considerations
Leverage Existing Double-byte Implementation
Describe Impact on 4GL and Report Formats
Slide 3: PROGRESS Application Development Suite
Powerful tools for the rapid creation of distributed business applications
Creates character, GUI, or web-based clients with common source
Host-based, client-server, or n-tier distribution on a variety of platforms
Scalable, robust RDBMS and open
International, double-byte enabled
Slide 4: Possible Configuration Options
[Diagram labels: Host-based Character Client; Client-Server GUI Client; Web-based Client; Optional n-tier Application Server; Database Server; Progress Database; Other Database]
Slide 5: Why do our customers need Unicode?
Many do not... However,
Multinationals deploy across regions with incompatible character sets, yet they must share data between them.
Programs are distributed worldwide with one container of text in many languages.
Certain applications require multilingual databases, e.g. translation systems and web-based applications.
Slide 6: The Existing Architecture
1.5M lines of C code
0.3M lines of 4GL code
Double-byte enabled
–CJK, 9 double-byte charsets supported
–2-byte only, no 3 or 4-byte
–No shift-sequenced charsets
–DBE changes earmarked, easy to find
–4 years, 3 developers, 2 QA
Slide 7: Estimated cost of implementing UCS-2 was very big!
Changing to 16-bit text units affects almost every source module
–Largest cost is separating char variables based on usage for text or binary data
–Use 16-bit null terminators, ignore 8-bit zero bytes: A is 0041, 0000; Ā is 0100, 0000
–Pointer arithmetic (advance 2 bytes)
–Sizing (bytes or characters)
–New API to use new WIDE TEXT datatype
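The null-terminator point on the UCS-2 cost slide can be illustrated with a minimal sketch. The helper names `byte_strlen` and `wide_strlen` are hypothetical and are not Progress APIs, and big-endian byte order is an assumption, since the slide gives only the 16-bit values.

```python
# Hypothetical helpers for illustration only; not Progress APIs.

def byte_strlen(buf: bytes) -> int:
    """C-style strlen: count bytes up to the first 8-bit zero."""
    end = buf.find(b"\x00")
    return len(buf) if end == -1 else end


def wide_strlen(buf: bytes) -> int:
    """Count 16-bit units up to the first 16-bit zero unit."""
    count = 0
    for i in range(0, len(buf) - 1, 2):
        if buf[i:i + 2] == b"\x00\x00":
            break
        count += 1
    return count


# "A" (U+0041) and A-with-macron (U+0100) as big-endian UCS-2 (assumed
# byte order), each followed by a 16-bit terminator.
text_a = "A".encode("utf-16-be") + b"\x00\x00"             # 00 41 00 00
text_amacron = "\u0100".encode("utf-16-be") + b"\x00\x00"  # 01 00 00 00

# The byte-oriented scan stops inside the first character in both cases,
# while the 16-bit scan reports one character.
print(byte_strlen(text_a), wide_strlen(text_a))
print(byte_strlen(text_amacron), wide_strlen(text_amacron))
```

Every routine that scans for an 8-bit zero would need the same kind of change, which is consistent with the slide's claim that almost every source module is affected.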
Slide 8: Product requirements for a multilingual version
Minimize cost for application migration
Minimize cost for application upgrade
Minimize support cost
–One executable!
Maintain user-definable character sets
Add UTF-8 as just another character set
–UTF-8 algorithms are compatible with other charsets
Slide 9: Scaled down multilingual proposal: UTF-8 implementation
Implement UTF-8 as 3-byte character set
–Leverage & extend double-byte enabling
–Places to change are already earmarked
–Restrict to composed characters for now
–Restrict to no surrogates
Supports all the markets we are in
UTF-8-enable 4GL and RDBMS first
–Provides multilingual logic and storage
–Java+other client technologies coming
Slide 10: Architecture changes: UTF-8-enabling the string library
N-byte enable character+string functions
–GetNextChar, GetPreviousChar
–GetCharacterSize (table-based)
–Modified IsFirstByte
New GetColumnLength
New datatype normalized BIG char
Minor algorithm changes for efficiency
–Find Character
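The string-library functions named on this slide might look roughly like the sketch below. The Python names mirror the slide's function names, but the signatures, the table layout, and the error handling are assumptions made for illustration; the actual Progress implementation is C and is not shown in the slides.

```python
# Sequence length indexed by the high 4 bits of the lead byte.
# 0 marks a byte that cannot start a character here: tail bytes
# (10xxxxxx) and 4-byte lead bytes, which this sketch leaves out
# because the proposal limits UTF-8 to 3-byte characters.
_SIZE_BY_HIGH_NIBBLE = [1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 2, 2, 3, 0]


def is_first_byte(b: int) -> bool:
    """True unless b is a UTF-8 tail byte (10xxxxxx)."""
    return (b & 0xC0) != 0x80


def get_character_size(lead: int) -> int:
    """Table-based size, in bytes, of the character starting with lead."""
    size = _SIZE_BY_HIGH_NIBBLE[lead >> 4]
    if size == 0:
        raise ValueError(f"0x{lead:02X} is not a supported lead byte")
    return size


def get_next_char(buf: bytes, pos: int) -> int:
    """Offset of the character that follows the one starting at pos."""
    return pos + get_character_size(buf[pos])


def get_previous_char(buf: bytes, pos: int) -> int:
    """Offset of the character that precedes offset pos (pos > 0)."""
    pos -= 1
    while pos > 0 and not is_first_byte(buf[pos]):
        pos -= 1
    return pos


sample = "a\u00e9\u65e5".encode("utf-8")  # 1-byte, 2-byte, 3-byte characters
offset = 0
while offset < len(sample):
    print(offset, get_character_size(sample[offset]))
    offset = get_next_char(sample, offset)
```

Because a UTF-8 tail byte never looks like a lead byte, stepping backwards needs no scan from the start of the string, unlike many legacy double-byte encodings.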
Slide 11: Architecture changes: UTF-8-enabling character tables
String libraries use character tables
–Alphanumeric, Lead-byte, Tail-byte
–Upper, lower case (700+ characters)
New property ColumnCount
New table formats
–Old architecture presumed 256 byte table
–Now organized by range lists and trie
Update table compiler & allow hex entry
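A range list is one way to hold a per-character property such as ColumnCount once a flat 256-entry table no longer covers the character set. The sketch below is an assumption about the general technique, not the Progress table format; the ranges are a small illustrative subset and `column_count` is a hypothetical name.

```python
from bisect import bisect_right

# (first code point, last code point, column count), sorted by first
# code point and non-overlapping. Illustrative subset only.
_COLUMN_RANGES = [
    (0x0020, 0x007E, 1),  # printable ASCII
    (0x00A0, 0x00FF, 1),  # Latin-1 letters and symbols
    (0x3040, 0x30FF, 2),  # Hiragana and Katakana
    (0x4E00, 0x9FFF, 2),  # CJK Unified Ideographs
    (0xFF61, 0xFF9F, 1),  # half-width Katakana
]
_RANGE_STARTS = [first for first, _, _ in _COLUMN_RANGES]


def column_count(ch: str, default: int = 1) -> int:
    """Look up the display width of ch by binary search over the ranges."""
    cp = ord(ch)
    i = bisect_right(_RANGE_STARTS, cp) - 1  # last range starting at or before cp
    if i >= 0:
        _, last, cols = _COLUMN_RANGES[i]
        if cp <= last:
            return cols
    return default  # code point not covered by any range


print(column_count("A"), column_count("\u65e5"), column_count("\uff76"))
```

A range list stays small when long runs of characters share a value; a trie, which the slide also mentions, trades some memory for faster lookup.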
Slide 12: Architecture changes: UTF-8-enabling sorting
How to sort multilingual data?
Binary sort used for double-byte data
With UTF-8, Europe is 2-byte, CJK 3-byte
Solution
–Binary sort on server
–Client uses native sort
Bump key length limit for UTF-8
Next phase will be enhanced sort
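A binary sort of UTF-8 keys is well defined because UTF-8 byte order matches code point order, but it is not a linguistic order. The sketch below shows both effects; `binary_sort` is a hypothetical helper, not a Progress function.

```python
def binary_sort(words):
    """Sort strings by the raw bytes of their UTF-8 encoding."""
    return sorted(words, key=lambda w: w.encode("utf-8"))


words = ["zebra", "\u00e9clair", "apple", "\u65e5\u672c", "Zulu"]

# Byte order of UTF-8 equals code point order, which is how Python
# compares str values, so both sorts give the same result.
print(binary_sort(words) == sorted(words))

# Uppercase sorts before lowercase, and the accented word sorts after
# "zebra", which is why a client may prefer its native collation.
print(binary_sort(words))

# Accented Latin letters take 2 bytes and CJK ideographs take 3 bytes,
# so keys grow compared with single-byte or double-byte charsets.
print(len("\u00e9".encode("utf-8")), len("\u65e5".encode("utf-8")))
```

The longer encoded keys are one plausible reading of why the slide calls for raising the key length limit.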
Slide 13: Architecture changes: Character conversion algorithms
Existing, user-definable, conversions
–Single-byte character set table maps
–Double-byte Shift-JIS - EUCJIS algorithm
New table-driven automated conversions
–Single-byte to UTF-8, and back
–Double-byte to UCS-2 and back
–UTF-8 - UCS-2
–Trie for speed and memory optimization
Requires significant QA for data integrity
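A table-driven single-byte to UTF-8 conversion, and its inverse, can be sketched as follows. The function names and the use of a dictionary for the reverse map are assumptions for illustration (the slide mentions a trie), and ISO 8859-1 is used only because its table is trivial to write down.

```python
def build_tables(code_points):
    """Build forward and reverse maps from 256 code points, one per byte value."""
    to_utf8 = [chr(cp).encode("utf-8") for cp in code_points]
    from_utf8 = {seq: byte for byte, seq in enumerate(to_utf8)}
    return to_utf8, from_utf8


def single_byte_to_utf8(data: bytes, to_utf8) -> bytes:
    """Replace each input byte with its UTF-8 sequence from the table."""
    return b"".join(to_utf8[b] for b in data)


def utf8_to_single_byte(data: bytes, from_utf8) -> bytes:
    """Map each UTF-8 sequence back to one byte.

    Assumes well-formed UTF-8 of at most 3 bytes per character and
    raises KeyError for a character the charset cannot represent.
    """
    out = bytearray()
    pos = 0
    while pos < len(data):
        lead = data[pos]
        size = 1 if lead < 0x80 else 2 if lead < 0xE0 else 3
        out.append(from_utf8[data[pos:pos + size]])
        pos += size
    return bytes(out)


# ISO 8859-1 maps byte value n to code point n.
LATIN1_TO_UTF8, UTF8_TO_LATIN1 = build_tables(range(256))

source = bytes([0x63, 0x61, 0x66, 0xE9])  # "café" in ISO 8859-1
converted = single_byte_to_utf8(source, LATIN1_TO_UTF8)
print(converted, utf8_to_single_byte(converted, UTF8_TO_LATIN1) == source)
```

A round trip over all 256 byte values is the kind of data-integrity check the slide's QA remark points to.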
Slide 14: Architecture changes: Impact on the 4GL user
4GL is character set independent
Almost all functions are character-based
3 functions require optional byte-basing
–Length, Substring, Overlay
–Options: Byte, Character
Add new option: Column
Format (Picture) Phrase
–XXXX has different meaning for UTF-8
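The three measurement options for a length function can be sketched in Python. This is not Progress 4GL syntax: the `length` function and its `unit` argument are hypothetical, and the column rule (two columns for East Asian wide or fullwidth characters, one otherwise) is an assumed approximation.

```python
import unicodedata


def length(text: str, unit: str = "character") -> int:
    """Measure text in characters, UTF-8 bytes, or display columns."""
    if unit == "character":
        return len(text)
    if unit == "byte":
        return len(text.encode("utf-8"))
    if unit == "column":
        # Assumed rule: wide and fullwidth characters take two columns.
        return sum(
            2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
            for ch in text
        )
    raise ValueError(f"unknown unit: {unit!r}")


sample = "A\u00e9\u65e5"  # Latin letter, accented letter, CJK ideograph
for unit in ("character", "byte", "column"):
    print(unit, length(sample, unit))
```

The same string yields three different answers, which is why a format picture such as XXXX needs a stated unit once UTF-8 data is involved.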
Slide 15: Status
Functioning Well
Going to second beta
Implemented with very low cost
Performance is OK
–Metrics not yet available
Testing is most significant cost
–Reviewing all character set properties
–Evaluating all conversions
Slide 16: Pièce de Résistance
Slide 17: Futures
For the Progress International Team
–Multilingual Clients
–Enhanced Character Folding
–Enhanced Sorting
For Progress Customers
–Deployment of multilingual databases
–Worldwide access to these databases
–Worldwide deployment of multi-language applications
Slide 18: Conclusions
Migration can be achieved in phases
Migration through UTF-8 can be low cost
Double-byte applications can migrate easily to UTF-8
Asian users can integrate with other languages now
Non-English users can integrate with Asian languages now
Slide 19: Any questions?