Oracle Character sets Aino Andriessen


1 Oracle Character sets Aino Andriessen

2 Demo1

3 demo1.sql
rem
rem name     demo1.sql
rem created  jan 18, 2009
rem purpose  1st script demonstrating the nls_length_semantics parameter on tables
rem remarks  desc of the demo2 table would show varchar2(4 char), but I do not want that yet.
rem          So I wrote a select statement that resembles it.
cl scr
set echo off
set pagesize 100
set feedback off
drop table demo;
drop table demo2;

-- Create demo2 table with CHAR
create table demo2 (naam varchar2(4 char));
prompt Table demo2 created
prompt
prompt desc demo2
COLUMN Name FORMAT A42
COLUMN Null FORMAT A8
COLUMN Type FORMAT A27
select column_name Name, Null, data_type || '(' || char_length || ')' Type
from cols
where table_name = 'DEMO2';
set feedback on
pause
prompt insert into demo2 values ('Rene');
insert into demo2 values ('Rene');
pause;
prompt insert into demo2 values ('René');
insert into demo2 values ('René');
commit;
prompt select * from demo2;
spool demo1.log
select * from demo2;
spool off

-- Create demo table according to the default (BYTE)
create table demo (naam varchar2(4));
prompt Table demo created
prompt desc demo
desc demo
prompt insert into demo values ('Rene');
insert into demo values ('Rene');
prompt insert into demo values ('René');
insert into demo values ('René');
prompt select * from demo;
select * from demo;
/*
select parameter, value from nls_database_parameters where parameter = 'NLS_LENGTH_SEMANTICS';
*/

4 nls_length_semantics
Initialization parameter
CHAR or BYTE (default)
Relevant for multibyte character sets
Defines the unit in which the length of character columns and variables is expressed
alter session set nls_length_semantics=CHAR;
Not retroactive
PL/SQL may need to be recompiled
alter system

5 nls_length_semantics 2
Specify the length unit of character columns and variables explicitly
create table demo (naam varchar2(4 char))
create table demo (naam varchar2(4 byte))
t_naam varchar2(4 char);
t_naam demo2.naam%TYPE
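The difference between the two length semantics can be reproduced outside the database. The sketch below is Python, not Oracle. The helper `fits` is hypothetical: it only mimics the length check of a `varchar2(n char)` versus a `varchar2(n byte)` column, and it assumes a UTF-8 database character set such as AL32UTF8.

```python
def fits(value: str, limit: int, semantics: str = "BYTE", encoding: str = "utf-8") -> bool:
    """Hypothetical helper: would `value` fit in a varchar2(limit <semantics>) column?

    Assumption: the database character set behaves like `encoding` (UTF-8 here).
    """
    if semantics.upper() == "CHAR":
        # CHAR semantics: the limit counts characters.
        return len(value) <= limit
    # BYTE semantics: the limit counts encoded bytes.
    return len(value.encode(encoding)) <= limit


if __name__ == "__main__":
    for name in ("Rene", "Ren\u00e9"):  # "Ren\u00e9" is René with a precomposed é
        print(name, "CHAR:", fits(name, 4, "CHAR"), "BYTE:", fits(name, 4, "BYTE"))
```

With these assumptions 'René' fits a 4 CHAR column but not a 4 BYTE column, because é takes two bytes in UTF-8.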

6 Demo2

7 demo2.sql
rem
rem name     demo2.sql
rem created  jan 18, 2009
rem purpose  2nd script demonstrating the nls_length_semantics parameter in PL/SQL
rem remarks
declare
  -- no explicit unit: length semantics follow the session setting
  t_naam   varchar2(4);
  -- anchored to the column: inherits varchar2(4 char) from demo2.naam
  t_naamC  demo2.naam%TYPE;
  r_demo2  demo2%ROWTYPE;
  cursor c_demo2 is
    select naam from demo2;
begin
  for r_demo2 in c_demo2 loop
    dbms_output.put_line (r_demo2.naam);
    t_naamc := r_demo2.naam;
    dbms_output.put_line (t_naamc);
    t_naam := r_demo2.naam;
    dbms_output.put_line (t_naam);
  end loop;
end;
/

8 Character encoding

9 Character set
Character set
  defines the mapping between the binary/hexadecimal code and the character
  UTF8, WE8MSWIN1252, WE8ISO8859P1, JA16EUC, US7ASCII, WE8DEC, ...
Code pages
  IBM / Windows terminology
  roughly analogous to a character set
  one code page per language

10 Character sets 2
ASCII
  1 byte
  128 characters
  standard English letters, without accents
ISO 8859 and Latin-1
  1 byte (8 bits)
  256 characters
CP-1252
  Windows variant of Latin-1
UTF8
  variable width, multibyte
  max 4 bytes
  ~ characters
  ~1 million available
  multilingual
  ASCII codes are identical

11 Examples
Character Set    Hexadecimal code - Euro
AL32UTF8         E282AC
WE8MSWIN1252     80
ASCII            -
WE8ISO8859P1     -
WE8ISO8859P15    A4 (164)

Character Set    Hexadecimal code - é
AL32UTF8         C3A9 (50089)
WE8MSWIN1252     E9 (233)
ASCII            -
WE8ISO8859P1     E9
WE8ISO8859P15    E9
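The codes in these tables can be checked without a database. The sketch below is Python. The mapping from Oracle character set names to Python codec names is an assumption made for illustration, and `hex_code` is a hypothetical helper.

```python
# Assumed correspondence between Oracle character set names and Python codecs.
CODECS = {
    "AL32UTF8": "utf-8",
    "WE8MSWIN1252": "cp1252",
    "ASCII": "ascii",
    "WE8ISO8859P1": "iso8859-1",
    "WE8ISO8859P15": "iso8859-15",
}


def hex_code(char: str, codec: str) -> str:
    """Return the encoded bytes as upper-case hex, or '-' if the codec lacks the character."""
    try:
        return char.encode(codec).hex().upper()
    except UnicodeEncodeError:
        return "-"


if __name__ == "__main__":
    for char in ("\u20ac", "\u00e9"):  # euro sign, e with acute accent
        print(f"Hexadecimal code for {char}")
        for oracle_name, codec in CODECS.items():
            print(f"  {oracle_name:<14} {hex_code(char, codec)}")
```

The decimal values on the slide are the same bytes read as one number: `int("C3A9", 16)` is 50089 and `int("E9", 16)` is 233.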

12 Unicode / UTF-8 example
The image shows the number of bytes needed to store different kinds of characters in the UTF-8 character set. The ASCII characters (C, t, and d) require one byte. The Latin and Greek characters (á, ö, and Ø) require 2 bytes. The Asian character requires 3 bytes. The supplementary character (treble clef sign) requires 4 bytes of storage.
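Since the image itself is not part of the transcript, the byte counts can be reproduced with a short Python sketch. The sample characters are assumptions chosen to match the categories named above. In particular, the CJK character U+4E9C and the G clef U+1D11E stand in for the "Asian character" and the "treble clef sign".

```python
# Sample characters (assumed stand-ins for the ones in the original image).
SAMPLES = {
    "C": "ASCII letter",
    "\u00e1": "Latin small letter a with acute",
    "\u00f6": "Latin small letter o with diaeresis",
    "\u4e9c": "CJK ideograph",
    "\U0001d11e": "musical symbol G clef (supplementary character)",
}


def utf8_length(char: str) -> int:
    """Number of bytes needed to store `char` in UTF-8."""
    return len(char.encode("utf-8"))


if __name__ == "__main__":
    for char, description in SAMPLES.items():
        print(f"U+{ord(char):04X}  {utf8_length(char)} byte(s)  {description}")
```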

13 Diacritics and special characters
Diacritics are accents placed on (above, below, or even through) a letter to change its pronunciation, giving the modified letter a sound specific to the language.
àÿęňĜş etc.
Special characters
ßæ¿

14 Diacritics and special characters
Single byte character sets
  1 byte for the composed character
  Not all combinations are possible
  code pages
UTF-8
  the diacritic has its own encoding
  the composed character has its own encoding
  usually (always) a combination of the original character + diacritic
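The two Unicode representations of an accented letter can be made visible with Python's standard `unicodedata` module. This is a sketch for illustration only. It says nothing about which form a given client or database actually stores, and `code_points` is a hypothetical helper.

```python
import unicodedata


def code_points(text: str) -> list:
    """List each code point of `text` with its official Unicode name."""
    return [f"U+{ord(c):04X} {unicodedata.name(c)}" for c in text]


# Composed form: one code point for the letter including its accent.
composed = unicodedata.normalize("NFC", "e\u0301")
# Decomposed form: base letter followed by a combining diacritic.
decomposed = unicodedata.normalize("NFD", "\u00e9")

if __name__ == "__main__":
    for label, text in (("composed", composed), ("decomposed", decomposed)):
        print(label, code_points(text), text.encode("utf-8").hex())
```

Both forms render the same on screen, but the composed form takes 2 bytes in UTF-8 and the decomposed form takes 3, so byte lengths and equality comparisons differ.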

15 Database functions
Character functions
  substr - substrb - substrc - substr2
  instr - ...
  length - lengthb
chr (n)
  Returns the character corresponding to the number passed in as the argument, in the database character set
  select chr (50089) from dual;
dump
  Returns a VARCHAR2 value containing the datatype code, length in bytes, and internal representation of expr. The returned result is always in the database character set.
  select dump (naam, 1017) from demo2;
convert
  Converts a character string from one character set to another
utl_raw
  select utl_raw.cast_to_raw(naam) from demo2;
unistr()
  Converts the characters in x to the national language character set
  select (unistr('Ren\00e9')) from dual;
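To get a feel for what these functions return in a UTF-8 database, the sketch below mimics a few of them in Python. The function names are borrowed from the slide, but the implementations are rough, hypothetical analogues, not Oracle's behaviour. In particular, this `substrb` simply drops a character that is cut in half, which may differ from what the database returns.

```python
def chr_db(n: int, encoding: str = "utf-8") -> str:
    """Rough analogue of chr(n): read n as the byte sequence of one character."""
    size = max(1, (n.bit_length() + 7) // 8)
    return n.to_bytes(size, "big").decode(encoding)


def dump_hex(text: str, encoding: str = "utf-8") -> str:
    """Rough analogue of the byte listing in dump(): comma-separated hex bytes."""
    return ",".join(f"{b:x}" for b in text.encode(encoding))


def lengthb(text: str, encoding: str = "utf-8") -> int:
    """Length in bytes, as opposed to len(text), which counts characters."""
    return len(text.encode(encoding))


def substrb(text: str, start: int, length: int, encoding: str = "utf-8") -> str:
    """Byte-based substring (1-based start); an incomplete trailing character is dropped."""
    raw = text.encode(encoding)[start - 1:start - 1 + length]
    return raw.decode(encoding, errors="ignore")


if __name__ == "__main__":
    name = "Ren\u00e9"
    print(chr_db(50089), dump_hex(name), len(name), lengthb(name), substrb(name, 1, 4))
```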

16 Demo3

17 demo3.sql
rem
rem name     demo3.sql
rem created  jan 18, 2009
rem purpose  3rd script demonstrating various character set functions
rem remarks
select value from nls_database_parameters where parameter = 'NLS_CHARACTERSET';
select chr (50089) from dual;
select dump (naam, 1017) from demo2;
select utl_raw.cast_to_raw(naam) from demo2;
select substr (naam,1,4) from demo2;
select substrb (naam,1,4) from demo2;
select '*' || substrb (naam,1,4) || '*' from demo2;
select utl_raw.cast_to_raw (substrb (naam,1,4)) from demo2;
select naam, length (naam) from demo2;
select naam, lengthb (naam) from demo2;

18 nls_lang
Client character set
When the client NLS_LANG character set is set to the same value as the database character set, Oracle assumes that the data being sent or received already has the same (correct) encoding, so for performance reasons no conversion or validation takes place. The data is stored exactly as delivered by the client, bit by bit.

19 nls_lang 2
language_country.character set
  american_america.UTF8
  dutch_the netherlands.WE8MSWIN1252
  american_THE NETHERLANDS.WE8MSWIN1252
Environment variable, nls_lang
Differs between the Windows GUI (WE8MSWIN1252) and the command line (WE8PC850)
Not used by Java clients
Demo 4

20 Demo4

21 demo4.bat
rem
rem name     demo4.bat
rem create   jan 18, 2008 AA
rem purpose  Set the nls_lang parameter to the one that is used in the dos window
rem remarks  Only use to select and insert from the command line.
rem          Do not run scripts, because they are in another character set / code page, different from the one in the dos box.
rem          If you run these scripts, unexpected character conversion occurs, resulting in weird, unexpected characters.
rem local NLS_LANG :
rem   DUTCH_THE NETHERLANDS.WE8MSWIN1252
rem   AMERICAN_THE NETHERLANDS.WE8MSWIN1252
rem   AMERICAN_THE NETHERLANDS.WE8PC850
set NLS_LANG=AMERICAN_THE NETHERLANDS.WE8PC850
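The "weird, unexpected characters" mentioned in the remarks come from bytes being produced in one code page and interpreted in another. The Python sketch below shows only that effect. It does not model Oracle's client/server conversion, `misread` is a hypothetical helper, and the codecs `cp850` and `cp1252` are assumed to correspond to WE8PC850 and WE8MSWIN1252.

```python
def misread(text: str, actual: str, declared: str) -> str:
    """Encode with the code page really in use, then decode with the declared one."""
    return text.encode(actual).decode(declared, errors="replace")


if __name__ == "__main__":
    name = "Ren\u00e9"
    # Typed in a DOS box (code page 850) while the client claims WE8MSWIN1252.
    print(misread(name, actual="cp850", declared="cp1252"))
    # A script saved in Windows-1252, run while the client claims WE8PC850.
    print(misread(name, actual="cp1252", declared="cp850"))
    # Declared and actual code page agree: the text survives unchanged.
    print(misread(name, actual="cp850", declared="cp850"))
```

In code page 850 the é is byte 0x82, which Windows-1252 displays as a low quotation mark, so the name no longer ends in é.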

22 National character set
Support for a second character set next to the database character set
e.g. to allow Japanese in a database with an MSWIN1252 or ISO8859 character set
Less necessary in a UTF8 database
Multibyte
nvarchar, nclob etc.

23 Case
TELETEX character set
No longer exists in Oracle
select convert(naam,'TELETEX','UTF8') from tabel;
Locale builder

24 Oracle Locale builder
.nlb in ORA_NLS33 directory
SQL> select convert('test','TELETEX','UTF8') from dual;
Oracle Locale Builder
LXINST
LX22711.NLT
LX22711.NLB
LX0BOOT.NLT

25
sql> select name from emp
sql> select name from
sql> select utl_raw.cast_to_varchar2 (utl_raw.cast_to_raw (name)) from
sql> select utl_raw.cast_to_varchar2 (name) from

26 Question
Diacritic-insensitive search
Case-insensitive search
Oracle Intermedia
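As an illustration of what diacritic-insensitive and case-insensitive matching means, here is a generic Python sketch. It is not how Oracle or Oracle Intermedia implements such a search. `search_key` is a hypothetical helper that reduces a string to a comparison key by decomposing it, dropping the combining marks, and folding case.

```python
import unicodedata


def search_key(text: str) -> str:
    """Comparison key that ignores diacritics and letter case."""
    decomposed = unicodedata.normalize("NFD", text)
    # Combining marks have a non-zero combining class; drop them.
    stripped = "".join(c for c in decomposed if not unicodedata.combining(c))
    return stripped.casefold()


def matches(needle: str, haystack: str) -> bool:
    """True if `needle` occurs in `haystack`, ignoring diacritics and case."""
    return search_key(needle) in search_key(haystack)


if __name__ == "__main__":
    for candidate in ("Rene", "RENÉ", "René", "Renate"):
        print(candidate, matches("rene", candidate))
```

Special characters that are letters of their own rather than letter plus diacritic, such as æ, are not reduced by this approach.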

27 Summary
nls_length_semantics
Always explicitly define a character column with its type (CHAR or BYTE)
Oracle performs automatic character set conversion
wysinawyg
Use a Java client
Working with character sets can be confusing
UTF8 is often the preferred character set


