PL/SQL Packages and Types Reference 10g Release 1 (10.1) Part Number B10802-01
UTL_I18N is a set of services that help developers build multilingual applications. The Globalization Development Kit provides a set of tools that are designed to help developers with minimal experience in internationalization development effectively write multilingual applications.
The chapter contains the following topics:
The UTL_I18N PL/SQL package consists of the following categories of services:
SHIFT_IN CONSTANT PLS_INTEGER := 0;
SHIFT_OUT CONSTANT PLS_INTEGER := 1;
ORACLE_TO_IANA CONSTANT PLS_INTEGER := 0;
IANA_TO_ORACLE CONSTANT PLS_INTEGER := 1;
MAIL_GENERIC CONSTANT PLS_INTEGER := 0;
MAIL_WINDOWS CONSTANT PLS_INTEGER := 1;
GENERIC_CONTEXT CONSTANT PLS_INTEGER := 0;
MAIL_CONTEXT CONSTANT PLS_INTEGER := 1;
This function provides a way to specify an escape sequence for predefined characters and multibyte characters that cannot be converted to the character set used by an HTML or XML document.
For example, < (less than symbol) has a special meaning in HTML. To display < as a character, encode it as the escape sequence &lt;. In the same way, you can specify how multibyte characters are displayed when they are not part of the character set encoding of an HTML or XML document. For example, if you encode a page in the ZHT16BIG5 character set, then this function checks every character. If it finds a character that cannot be represented in ZHT16BIG5, then it returns an escape sequence for that character.
UTL_I18N.ESCAPE_REFERENCE( str IN VARCHAR2 CHARACTER SET ANY_CS, page_cs_name IN VARCHAR2 DEFAULT NULL) RETURN VARCHAR2 CHARACTER SET str%CHARSET;
If the user specifies an invalid character set or a NULL string, then the function returns a NULL string.
UTL_I18N.ESCAPE_REFERENCE('ab'||chr(170),'us7ascii')
This returns 'ab&#xaa;'.
This function returns the default Oracle character set name or the default e-mail safe character set name from an Oracle language name.
See Also: "MAP_CHARSET Function" for an explanation of an e-mail safe character set
UTL_I18N.GET_DEFAULT_CHARSET( language IN VARCHAR2, context IN PLS_INTEGER DEFAULT GENERIC_CONTEXT, iswindows IN BOOLEAN DEFAULT FALSE) RETURN VARCHAR2;
If the user specifies an invalid language name or an invalid flag, then the function returns a NULL string.
UTL_I18N.GET_DEFAULT_CHARSET('French', UTL_I18N.GENERIC_CONTEXT, FALSE)
This returns 'WE8ISO8859P1'.
UTL_I18N.GET_DEFAULT_CHARSET('French', UTL_I18N.MAIL_CONTEXT, TRUE)
This returns 'WE8MSWIN1252'.
UTL_I18N.GET_DEFAULT_CHARSET('French', UTL_I18N.MAIL_CONTEXT, FALSE)
This returns 'WE8ISO8859P1'.
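This function maps an Oracle character set name to an IANA character set name, an IANA character set name to an Oracle character set name, or an Oracle character set name to an e-mail safe character set name.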
UTL_I18N.MAP_CHARSET( charset IN VARCHAR2, context IN PLS_INTEGER DEFAULT GENERIC_CONTEXT, flag IN PLS_INTEGER DEFAULT ORACLE_TO_IANA) RETURN VARCHAR2;
An e-mail safe character set is an Oracle character set that is commonly used by applications when they submit e-mail messages. The character set is usually used to convert contents in the database character set to e-mail safe contents. To specify the character set name in the mail header, you should use the corresponding IANA character set name obtained by calling the MAP_CHARSET function with the ORACLE_TO_IANA option, providing the e-mail safe character set name as input.
For example, no e-mail client recognizes message contents in the WE8DEC character set, whose corresponding IANA name is DEC-MCS. If WE8DEC is passed to the MAP_CHARSET function with the MAIL_CONTEXT option, then the function returns WE8ISO8859P1. Its corresponding IANA name, ISO-8859-1, is recognized by most e-mail clients.
The steps in this example are as follows:
1. Call the MAP_CHARSET function with the MAIL_CONTEXT | MAIL_GENERIC option with the database character set name, WE8DEC. The result is WE8ISO8859P1.
2. Convert the contents stored in the database to WE8ISO8859P1.
3. Call the MAP_CHARSET function with the ORACLE_TO_IANA | GENERIC_CONTEXT option with the e-mail safe character set, WE8ISO8859P1. The result is ISO-8859-1.
4. Specify ISO-8859-1 in the mail header when the e-mail message is submitted.
The function returns a character set name if a match is found. If no match is found or if the flag is invalid, then it returns NULL.
Note: Many Oracle character sets can map to one e-mail safe character set. There is no function that maps an e-mail safe character set to an Oracle character set name.
UTL_I18N.MAP_CHARSET('iso-8859-1', UTL_I18N.GENERIC_CONTEXT, UTL_I18N.IANA_TO_ORACLE)
This returns 'WE8ISO8859P1'.
UTL_I18N.MAP_CHARSET('WE8DEC', UTL_I18N.MAIL_CONTEXT, UTL_I18N.MAIL_GENERIC)
This returns 'WE8ISO8859P1'.
See Also: Oracle Database Globalization Support Guide for a list of valid Oracle character sets
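The e-mail steps described above for the WE8DEC example can be sketched as a single anonymous block. This is a minimal sketch, not a definitive implementation: the variable names, the sample message text, the use of STRING_TO_RAW for the conversion step, and the header string printed at the end are all assumptions made for illustration. Note that STRING_TO_RAW converts from the actual database character set of the session, whatever it is.

```sql
DECLARE
  v_db_cs   VARCHAR2(30)  := 'WE8DEC';        -- database character set name used in the example
  v_mail_cs VARCHAR2(30);                     -- e-mail safe Oracle character set
  v_iana_cs VARCHAR2(30);                     -- IANA name for the mail header
  v_body    VARCHAR2(200) := 'message text';  -- hypothetical message contents
  v_raw     RAW(800);                         -- converted contents
BEGIN
  -- Step 1: map the database character set to an e-mail safe character set.
  v_mail_cs := UTL_I18N.MAP_CHARSET(v_db_cs, UTL_I18N.MAIL_CONTEXT, UTL_I18N.MAIL_GENERIC);

  -- Step 2: convert the contents to the e-mail safe character set
  -- (assumption: STRING_TO_RAW is used here; the text does not prescribe a method).
  v_raw := UTL_I18N.STRING_TO_RAW(v_body, v_mail_cs);

  -- Step 3: map the e-mail safe character set to its IANA name.
  v_iana_cs := UTL_I18N.MAP_CHARSET(v_mail_cs, UTL_I18N.GENERIC_CONTEXT, UTL_I18N.ORACLE_TO_IANA);

  -- Step 4: use the IANA name in the mail header (illustrative header only).
  DBMS_OUTPUT.PUT_LINE('Content-Type: text/plain; charset=' || v_iana_cs);
END;
/
```

Because MAP_CHARSET returns NULL when no match is found, production code would likely check v_mail_cs and v_iana_cs before using them.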
This function returns an Oracle language name from an ISO locale name.
UTL_I18N.MAP_LANGUAGE_FROM_ISO( isolocale IN VARCHAR2) RETURN VARCHAR2;
Parameter | Description |
---|---|
isolocale | Specifies the ISO locale. The mapping is case-insensitive. |
If the user specifies an invalid locale string, then the function returns a NULL string.
If the user specifies a locale string that includes only the language (for example, en_ instead of en_US), then the function returns the default language name for the specified language (for example, American).
UTL_I18N.MAP_LANGUAGE_FROM_ISO('en_US')
This returns 'American'.
See Also: Oracle Database Globalization Support Guide for a list of valid Oracle languages
This function returns an ISO locale name from an Oracle language name and an Oracle territory name. A valid string must include at least one of the following: a valid Oracle language name or a valid Oracle territory name.
UTL_I18N.MAP_LOCALE_TO_ISO( ora_language IN VARCHAR2, ora_territory IN VARCHAR2) RETURN VARCHAR2;
Parameter | Description |
---|---|
ora_language | Specifies an Oracle language name. It is case-insensitive. |
ora_territory | Specifies an Oracle territory name. It is case-insensitive. |
If the user specifies an invalid string, then the function returns a NULL string.
UTL_I18N.MAP_LOCALE_TO_ISO('American','America')
This returns 'en_US'.
See Also: Oracle Database Globalization Support Guide for a list of valid Oracle languages and territories
This function returns an Oracle territory name from an ISO locale.
UTL_I18N.MAP_TERRITORY_FROM_ISO( isolocale IN VARCHAR2) RETURN VARCHAR2;
Parameter | Description |
---|---|
isolocale | Specifies the ISO locale. The mapping is case-insensitive. |
If the user specifies an invalid locale string, then the function returns a NULL string.
If the user specifies a locale string that includes only the territory (for example, _fr instead of fr_fr), then the function returns the default territory name for the specified territory (for example, France).
UTL_I18N.MAP_TERRITORY_FROM_ISO('en_US')
This returns 'America'.
See Also: Oracle Database Globalization Support Guide for a list of valid Oracle territories
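The three locale mapping functions can be combined into a simple round trip. The following is a minimal sketch; the variable name and the DBMS_OUTPUT labels are illustrative assumptions. Based on the individual examples above, the block would be expected to print en_US, American, and America in that order.

```sql
DECLARE
  v_iso VARCHAR2(30);  -- ISO locale name derived from Oracle names
BEGIN
  -- Oracle language and territory names to an ISO locale name.
  v_iso := UTL_I18N.MAP_LOCALE_TO_ISO('American', 'America');
  DBMS_OUTPUT.PUT_LINE('ISO locale: ' || v_iso);

  -- ISO locale name back to the Oracle language and territory names.
  DBMS_OUTPUT.PUT_LINE('Language:   ' || UTL_I18N.MAP_LANGUAGE_FROM_ISO(v_iso));
  DBMS_OUTPUT.PUT_LINE('Territory:  ' || UTL_I18N.MAP_TERRITORY_FROM_ISO(v_iso));
END;
/
```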
This function converts RAW data from a valid Oracle character set to a VARCHAR2 string in the database character set.
The function is overloaded. The different forms of functionality are described along with the syntax declarations.
Buffer Conversion:
UTL_I18N.RAW_TO_CHAR( data IN RAW, src_charset IN VARCHAR2 DEFAULT NULL) RETURN VARCHAR2;
Piecewise conversion converts raw data into character data piece by piece:
UTL_I18N.RAW_TO_CHAR( data IN RAW, src_charset IN VARCHAR2 DEFAULT NULL, scanned_length OUT PLS_INTEGER, shift_status IN OUT PLS_INTEGER) RETURN VARCHAR2;
If the user specifies an invalid character set, NULL data, or data whose length is 0, then the function returns a NULL string.
UTL_I18N.RAW_TO_CHAR(hextoraw('616263646566C2AA'), 'utf8')
This returns the following string in the database character set:
'abcdef'||chr(170)
UTL_I18N.RAW_TO_CHAR(hextoraw('616263646566C2AA'),'utf8',slen,shf)
This expression returns the following string in the database character set:
'abcdef'||chr(170)
It also sets shf to SHIFT_IN and slen to 8.
The following example converts data from the Internet piece by piece to the database character set.
DECLARE
  rvalue           RAW(1050);
  nvalue           VARCHAR2(1024);
  conversion_state PLS_INTEGER := 0;     -- shift status carried from one piece to the next
  converted_len    PLS_INTEGER;          -- number of bytes scanned in the current piece
  rtemp            RAW(10) := NULL;      -- unconverted trailing bytes from the previous piece
  conn             utl_tcp.connection;
  tlen             PLS_INTEGER;
BEGIN
  -- ...
  conn := utl_tcp.open_connection(remote_host => 'localhost', remote_port => 2000);
  LOOP
    tlen   := utl_tcp.read_raw(conn, rvalue, 1024);
    -- Prepend any bytes left over from the previous piece.
    rvalue := utl_raw.concat(rtemp, rvalue);
    nvalue := utl_i18n.raw_to_char(rvalue, 'JA16SJIS', converted_len, conversion_state);
    -- Keep the bytes that were not scanned for the next iteration.
    IF converted_len < utl_raw.length(rvalue) THEN
      rtemp := utl_raw.substr(rvalue, converted_len + 1);
    ELSE
      rtemp := NULL;
    END IF;
    /* do anything you want with nvalue */
    /* e.g. htp.prn(nvalue); */
  END LOOP;
  utl_tcp.close_connection(conn);
EXCEPTION
  WHEN utl_tcp.end_of_input THEN
    utl_tcp.close_connection(conn);
END;
This function converts RAW data from a valid Oracle character set to an NVARCHAR2 string in the national character set.
The function is overloaded. The different forms of functionality are described along with the syntax declarations.
Buffer Conversion:
UTL_I18N.RAW_TO_NCHAR( data IN RAW, src_charset IN VARCHAR2 DEFAULT NULL) RETURN NVARCHAR2;
Piecewise conversion converts raw data into character data piece by piece:
UTL_I18N.RAW_TO_NCHAR( data IN RAW, src_charset IN VARCHAR2 DEFAULT NULL, scanned_length OUT PLS_INTEGER, shift_status IN OUT PLS_INTEGER) RETURN NVARCHAR2;
If the user specifies an invalid character set, NULL data, or data whose length is 0, then the function returns a NULL string.
UTL_I18N.RAW_TO_NCHAR(hextoraw('616263646566C2AA'),'utf8')
This returns the following string in the national character set:
'abcdef'||chr(170)
UTL_I18N.RAW_TO_NCHAR(hextoraw('616263646566C2AA'),'utf8', slen, shf)
This expression returns the following string in the national character set:
'abcdef'||chr(170)
It also sets shf to SHIFT_IN and slen to 8.
The following example converts data from the Internet piece by piece to the national character set.
DECLARE
  rvalue           RAW(1050);
  nvalue           NVARCHAR2(1024);
  conversion_state PLS_INTEGER := 0;     -- shift status carried from one piece to the next
  converted_len    PLS_INTEGER;          -- number of bytes scanned in the current piece
  rtemp            RAW(10) := NULL;      -- unconverted trailing bytes from the previous piece
  conn             utl_tcp.connection;
  tlen             PLS_INTEGER;
BEGIN
  -- ...
  conn := utl_tcp.open_connection(remote_host => 'localhost', remote_port => 2000);
  LOOP
    tlen   := utl_tcp.read_raw(conn, rvalue, 1024);
    -- Prepend any bytes left over from the previous piece.
    rvalue := utl_raw.concat(rtemp, rvalue);
    nvalue := utl_i18n.raw_to_nchar(rvalue, 'JA16SJIS', converted_len, conversion_state);
    -- Keep the bytes that were not scanned for the next iteration.
    IF converted_len < utl_raw.length(rvalue) THEN
      rtemp := utl_raw.substr(rvalue, converted_len + 1);
    ELSE
      rtemp := NULL;
    END IF;
    /* do anything you want with nvalue */
    /* e.g. htp.prn(nvalue); */
  END LOOP;
  utl_tcp.close_connection(conn);
EXCEPTION
  WHEN utl_tcp.end_of_input THEN
    utl_tcp.close_connection(conn);
END;
This function converts a VARCHAR2 or NVARCHAR2 string to another valid Oracle character set and returns the result as RAW data.
UTL_I18N.STRING_TO_RAW( data IN VARCHAR2 CHARACTER SET ANY_CS, dst_charset IN VARCHAR2 DEFAULT NULL) RETURN RAW;
If the user specifies an invalid character set, a NULL string, or a string whose length is 0, then the function returns NULL.
DECLARE
  r RAW(50);
  s VARCHAR2(20);
BEGIN
  s := 'abcdef' || chr(170);
  r := utl_i18n.string_to_raw(s, 'utf8');
  dbms_output.put_line(rawtohex(r));
END;
/
This returns a hex value of '616263646566C2AA'.
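STRING_TO_RAW and the buffer form of RAW_TO_CHAR are inverse operations, which the following minimal sketch illustrates with a round trip through UTF8. The variable names and messages are illustrative assumptions, and the input is restricted to ASCII letters so that the result does not depend on how the database character set handles other characters.

```sql
DECLARE
  v_in  VARCHAR2(20) := 'abcdef';  -- ASCII-only sample input
  v_raw RAW(80);                   -- UTF8 bytes of v_in
  v_out VARCHAR2(20);              -- v_raw converted back to the database character set
BEGIN
  v_raw := UTL_I18N.STRING_TO_RAW(v_in, 'utf8');
  v_out := UTL_I18N.RAW_TO_CHAR(v_raw, 'utf8');

  -- For this input the hex value is 616263646566.
  DBMS_OUTPUT.PUT_LINE(RAWTOHEX(v_raw));

  IF v_out = v_in THEN
    DBMS_OUTPUT.PUT_LINE('round trip preserved the string');
  ELSE
    DBMS_OUTPUT.PUT_LINE('round trip changed the string');
  END IF;
END;
/
```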
This function returns a string from an input string that contains escape sequences. It decodes each escape sequence to the corresponding character value.
See Also: "ESCAPE_REFERENCE Function" for more information about escape sequences
UTL_I18N.UNESCAPE_REFERENCE( str IN VARCHAR2 CHARACTER SET ANY_CS) RETURN VARCHAR2 CHARACTER SET str%CHARSET;
Parameter | Description |
---|---|
str | Specifies the input string |
If the user specifies a NULL string or a string whose length is 0, then the function returns a NULL string. If the function fails, then it returns the original string.
UTL_I18N.UNESCAPE_REFERENCE('ab&#xaa;')
This returns 'ab'||chr(170).
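ESCAPE_REFERENCE and UNESCAPE_REFERENCE are intended to be used as a pair. The following minimal sketch escapes a string for a US7ASCII page and then decodes it again; the variable names and messages are illustrative assumptions, and the exact escaped text depends on the database character set, so no output is shown.

```sql
DECLARE
  v_original VARCHAR2(100) := 'ab' || CHR(170);  -- sample input from the examples
  v_escaped  VARCHAR2(400);                      -- escaped form for a US7ASCII page
  v_restored VARCHAR2(100);                      -- decoded form
BEGIN
  v_escaped  := UTL_I18N.ESCAPE_REFERENCE(v_original, 'us7ascii');
  v_restored := UTL_I18N.UNESCAPE_REFERENCE(v_escaped);

  DBMS_OUTPUT.PUT_LINE('escaped: ' || v_escaped);

  -- A NULL result (for example, from an invalid character set) takes the ELSE branch.
  IF v_restored = v_original THEN
    DBMS_OUTPUT.PUT_LINE('round trip preserved the string');
  ELSE
    DBMS_OUTPUT.PUT_LINE('round trip changed the string');
  END IF;
END;
/
```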