Oracle® Text Reference
10g Release 1 (10.1)

Part Number B10730-02

Contents

List of Tables

Title and Copyright Information

Send Us Your Comments

Preface

Audience
Documentation Accessibility
Structure
Related Documentation
Conventions

What's New in Oracle Text?

Oracle Database 10g R1 New Features
Security Improvements
Classification and Clustering
Indexing
Language Features
Querying
Document Services

1 Oracle Text SQL Statements and Operators

ALTER INDEX
ALTER TABLE: Supported Partitioning Statements
CATSEARCH
CONTAINS
CREATE INDEX
DROP INDEX
MATCHES
MATCH_SCORE
SCORE

2 Oracle Text Indexing Elements

2.1 Overview
2.1.1 Creating Preferences
2.2 Datastore Types
2.2.1 DIRECT_DATASTORE
2.2.1.1 DIRECT_DATASTORE CLOB Example
2.2.2 MULTI_COLUMN_DATASTORE
2.2.2.1 Indexing and DML
2.2.2.2 MULTI_COLUMN_DATASTORE Example
2.2.2.3 MULTI_COLUMN_DATASTORE Filter Example
2.2.2.4 Tagging Behavior
2.2.2.5 Indexing Columns as Sections
2.2.3 DETAIL_DATASTORE
2.2.3.1 Synchronizing Master/Detail Indexes
2.2.3.2 Example Master/Detail Tables
2.2.4 FILE_DATASTORE
2.2.4.1 PATH Attribute Limitations
2.2.4.2 FILE_DATASTORE Example
2.2.5 URL_DATASTORE
2.2.5.1 URL Syntax
2.2.5.2 URL_DATASTORE Attributes
2.2.5.3 URL_DATASTORE Example
2.2.6 USER_DATASTORE
2.2.6.1 Constraints
2.2.6.2 Editing Procedure after Indexing
2.2.6.3 USER_DATASTORE with CLOB Example
2.2.6.4 USER_DATASTORE with BLOB_LOC Example
2.2.7 NESTED_DATASTORE
2.2.7.1 NESTED_DATASTORE Example
2.3 Filter Types
2.3.1 CHARSET_FILTER
2.3.1.1 UTF-16 Big- and Little-Endian Detection
2.3.1.2 Indexing Mixed-Character Set Columns
2.3.2 INSO_FILTER
2.3.2.1 Indexing Formatted Documents
2.3.2.2 Explicitly Bypassing Plain Text or HTML in Mixed Format Columns
2.3.2.3 Character Set Conversion With Inso
2.3.3 NULL_FILTER
2.3.3.1 Indexing HTML Documents
2.3.4 MAIL_FILTER
2.3.4.1 Filter Behavior
2.3.4.2 About the Mail Filter Configuration File
2.3.5 USER_FILTER
2.3.5.1 User Filter Example
2.3.6 PROCEDURE_FILTER
2.3.6.1 Parameter Order
2.3.6.2 Procedure Filter Execute Requirements
2.3.6.3 Error Handling
2.3.6.4 Procedure Filter Preference Example
2.4 Lexer Types
2.4.1 BASIC_LEXER
2.4.1.1 Stemming User-Dictionaries
2.4.1.2 BASIC_LEXER Example
2.4.2 MULTI_LEXER
2.4.2.1 Multi-language Stoplists
2.4.2.2 MULTI_LEXER Example
2.4.2.3 Querying Multi-Language Tables
2.4.3 CHINESE_VGRAM_LEXER
2.4.3.1 Character Sets
2.4.4 CHINESE_LEXER
2.4.4.1 Customizing the Chinese Lexicon
2.4.5 JAPANESE_VGRAM_LEXER
2.4.5.1 JAPANESE_VGRAM_LEXER Attribute
2.4.5.2 JAPANESE_VGRAM_LEXER Character Sets
2.4.6 JAPANESE_LEXER
2.4.6.1 Customizing the Japanese Lexicon
2.4.6.2 JAPANESE_LEXER Attribute
2.4.6.3 JAPANESE_LEXER Character Sets
2.4.6.4 Japanese Lexer Example
2.4.7 KOREAN_LEXER
2.4.7.1 KOREAN_LEXER Character Sets
2.4.7.2 KOREAN_LEXER Attributes
2.4.7.3 Limitations
2.4.8 KOREAN_MORPH_LEXER
2.4.8.1 Supplied Dictionaries
2.4.8.2 Supported Character Sets
2.4.8.3 Unicode Support
2.4.8.4 KOREAN_MORPH_LEXER Attributes
2.4.8.5 Limitations
2.4.8.6 KOREAN_MORPH_LEXER Example: Setting Composite Attribute
2.4.9 USER_LEXER
2.4.9.1 Limitations
2.4.9.2 USER_LEXER Attributes
2.4.9.3 INDEX_PROCEDURE
2.4.9.4 INPUT_TYPE
2.4.9.5 QUERY_PROCEDURE
2.4.9.6 Encoding Tokens as XML
2.4.9.7 XML Schema for No-Location, User-defined Indexing Procedure
2.4.9.8 XML Schema for User-defined Indexing Procedure with Location
2.4.9.9 XML Schema for User-defined Lexer Query Procedure
2.4.10 WORLD_LEXER
2.4.10.1 WORLD_LEXER Example
2.5 Wordlist Type
2.5.1 BASIC_WORDLIST
2.5.2 BASIC_WORDLIST Example
2.5.2.1 Enabling Fuzzy Matching and Stemming
2.5.2.2 Enabling Sub-string and Prefix Indexing
2.5.2.3 Setting Wildcard Expansion Limit
2.6 Storage Types
2.6.1 BASIC_STORAGE
2.6.1.1 Storage Default Behavior
2.6.1.2 Storage Example
2.7 Section Group Types
2.7.1 Section Group Examples
2.7.1.1 Creating Section Groups in HTML Documents
2.7.1.2 Creating Section Groups in XML Documents
2.7.1.3 Automatic Sectioning in XML Documents
2.8 Classifier Types
2.8.1 RULE_CLASSIFIER
2.8.2 SVM_CLASSIFIER
2.9 Cluster Types
2.9.1 KMEAN_CLUSTERING
2.10 Stoplists
2.10.1 Multi-Language Stoplists
2.10.2 Creating Stoplists
2.10.3 Modifying the Default Stoplist
2.10.3.1 Dynamic Addition of Stopwords
2.11 System-Defined Preferences
2.11.1 Data Storage
2.11.1.1 CTXSYS.DEFAULT_DATASTORE
2.11.1.2 CTXSYS.FILE_DATASTORE
2.11.1.3 CTXSYS.URL_DATASTORE
2.11.2 Filter
2.11.2.1 CTXSYS.NULL_FILTER
2.11.2.2 CTXSYS.INSO_FILTER
2.11.3 Lexer
2.11.3.1 CTXSYS.DEFAULT_LEXER
2.11.3.2 CTXSYS.BASIC_LEXER
2.11.4 Section Group
2.11.4.1 CTXSYS.NULL_SECTION_GROUP
2.11.4.2 CTXSYS.HTML_SECTION_GROUP
2.11.4.3 CTXSYS.AUTO_SECTION_GROUP
2.11.4.4 CTXSYS.PATH_SECTION_GROUP
2.11.5 Stoplist
2.11.5.1 CTXSYS.DEFAULT_STOPLIST
2.11.5.2 CTXSYS.EMPTY_STOPLIST
2.11.6 Storage
2.11.6.1 CTXSYS.DEFAULT_STORAGE
2.11.7 Wordlist
2.11.7.1 CTXSYS.DEFAULT_WORDLIST
2.12 System Parameters
2.12.1 General System Parameters
2.12.2 Default Index Parameters
2.12.2.1 CONTEXT Index Parameters
2.12.2.2 CTXCAT Index Parameters
2.12.2.3 CTXRULE Index Parameters
2.12.2.4 Viewing Default Values
2.12.2.5 Changing Default Values

3 Oracle Text CONTAINS Query Operators

3.1 Operator Precedence
3.1.1 Group 1 Operators
3.1.2 Group 2 Operators and Characters
3.1.3 Procedural Operators
3.1.4 Precedence Examples
3.1.5 Altering Precedence
ABOUT
ACCUMulate ( , )
AND (&)
Broader Term (BT, BTG, BTP, BTI)
EQUIValence (=)
Fuzzy
HASPATH
INPATH
MDATA
MINUS (-)
Narrower Term (NT, NTG, NTP, NTI)
NEAR (;)
NOT (~)
OR (|)
Preferred Term (PT)
Related Term (RT)
soundex (!)
stem ($)
Stored Query Expression (SQE)
SYNonym (SYN)
threshold (>)
Translation Term (TR)
Translation Term Synonym (TRSYN)
Top Term (TT)
weight (*)
wildcards (% _)
WITHIN

4 Special Characters in Oracle Text Queries

4.1 Grouping Characters
4.2 Escape Characters
4.2.1 Querying Escape Characters
4.3 Reserved Words and Characters

5 CTX_ADM Package

RECOVER
SET_PARAMETER

6 CTX_CLS Package

TRAIN
CLUSTERING

7 CTX_DDL Package

ADD_ATTR_SECTION
ADD_FIELD_SECTION
ADD_INDEX
ADD_MDATA
ADD_MDATA_SECTION
ADD_SPECIAL_SECTION
ADD_STOPCLASS
ADD_STOP_SECTION
ADD_STOPTHEME
ADD_STOPWORD
ADD_SUB_LEXER
ADD_ZONE_SECTION
COPY_POLICY
CREATE_INDEX_SET
CREATE_POLICY
CREATE_PREFERENCE
CREATE_SECTION_GROUP
CREATE_STOPLIST
DROP_INDEX_SET
DROP_POLICY
DROP_PREFERENCE
DROP_SECTION_GROUP
DROP_STOPLIST
OPTIMIZE_INDEX
REMOVE_INDEX
REMOVE_MDATA
REMOVE_SECTION
REMOVE_STOPCLASS
REMOVE_STOPTHEME
REMOVE_STOPWORD
REPLACE_INDEX_METADATA
SET_ATTRIBUTE
SYNC_INDEX
UNSET_ATTRIBUTE
UPDATE_POLICY

8 CTX_DOC Package

FILTER
GIST
HIGHLIGHT
IFILTER
MARKUP
PKENCODE
POLICY_FILTER
POLICY_GIST
POLICY_HIGHLIGHT
POLICY_MARKUP
POLICY_THEMES
POLICY_TOKENS
SET_KEY_TYPE
THEMES
TOKENS

9 CTX_OUTPUT Package

ADD_EVENT
ADD_TRACE
END_LOG
END_QUERY_LOG
GET_TRACE_VALUE
LOG_TRACES
LOGFILENAME
REMOVE_EVENT
REMOVE_TRACE
RESET_TRACE
START_LOG
START_QUERY_LOG

10 CTX_QUERY Package

BROWSE_WORDS
COUNT_HITS
EXPLAIN
HFEEDBACK
REMOVE_SQE
STORE_SQE

11 CTX_REPORT

11.1 Procedures in CTX_REPORT
11.2 Using the Function Versions
DESCRIBE_INDEX
DESCRIBE_POLICY
CREATE_INDEX_SCRIPT
CREATE_POLICY_SCRIPT
INDEX_SIZE
INDEX_STATS
QUERY_LOG_SUMMARY
TOKEN_INFO
TOKEN_TYPE

12 CTX_THES Package

ALTER_PHRASE
ALTER_THESAURUS
BT
BTG
BTI
BTP
CREATE_PHRASE
CREATE_RELATION
CREATE_THESAURUS
CREATE_TRANSLATION
DROP_PHRASE
DROP_RELATION
DROP_THESAURUS
DROP_TRANSLATION
HAS_RELATION
NT
NTG
NTI
NTP
OUTPUT_STYLE
PT
RT
SN
SYN
THES_TT
TR
TRSYN
TT
UPDATE_TRANSLATION

13 CTX_ULEXER Package

WILDCARD_TAB

14 Oracle Text Executables

14.1 Thesaurus Loader (ctxload)
14.1.1 Text Loading
14.1.2 ctxload Syntax
14.1.2.1 Mandatory Arguments
14.1.2.2 Optional Arguments
14.1.3 ctxload Examples
14.1.3.1 Thesaurus Import Example
14.1.3.2 Thesaurus Export Example
14.2 Knowledge Base Extension Compiler (ctxkbtc)
14.2.1 Knowledge Base Character Set
14.2.2 ctxkbtc Syntax
14.2.3 ctxkbtc Usage Notes
14.2.4 ctxkbtc Limitations
14.2.5 ctxkbtc Constraints on Thesaurus Terms
14.2.6 ctxkbtc Constraints on Thesaurus Relations
14.2.7 Extending the Knowledge Base
14.2.7.1 Example for Extending the Knowledge Base
14.2.8 Adding a Language-Specific Knowledge Base
14.2.8.1 Limitations for Adding a Knowledge Base
14.2.9 Order of Precedence for Multiple Thesauri
14.2.10 Size Limits for Extended Knowledge Base
14.3 Lexical Compiler (ctxlc)
14.3.1 Syntax of ctxlc
14.3.1.1 Mandatory Arguments
14.3.1.2 Optional Arguments
14.3.2 Performance Considerations
14.3.3 ctxlc Usage Notes
14.3.4 Example

15 Oracle Text Alternative Spelling

15.1 Overview of Alternative Spelling Features
15.1.1 Alternate Spelling
15.1.2 Base-Letter Conversion
15.1.2.1 Generic Versus Language-Specific Base-Letter Conversions
15.1.3 New German Spelling
15.2 Overriding Alternative Spelling Features
15.2.1 Overriding Base-Letter Transformations with Alternate Spelling
15.3 Alternative Spelling Conventions
15.3.1 German Alternate Spelling Conventions
15.3.2 Danish Alternate Spelling Conventions
15.3.3 Swedish Alternate Spelling Conventions

A Oracle Text Result Tables

A.1 CTX_QUERY Result Tables
A.1.1 EXPLAIN Table
A.1.1.1 Operation Column Values
A.1.1.2 OPTIONS Column Values
A.1.2 HFEEDBACK Table
A.1.2.1 Operation Column Values
A.1.2.2 OPTIONS Column Values
A.1.2.3 CTX_FEEDBACK_TYPE
A.2 CTX_DOC Result Tables
A.2.1 Filter Table
A.2.2 Gist Table
A.2.3 Highlight Table
A.2.4 Markup Table
A.2.5 Theme Table
A.2.6 Token Table
A.3 CTX_THES Result Tables and Data Types
A.3.1 EXP_TAB Table Type

B Oracle Text Supported Document Formats

B.1 About Document Filtering Technology
B.1.1 Latest Updates for Patch Releases
B.1.2 Supported Platforms
B.1.2.1 Supported Platforms
B.1.3 Environment Variables
B.1.4 Requirements for UNIX Platforms
B.2 Supported Document Formats
B.2.1 Word Processing Formats - Generic Text
B.2.2 Word Processing Formats - DOS
B.2.3 Word Processing Formats - Windows
B.2.4 Word Processing Formats - Macintosh
B.2.5 Spreadsheet Formats
B.2.6 Database Formats
B.2.7 Display Formats
B.2.8 Presentation Formats
B.2.9 Graphic Formats
B.2.10 Other Document Formats
B.3 Restrictions on Format Support

C Text Loading Examples for Oracle Text

C.1 SQL INSERT Example
C.2 SQL*Loader Example
C.2.1 Creating the Table
C.2.2 Issuing the SQL*Loader Command
C.2.2.1 Example Control File: loader1.dat
C.2.2.2 Example Data File: loader2.dat
C.3 Structure of ctxload Thesaurus Import File
C.3.1 Alternate Hierarchy Structure
C.3.2 Usage Notes for Terms in Import Files
C.3.3 Usage Notes for Relationships in Import Files
C.3.4 Examples of Import Files
C.3.4.1 Example 1 (Flat Structure)
C.3.4.2 Example 2 (Hierarchical)
C.3.4.3 Example 3

D Oracle Text Multilingual Features

D.1 Introduction
D.2 Indexing
D.2.1 Index Types
D.2.1.1 CONTEXT Index Type
D.2.1.2 CTXCAT Index Type
D.2.1.3 CTXRULE Index Type
D.2.2 Lexer Types
D.2.3 Basic Lexer Features
D.2.3.1 Theme Indexing
D.2.3.2 Alternate Spelling
D.2.3.3 Base Letter Conversion
D.2.3.4 Composite
D.2.3.5 Index stems
D.2.4 Multi Lexer Features
D.2.5 World Lexer Features
D.3 Querying
D.3.1 ABOUT Operator
D.3.2 Fuzzy Operator
D.3.3 Stem Operator
D.4 Supplied Stop Lists
D.5 Knowledge Base
D.5.1 Knowledge Base Extension
D.6 Multi-Lingual Features Matrix

E Oracle Text Supplied Stoplists

E.1 English Default Stoplist
E.2 Chinese Stoplist (Traditional)
E.3 Chinese Stoplist (Simplified)
E.4 Danish (dk) Default Stoplist
E.5 Dutch (nl) Default Stoplist
E.6 Finnish (sf) Default Stoplist
E.7 French (f) Default Stoplist
E.8 German (d) Default Stoplist
E.9 Italian (i) Default Stoplist
E.10 Portuguese (pt) Default Stoplist
E.11 Spanish (e) Default Stoplist
E.12 Swedish (s) Default Stoplist

F The Oracle Text Scoring Algorithm

F.1 Scoring Algorithm for Word Queries
F.1.1 Example
F.1.2 DML and Scoring

G Oracle Text Views

G.1 CTX_CLASSES
G.2 CTX_INDEXES
G.3 CTX_INDEX_ERRORS
G.4 CTX_INDEX_OBJECTS
G.5 CTX_INDEX_PARTITIONS
G.6 CTX_INDEX_SETS
G.7 CTX_INDEX_SET_INDEXES
G.8 CTX_INDEX_SUB_LEXERS
G.9 CTX_INDEX_SUB_LEXER_VALUES
G.10 CTX_INDEX_VALUES
G.11 CTX_OBJECTS
G.12 CTX_OBJECT_ATTRIBUTES
G.13 CTX_OBJECT_ATTRIBUTE_LOV
G.14 CTX_PARAMETERS
G.15 CTX_PENDING
G.16 CTX_PREFERENCES
G.17 CTX_PREFERENCE_VALUES
G.18 CTX_SECTIONS
G.19 CTX_SECTION_GROUPS
G.20 CTX_SQES
G.21 CTX_STOPLISTS
G.22 CTX_STOPWORDS
G.23 CTX_SUB_LEXERS
G.24 CTX_THESAURI
G.25 CTX_THES_PHRASES
G.26 CTX_TRACE_VALUES
G.27 CTX_USER_INDEXES
G.28 CTX_USER_INDEX_ERRORS
G.29 CTX_USER_INDEX_OBJECTS
G.30 CTX_USER_INDEX_PARTITIONS
G.31 CTX_USER_INDEX_SETS
G.32 CTX_USER_INDEX_SET_INDEXES
G.33 CTX_USER_INDEX_SUB_LEXERS
G.34 CTX_USER_INDEX_SUB_LEXER_VALS
G.35 CTX_USER_INDEX_VALUES
G.36 CTX_USER_PENDING
G.37 CTX_USER_PREFERENCES
G.38 CTX_USER_PREFERENCE_VALUES
G.39 CTX_USER_SECTIONS
G.40 CTX_USER_SECTION_GROUPS
G.41 CTX_USER_SQES
G.42 CTX_USER_STOPLISTS
G.43 CTX_USER_STOPWORDS
G.44 CTX_USER_SUB_LEXERS
G.45 CTX_USER_THESAURI
G.46 CTX_USER_THES_PHRASES
G.47 CTX_VERSION

H Stopword Transformations in Oracle Text

H.1 Understanding Stopword Transformations
H.1.1 Word Transformations
H.1.2 AND Transformations
H.1.3 OR Transformations
H.1.4 ACCUMulate Transformations
H.1.5 MINUS Transformations
H.1.6 NOT Transformations
H.1.7 EQUIValence Transformations
H.1.8 NEAR Transformations
H.1.9 Weight Transformations
H.1.10 Threshold Transformations
H.1.11 WITHIN Transformations

Index