1 Introduction
Oracle Data Mining (ODM) embeds data mining in the Oracle database. The data never leaves the database: the data, data preparation, model building, and model scoring activities all remain in the database. This enables Oracle to provide an infrastructure for data analysts and application developers to integrate data mining seamlessly with database applications.
Oracle Data Mining is designed for programmers, systems analysts, project managers, and others interested in developing database applications that use data mining to discover hidden patterns and use that knowledge to make predictions.
There are two interfaces: a Java API and a PL/SQL API. The Java API assumes a working knowledge of Java, and the PL/SQL API assumes a working knowledge of PL/SQL. Both interfaces assume a working knowledge of application programming and familiarity with SQL to access information in relational database systems.
This document describes how to use the Java and PL/SQL interfaces to write application programs that use data mining. It is organized as follows:
- Chapter 1 introduces ODM.
- Chapter 2 and Chapter 3 describe the Java interface. Chapter 2 provides an overview; Chapter 3 provides details. Reference information for methods and classes is available with Javadoc. The demo Java programs are described in Table 3-1. The demo programs are available as part of the installation; see the README file for details.
- Chapter 4 and Chapter 5 describe the PL/SQL interface. Basics are described in Chapter 4, and demo PL/SQL programs are described in Chapter 5.
- Reference information for the PL/SQL functions and procedures is included in the PL/SQL Packages and Types Reference. The demo programs themselves are available as part of the installation; see the README file for details.
- Chapter 6 describes programming with BLAST, a set of table functions for performing sequence matching searches against nucleotide and amino acid sequence data stored in an Oracle database.
- Chapter 7 describes how to use the PL/SQL interface to do text mining.
- Appendix A contains an example of binning.
- Appendix B provides tips and techniques useful in both the Java and the PL/SQL interface.
1.1 ODM Requirements and Constraints
Anyone writing an Oracle Data Mining program must observe the following requirements and constraints:
- Attribute Names in ODM: All attribute names in ODM are case-sensitive and limited to 30 bytes in length. Attribute names may be quoted strings that contain mixed-case characters, special characters, or both. In short, attribute names used by ODM follow the same naming conventions and restrictions as column names or type attribute names in Oracle.
- Mining Object Names in ODM: All mining object names in ODM are 25 or fewer bytes in length and must be uppercase only. Model names may contain the underscore ("_") but no other special characters. Certain prefixes are reserved by ODM (see below) and should not be used in mining object names.
- ODM Reserved Prefixes: The prefixes DM$ and DM_ are reserved for use by ODM across all schema object names in a given Oracle instance.
Users must not directly access ODM internal objects named with these prefixes; that is, they should not execute any DDL, query, or DML statements directly against such objects. Oracle recommends that you rename any existing objects in the database that use these prefixes, to avoid confusion in your application data management.
- Input Data for Programs Using ODM: All input data for ODM programs must be presented to ODM as an Oracle-recognized table, whether a view, table, or table function output.
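The naming rules above can be checked on the client side before a name is passed to ODM. The following is a minimal sketch only, not part of the ODM Java API: the class OdmNameCheck and its methods are hypothetical names introduced for illustration. It assumes that digits are acceptable in mining object names (the rules above mention only uppercase letters and the underscore), that reserved prefixes should be matched without regard to case, and that byte length is measured in UTF-8; the actual byte length depends on the database character set.

```java
import java.nio.charset.StandardCharsets;
import java.util.Locale;

// Hypothetical helper for illustration; not part of the ODM Java API.
final class OdmNameCheck {
    static final int MAX_MINING_OBJECT_NAME_BYTES = 25;
    static final int MAX_ATTRIBUTE_NAME_BYTES = 30;

    private OdmNameCheck() {}

    // True if the name starts with a prefix reserved by ODM (DM$ or DM_).
    // Assumption: the comparison ignores case.
    static boolean hasReservedPrefix(String name) {
        String upper = name.toUpperCase(Locale.ROOT);
        return upper.startsWith("DM$") || upper.startsWith("DM_");
    }

    // Checks a mining object name: uppercase only, underscore as the only
    // special character, at most 25 bytes, no reserved prefix.
    // Assumption: digits 0-9 are allowed.
    static boolean isValidMiningObjectName(String name) {
        if (name == null || name.isEmpty()) {
            return false;
        }
        if (hasReservedPrefix(name)) {
            return false;
        }
        for (int i = 0; i < name.length(); i++) {
            char c = name.charAt(i);
            boolean allowed = (c >= 'A' && c <= 'Z')
                    || (c >= '0' && c <= '9')
                    || c == '_';
            if (!allowed) {
                return false;
            }
        }
        // Every accepted character is ASCII, so characters equal bytes here.
        return name.length() <= MAX_MINING_OBJECT_NAME_BYTES;
    }

    // Checks only the 30-byte limit for attribute names; mixed case and
    // special characters are permitted in attribute names.
    // Assumption: bytes are counted in UTF-8.
    static boolean fitsAttributeNameLength(String name) {
        return name != null
                && !name.isEmpty()
                && name.getBytes(StandardCharsets.UTF_8).length <= MAX_ATTRIBUTE_NAME_BYTES;
    }

    public static void main(String[] args) {
        String[] candidates = {"CHURN_MODEL_1", "churn_model", "DM_MODEL"};
        for (String candidate : candidates) {
            System.out.println(candidate + " -> " + isValidMiningObjectName(candidate));
        }
    }
}
```

A check of this kind only catches naming mistakes early; the database remains the authority on which names it accepts.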