Oracle Data Mining Application Developer's Guide 10g Release 1 (10.1) Part Number B10699-01
This appendix provides a detailed example of binning.
Table A-1 displays original data before binning. Table A-2 shows the bin boundaries for numeric data; Table A-3 shows bin boundaries for categorical data. Table A-4 shows the results of binning.
Table A-1 Binning Illustration: Data before Binning

PERSON_ID | AGE | WORK CLASS | WEIGHT | EDUCATION | MARITAL_STATUS | OCCUPATION |
---|---|---|---|---|---|---|

Table A-2 Binning Illustration: Bin Boundaries for Numeric Data

COLUMN_NAME | LOWER_BOUNDARY | UPPER_BOUNDARY | BIN_ID | DISPLAY_NAME |
---|---|---|---|---|
The Java interface supports automated binning. An important advantage of automated binning is that it allows ODM to handle raw data. Automated binning also allows initial exploration of problems about which there is little or no information to guide binning decisions.
Currently, automated binning requires closed intervals for numerical bins, which can cause certain values to be ignored. For example, if the salary range in the build data table is 0 to 1,000,000, any salary greater than 1,000,000 is ignored when the model is applied. If you are trying to identify likely purchasers of a high-end consumer product, the attribute values that indicate the wealthiest individuals are likely to be dropped, and you probably won't find the best targets. Manual binning offers the option of making the extreme bins open-ended, that is, giving them infinite boundaries.
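
The difference between closed and open-ended bins can be illustrated with a minimal sketch. This is not the ODM API: the function `assign_bin`, the three-bin layout, and the boundary values other than the 0 to 1,000,000 range are hypothetical and chosen only for illustration. The sketch assumes each bin includes its lower boundary, and that the last bin also includes its upper boundary.

```python
import math

# Hypothetical bins for a salary attribute whose build data spans 0 to 1,000,000.
# Each tuple is (lower_boundary, upper_boundary); bin IDs are 1-based positions.
CLOSED_BINS = [(0, 250_000), (250_000, 500_000), (500_000, 1_000_000)]

# The same layout with the extreme bins made open-ended (infinite boundaries),
# as manual binning allows.
OPEN_ENDED_BINS = [(-math.inf, 250_000), (250_000, 500_000), (500_000, math.inf)]


def assign_bin(value, bins):
    """Return the 1-based ID of the bin containing value, or None if no bin does.

    Assumed convention: lower <= value < upper, except that the last bin
    also includes its upper boundary.
    """
    last = len(bins)
    for bin_id, (lower, upper) in enumerate(bins, start=1):
        if lower <= value < upper or (bin_id == last and value == upper):
            return bin_id
    return None  # the value lies outside every bin and would be ignored


if __name__ == "__main__":
    salary = 1_500_000  # larger than anything seen in the build data
    print(assign_bin(salary, CLOSED_BINS))      # no bin matches
    print(assign_bin(salary, OPEN_ENDED_BINS))  # falls into the top bin
```

With closed intervals, the salary of 1,500,000 matches no bin and is lost at apply time; with open-ended extreme bins, it is captured by the top bin.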