Oracle® OLAP DML Reference 10g Release 1 (10.1) Part Number B10339-02 |
|
|
View PDF |
The ROLLUP command calculates totals for a hierarchy of values where each level of the hierarchy is an aggregation of the values in the level below it.
The ROLLUP command only performs simple sum aggregation. Additionally, it only aggregates data when the members of the hierarchy are contained in a single rollup or "embedded-total" dimension, so called because it contains both a detail (lowest) level and levels that are aggregations of lower levels. A relation between the embedded-total dimension and itself, called a "parent relation," specifies the arrangement of the hierarchy. For each value of the dimension, the parent relation contains the value that is immediately above it in the hierarchy (that is, its "parent" value).
Before using ROLLUP, make sure that the data variable that is dimensioned by the embedded-total dimension has data for the lowest-level values in the hierarchy. ROLLUP uses the data at the lowest level to calculate the totals for the higher levels.
Note: Most applications aggregate data using an aggmap object rather than using the ROLLUP command. Aggmap objects allow you to write complex aggregation specifications. See "Aggregations" for more information. |
Syntax
ROLLUP data [OVER embed-tot-dim] [USING parent-rel] [ZEROFILL]
Arguments
A numeric variable whose values are to be rolled up. When the variable has more than one dimension, one of them must be the OVER dimension.
A dimension of data whose values form a hierarchy. When data has only one dimension, then that dimension is the OVER dimension by default and you can omit the OVER phrase.
A relation between the OVER dimension and itself, called a parent relation, that specifies a hierarchy among the dimension values. For each dimension value, this parent relation specifies another value of the dimension which is its immediate parent. The parent relation holds NAs for the values at the highest level of the hierarchy. When there is more than one relation between the OVER dimension and itself, then you must specify the relation you want to use as the parent relation.
Specifies that parent totals should be set to zero when all of their child values are NA
. When you do not specify ZEROFILL, ROLLUP sets parent totals to NA
when all of their child values are NA
.
Notes
In the hierarchy specified by the parent relation, you can think of the lowest level as the "child" values, all the other values as "parents," and each level as a "generation." The relation specifies the parent at the next higher level for each dimension value. The following example shows the values of an embedded-total rollup dimension called area
, that has three levels, and the values of the child-parent relation area.area
.
AREA AREA.AREA ---------- ---------- Level 1 -> Totalus NA 2 -> East Totalus 3 -> Boston East 3 -> Newyork East 2 -> South Totalus 3 -> Atlanta South
A hierarchy can consist of several trees, so that there is more than one value at level 1. The value of the relation will be NA
for all level-1 values, because these values have no parent in the hierarchy.
ROLLUP always looks to the lowest level of a hierarchy to calculate results. It rolls up only from the child values that are in the status list for the rollup dimension, but it still rolls up through all the levels in the hierarchy.
For example, suppose you use the area
dimension and the area.area
child-parent relation described in "Generation Levels in a Parent Relation", and you change the data value for NewYork
. When you then roll up with only the child values for East
in the status list (Boston
and NewYork), the rollup occurs without including the child value for South
(Atlanta
), but still includes level 2 as it goes from level 3 to level 1 (TotalUS
). When you want all the child values included in rolling up to TotalUS
, you must explicitly include all of them in the status list. In the example, you would limit area
and add Atlanta
to the status list.
When the data has changed for some, but not all, of the child values in the embedded-total rollup dimension, you can set the status to roll up just the values that have changed. For example, assume your embedded-total dimension is called d2
, and its parent relation is called reld2
, first limit d2
to the values that have changed. Then use the following statements to add the appropriate additional values to the status list.
LIMIT d2 ADD ANCESTORS USING reld2 LIMIT d2 ADD CHILDREN USING reld2
When the data variable being rolled up has more than one dimension, then the dimensions other than the rollup dimension are treated "normally." ROLLUP loops over their status and repeats the aggregation for each of their values.
ROLLUP automatically distinguishes between generations in the parent relation, even to the extent of allowing generation-skipping hierarchies. For example, you can have a four-level hierarchy (for example, neighborhoods, cities, states, and total U.S.) that has a three-level branch (for example, Boston, Massachusetts, and total U.S.).
Because ROLLUP automatically distinguishes parent values from child values, you can have all the values of the embedded-total rollup dimension in the status list when you execute ROLLUP.
When a variable includes a composite in its dimension list, you cannot roll up the data over the composite. However, you can roll up data over a base dimension of the composite. You specify which base dimension to loop over with the OVER keyword. The parent relation must be a relation between the base dimension and itself. When the composite is missing a value that is required for the rollup, ROLLUP will create the missing value.
When your data is sparse, it is usually better to define a variable with a composite than with a conjoint dimension. However, in some situations you might find it advantageous to convert the composite to a conjoint dimension before you execute a ROLLUP command. When a variable is dimensioned by a conjoint dimension, you can define a parent relation between the conjoint dimension and itself and roll up the data over the conjoint dimension.
To convert a composite to a conjoint dimension, use the CHGDFN command.
When a variable is dimensioned by a conjoint dimension, you can roll up the data over the conjoint dimension, but you cannot roll up over a base dimension of the conjoint. The parent relation must be a relation between the conjoint dimension and itself.
You can have a multidimensional parent relation that defines more than one hierarchy, so that child values contribute to more than one higher-level total. However, at each level, the hierarchies should each point to a separate higher-level total, so that the data is not counted more than once at the higher level. See Chapter 21, "Using a Multidimensional Relation".
Note: A multidimensional parent relation cannot share any dimensions with the data variable other than the embedded-total dimension. |
The parent relation must not create a circular hierarchy. That is, the relation must not contain any dimension values that are their own parent, either directly or indirectly. A parent relation that creates a circular hierarchy would put the calculations of ROLLUP into an infinite loop. In your application, you should ensure that your hierarchies are not circular. To do so, use the HIERCHECK program to check every parent relation in your analytic workspace for circularity. You can use HIERCHECK either as a command or as a function.
When you feel that the ROLLUP command is taking longer than expected, consider the following strategies:
When your data is sparse, define the data using a composite. Use only one composite for each variable, and be sure to make the composite the slowest-varying dimension (the last dimension in the list). It is a good idea to keep the first dimension in the list dense, and put all the other dimensions into a single composite.
Use the CHGDFN command with the SEGWIDTH keyword to specify the size of a variable's segments.
Check the parent relation to make sure it does not define a circular hierarchy. See "Avoiding Circular Hierarchies".
Use the LIMIT command to limit the amount of data being rolled up at one time. You can design a program that explicitly loops through a series of subsets of the desired data.
Set the dimension status to selectively roll up the data. For example, when only some values of a variable have changed, you only need to roll up the data over the ancestors of those values. See "Rolling Up from Changed Child Values".
Execute an UPDATE command after every ROLLUP command.
Examples
Example 21-7 Rolling up Sales Data
This example illustrates the use of ROLLUP. You can create an embedded-total geography
dimension by combining the values in the district
and region
dimensions. Another value, TotalUS
, is the parent of the regions. The order of the dimension values does not matter because the parent relation (that you define later) provides the parent-child information. After the following statements are executed.
DEFINE geography DIMENSION TEXT MAINTAIN geography ADD 'TotalUS' VALUES(region) - VALUES(district) REPORT geography
The following report is created.
GEOGRAPHY ----------- TotalUS East Central West Boston Atlanta Chicago Dallas Denver Seattle
Next, create the child-parent relation, geog.geog
, which is the relation between geography
and itself.
DEFINE geog.geog RELATION geography <geography>
Each value of the geog.geog
relation should be the parent of the corresponding geography
value. You can add the values that are shown in the following report.
GEOGRAPHY GEOG.GEOG -------------- ---------- TotalUS NA East TotalUS Central TotalUS West TotalUS Boston East Atlanta East Chicago Central Dallas Central Denver West Seattle West
Finally, you can define a variable, g.units
, for the data that is currently held in the units
variable plus the totals for the higher levels. After limiting geography
to the values of district
, you can transfer the units
data to g.units
and use ROLLUP to fill in the totals.
DEFINE g.units INTEGER <month product geography> LIMIT geography TO VALUES(district NOSTATUS) g.units = UNRAVEL(units) ROLLUP g.units OVER geography USING geog.geog LIMIT geography TO ALL LIMIT product TO 'Tents' LIMIT month TO 'Jan95' TO 'Jul95' REPORT W 9 DOWN geography W 9 geog.geog ACROSS month: W 5 - g.units
The preceding statements produce the following output.
PRODUCT: TENTS ------------------------G.UNITS---------------------- -------------------------MONTH------------------------ GEOGRAPHY GEOG.GEOG Jan95 Feb95 Mar95 Apr95 May95 Jun95 Jul95 --------- --------- ----- ----- ----- ----- ----- ----- ----- TotUS NA 1,429 1,440 1,860 2,534 3,378 3,779 4,058 East TotUS 453 479 589 848 1,092 1,248 1,315 Central TotUS 478 494 666 848 1,137 1,247 1,360 West TotUS 498 467 605 838 1,149 1,284 1,383 Boston East 200 203 269 359 507 556 545 Atlanta East 253 276 320 489 585 692 770 Chicago Central 181 181 247 304 416 443 461 Dallas Central 297 313 419 544 721 804 899 Denver West 227 210 283 358 497 573 642 Seattle West 271 257 322 480 652 711 741
Example 21-8 Using a Multidimensional Relation
In this example, we have defined a new dimension called area
that includes the values in the geography
dimension that was created in "Rolling up Sales Data". In addition, area
includes European and Asian regions and countries that roll up into these regions.
There is also a multidimensional parent relation named area.area
that defines two hierarchies. The relation area.area
has area
for one of its dimensions, while its other dimension, hier
, holds a list of hierarchies. One of these hierarchies, Nation
, specifies continental and global totals. The second hierarchy, Corporate
, divides the child values into divisions and groups of divisions. When an area
value is not part of a hierarchy, area.areahier
has an NA
value for that area. The area.area
relation also has an NA
for the top-level area in each hierarchy, since the top level has no parent value. Executing a DESCRIBE area HIER a.ah
statement
shows the following definitions that have been created in the analytic workspace.
DEFINE area DIMENSION TEXT DEFINE hier DIMENSION TEXT DEFINE area.area RELATION area <area hier>
Assume that a REPORT DOWN area W 20 area.area
statement executes.
The resulting report shows the values of the multidimensional parent relation area.area
.
---------------AREA.AREA----------------- ------------------HIER------------------- AREA NATION CORPORATE -------------- -------------------- -------------------- Global NA NA GroupI NA Global GroupII NA Global DivI NA GroupI DivII NA GroupI DivIII NA GroupII TotalUS Global NA TotInternation Global NA TotalEurope TotInternation NA Germany TotalEurope Divii England TotalEurope DivI Spain TotalEurope Diviii France TotalEurope Diviii TotalAsia TotInternation NA India TotalAsia DivI Malaysia TotalAsia Diviii East TotalUS NA Central TotalUS NA West TotalUS NA Boston East DivI Atlanta East DivI Chicago Central DivI Dallas Central DivI Denver West DivI Seattle West DivI
The analytic workspace also contains a variable named a.units
that has area
as one of its dimensions. Its definition is as follows.
DEFINE a.units INTEGER <month product area>
After data has been loaded into a.units
for the lowest level areas (the districts in the United States and the countries of Europe and Asia), you can execute the following ROLLUP command to roll up the data and fill in the totals in the a.units
variable. The command rolls up data over the area
dimension, using the multidimensional parent relation area.area
. This aggregates data both in the Nation
hierarchy and in the Corporate
hierarchy.
ROLLUP a.units OVER area USING area.area
When you use the following statements to produce a report of the a.units
data, the data for each of the two hierarchies (Nation
and Corporate
) will be shown on separate pages of the report.
FOR hier DO LIMIT area TO a.ah NE NA LIMIT area ADD 'Global' REPORT DOWN area W 14 area.area ACROSS month: W 7 a.units DOEND