Grouping

This option lets you create or modify the grouping of a field's values.

For discrete fields

Start by using the (+) button to add a new group to the Groups list at the top. Then select any of the Unused values and use the arrow keys to move them into the Included values list.

You can move values between groups, or back to the Unused values list, at any time. You can also delete a whole group, in which case all of its values are moved back into the Unused values list.
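The relationship between groups and the Unused values list can be pictured with a short sketch. This is only an illustration and not the product's implementation: the function names, the field values and the group names are all hypothetical.

```python
def unused_values(all_values, groups):
    """Return the values not assigned to any group, in their original order."""
    assigned = {v for members in groups.values() for v in members}
    return [v for v in all_values if v not in assigned]


def delete_group(groups, name):
    """Return a copy without the named group; its values become unused again."""
    remaining = dict(groups)
    remaining.pop(name, None)
    return remaining


# Hypothetical discrete field and grouping.
colours = ["red", "green", "blue", "black", "white"]
groups = {"Primary": ["red", "green", "blue"], "Neutral": ["black"]}

print(unused_values(colours, groups))
print(unused_values(colours, delete_group(groups, "Neutral")))
```

In this sketch the unused pool is never stored; it is simply whatever is left over, which is why deleting a group returns its values to the pool automatically.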

For numeric (or date/time) fields

To create numeric ranges, type the threshold value where you want to split and press the Add button. To change a range you have already defined, select it, type the new value in the thresholds box, and press Update.
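One way to think about numeric ranges is as a sorted list of thresholds, where N thresholds define N + 1 ranges. The sketch below illustrates that idea only. The function names are hypothetical, and it assumes that a value equal to a threshold belongs to the upper range, which this page does not specify.

```python
from bisect import bisect_right, insort


def add_threshold(thresholds, value):
    """Insert a split point, keeping the list sorted and free of duplicates."""
    if value not in thresholds:
        insort(thresholds, value)


def update_threshold(thresholds, old, new):
    """Replace one split point with another."""
    thresholds.remove(old)
    add_threshold(thresholds, new)


def range_index(thresholds, x):
    """Index of the range containing x (assumption: lower bound inclusive)."""
    return bisect_right(thresholds, x)


thresholds = []
add_threshold(thresholds, 30)      # like typing 30 and pressing Add
add_threshold(thresholds, 21)
update_threshold(thresholds, 21, 18)  # like selecting 21 and pressing Update

print(thresholds)                  # [18, 30]
print(range_index(thresholds, 25)) # 1, the middle of three ranges
```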

Monotonic field: If Monotonic is selected (along with Use grouping), the induction process will only create branches that preserve the order in which the groups are shown in the group name list.

In other words, the tree can combine groups into branches only by choosing split points in the ordered list of groups. For example, age groups 18-21, 21-30, and 30+ might be groups 1, 2 and 3. Monotonic would allow a two-way split as (1,2) and (3), but not as (1,3) and (2).
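The monotonic rule can be sketched as follows. This is an illustration of the constraint, not the induction algorithm itself, and the function names are hypothetical. Groups are identified by their position in the ordered list, and each branch is assumed to be non-empty.

```python
def monotonic_two_way_splits(groups):
    """List every two-way split that cuts the ordered groups at one point."""
    return [(groups[:i], groups[i:]) for i in range(1, len(groups))]


def is_monotonic_split(branches):
    """True if each branch holds a contiguous run of group positions."""
    return all(
        sorted(b) == list(range(min(b), max(b) + 1)) for b in branches
    )


print(monotonic_two_way_splits([1, 2, 3]))
# [([1], [2, 3]), ([1, 2], [3])]

print(is_monotonic_split([[1, 2], [3]]))  # True
print(is_monotonic_split([[1, 3], [2]]))  # False
```

With three ordered groups there are only two permitted two-way splits, because each one corresponds to a single cut point in the list.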

Uniform Grouping: This button can be used with numeric grouping to automatically segment the field into N equal ranges, calculated from the average plus or minus two times the standard deviation. Based on the same calculation, selecting the Separate Extreme Values check box places values at either extreme into additional groups, which may therefore produce one or more groups in addition to the number N specified.

Note that the ranges are produced by dividing the range of the field's values, not by dividing the frequencies of the values, so the resulting groups can contain very different numbers of records.
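The exact calculation is not spelled out here, so the following is only one plausible reading: N equal-width ranges spread across the interval from the average minus two standard deviations to the average plus two standard deviations, with the interval's end points becoming extra split points when extreme values are separated. The function name, the parameter names and the use of the population standard deviation are all assumptions.

```python
from statistics import mean, pstdev


def uniform_thresholds(values, n, separate_extremes=False):
    """Split points for n equal-width ranges over mean +/- 2 std deviations."""
    centre, spread = mean(values), pstdev(values)
    low, high = centre - 2 * spread, centre + 2 * spread
    width = (high - low) / n
    inner = [low + i * width for i in range(1, n)]
    # Assumption: with separate_extremes, the outer bounds become split
    # points too, so values beyond them fall into additional groups.
    return [low] + inner + [high] if separate_extremes else inner


# Hypothetical sample with average 5 and standard deviation 2.
sample = [2, 4, 4, 4, 5, 5, 7, 9]

print(uniform_thresholds(sample, 4))
print(uniform_thresholds(sample, 4, separate_extremes=True))
```

Because the split points depend only on the values and not on how many records fall in each range, a skewed field can end up with most of its records in one or two groups.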

Use grouping

This check box MUST be selected to apply your grouping during mining. When the Use grouping check box is deselected, the definition of the groups for the attribute will be retained but will not be used when mining the tree.

You must ensure that your grouping does not leave any 'unused' discrete values or numeric ranges. The OK button is disabled while any remain, and you must correct the situation before you can close the dialog with your changes.