 # Outline. Conceptualization of diversity with unbalanced hierarchies.


Outline

Conceptualization of diversity with unbalanced hierarchies

Lower-bound balanced frequent pattern: Consider an unbalanced frequent pattern Y = {i1, i2, ..., in} with n items and a concept hierarchy of height h, and let h(i1), h(i2), ..., h(in) be the heights of the corresponding items. Let h(ij) be the least value among these heights. The lower-bound balanced frequent pattern is obtained when all the items in Y are brought to the same level h(ij).

Construction: Given an unbalanced frequent pattern, first calculate the height of the item which lies highest in the concept hierarchy (the one closest to the root). Let this height be h(l). Remove all the extra edges below level h(l). Replace each item that lies below level h(l) by its ancestor at level h(l), with duplicates removed.
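The construction above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the hierarchy is assumed to be stored as a child-to-parent map, the height of an item is assumed to be its number of edges from the root, and the names `depth`, `lower_bound_balanced`, `toy_parent`, and `p12` are hypothetical.

```python
def depth(item, parent):
    """Number of edges from the root to `item` (the root has depth 0)."""
    d = 0
    while item in parent:
        item = parent[item]
        d += 1
    return d


def lower_bound_balanced(pattern, parent):
    """Lift every item to the level of the shallowest item in the pattern."""
    depths = {item: depth(item, parent) for item in pattern}
    target = min(depths.values())  # h(l): level of the item closest to the root
    balanced = []
    for item in pattern:
        node = item
        # Walk up until the node sits at the target level.
        for _ in range(depths[item] - target):
            node = parent[node]
        if node not in balanced:  # duplicates removed
            balanced.append(node)
    return balanced


# Hypothetical toy hierarchy: i1 and i2 share the parent p12 at level 1,
# while i3 already sits at level 1.
toy_parent = {"p12": "root", "i3": "root", "i1": "p12", "i2": "p12"}
print(lower_bound_balanced(["i1", "i2", "i3"], toy_parent))  # ['p12', 'i3']
```

In this toy case i1 and i2 collapse into their common parent, which is why the balanced pattern has fewer items than the original.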

Upper-bound balanced frequent pattern: Consider an unbalanced frequent pattern Y = {i1, i2, ..., in} with n items and a concept hierarchy of height h, and let h(i1), h(i2), ..., h(in) be the heights of the corresponding items. Let h(ij) be the highest value among these heights. An upper-bound balanced frequent pattern is obtained when all the items in Y are brought down to the same level h(ij).

Construction: First calculate the heights of the items which lie highest and deepest in the concept hierarchy. Let these heights be h(il) and h(id) respectively. Starting from level h(il), keep adding one dummy edge for each imbalanced node until level h(id) is reached.
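A minimal sketch of the dummy-edge construction follows, under the same assumptions as before: a child-to-parent map, height measured as edges from the root, and hypothetical names (`upper_bound_balanced`, `toy_parent`, and the `dummy(item,level)` labelling scheme are illustrative only).

```python
def depth(item, parent):
    """Number of edges from the root to `item` (the root has depth 0)."""
    d = 0
    while item in parent:
        item = parent[item]
        d += 1
    return d


def upper_bound_balanced(pattern, parent):
    """Push every item down to the deepest level by chaining dummy nodes."""
    depths = {item: depth(item, parent) for item in pattern}
    deepest = max(depths.values())  # h(id): level of the deepest item
    extended = dict(parent)         # copy, so the input hierarchy is untouched
    balanced = []
    for item in pattern:
        node = item
        # Add one dummy edge per missing level below the item.
        for level in range(depths[item] + 1, deepest + 1):
            dummy = f"dummy({item},{level})"
            extended[dummy] = node
            node = dummy
        balanced.append(node)
    return balanced, extended


# Hypothetical toy hierarchy: i1 and i2 are at level 2, i3 is at level 1.
toy_parent = {"p12": "root", "i3": "root", "i1": "p12", "i2": "p12"}
balanced, extended = upper_bound_balanced(["i1", "i2", "i3"], toy_parent)
print(balanced)  # ['i1', 'i2', 'dummy(i3,2)']
```

Unlike the lower-bound construction, no items are merged here, so the balanced pattern keeps the same number of items as the original.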

Figure: Unbalanced and corresponding lower-bound and upper-bound balanced frequent patterns. The set {i1, i2} in (b) denotes the parent of items i1 and i2 at level 1.

Outline

Computing diversity

Generalization is missing for certain items. The diversity of a balanced frequent pattern depends on the Merging Factor (MF) and the Level Factor. The contribution of MF has to be considered at every level, as some items are already at a generalized level. The notion of an Adjustment Factor (AF) is used to calculate the contribution of MF at each level.

Example of Adjustment Factor (AF)

- Y = {whole milk, pepsi, coke, shampoo}
- Y' = {whole milk, node2, node3, node4}
- Height of the hierarchy = 5
- Height of the item which lies deepest = 5
- Number of edges in Y at level 4: |EUFP(Y, 4)| = 1
- Number of edges in the upper-bound balanced pattern of Y at level 4: |EUBFP(Y, 4)| = 4
- AF(Y, 4) = 1/4 = 0.25
- Similarly, AF(Y, 3) = 3/4 = 0.75

Figure (concept hierarchy, node labels as extracted): root Drink Beauty milk Soft Drink hair Original Cola shampoo fat coke pepsi node1 whole milk node2 node3 node4
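The arithmetic in this example can be reproduced with a small Python helper. The formula used here, AF(Y, l) = |EUFP(Y, l)| / |EUBFP(Y, l)|, is inferred from the level-4 numbers on the slide and is an assumption. The function name `adjustment_factor` is hypothetical, and the level-3 edge counts (3 real edges out of 4) are assumed from the stated result of 3/4.

```python
from fractions import Fraction


def adjustment_factor(real_edges, balanced_edges):
    """Ratio of real edges to all edges (real plus dummy) at one level.

    real_edges     -- edges of the unbalanced pattern at that level
    balanced_edges -- edges of the upper-bound balanced pattern at that level
    """
    if balanced_edges <= 0:
        raise ValueError("the balanced pattern must have at least one edge")
    if not 0 <= real_edges <= balanced_edges:
        raise ValueError("real edges cannot exceed balanced edges")
    return Fraction(real_edges, balanced_edges)


af_level4 = adjustment_factor(1, 4)  # counts given on the slide
af_level3 = adjustment_factor(3, 4)  # counts assumed from the stated 3/4
print(float(af_level4), float(af_level3))  # 0.25 0.75
```

Using `Fraction` keeps the ratio exact until it is converted for display, which avoids rounding noise if the factors are later combined across levels.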

DiverseRank(DRB)

Outline

Algorithm

Outline

Experimental Analysis

Generating Concept hierarchy

Number of diverse-frequent patterns vs. minDiv

- As minDiv increases, the number of diverse-frequent patterns decreases, irrespective of the minSup threshold.
- This is because the items in several frequent patterns belong to one or a few categories.

Top 10 frequent patterns w.r.t. support

- Extracted the top 10 frequent patterns of size 3 w.r.t. support.
- The highest support for a pattern is 2.3%.

Top 10 diverse-frequent patterns

- Extracted the top 10 diverse-frequent patterns of size 3.
- The highest DiverseRank value is 1.

Experimental Observations

Experimental Results

Experimental Results

Height of the simulated concept hierarchy: 14. The distribution of items in the simulated concept hierarchy is as follows:

Experimental Results

Outline

Related Work

Concept Hierarchies in Data Mining

Related Work

Outline

Conclusions and Future Work

References
