Ant Inspired Data Mining Brandon Emerson April 22, 2013 1.


1 Ant Inspired Data Mining, Brandon Emerson, April 22, 2013

2 What is data mining? Data mining is any process that analyzes and organizes data into clear and concise formats. It can be particularly powerful when creating relationships between points of data. It is mainly used by companies with a consumer focus, specifically their marketing divisions, where it lets them draw meaningful relationships between products and consumers.

3 Applications in Physics Efficient data mining techniques can improve data storage and retrieval in experiments that require a great deal of data collection. Effective mining can help analysts develop relationships between specific points of data, and thus between physical phenomena.

4 Our Goals Use basic ideas about ant behavior to develop an effective means of data mining. Discuss recent improvements to ant clustering algorithms, and compare data mining techniques using results from simple tests.

5 A Simple Model of Ants-1 [Figure: diagram with the labels "Ant" and "Object"]

6 A Simple Model of Ants-2 [Figure labels: "Ant", "Object"] The probability of picking up an object is a function of f and a constant a, and the probability of placing an object is a function of f and a constant b, where f is the perceived fraction of objects nearby. Assuming the ant moves randomly and has enough time to explore the entire area, you could expect all of the objects to end up clustered together.
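The slide's equations for the two probabilities did not survive in this transcript. As a minimal sketch, assuming the squared-ratio form commonly used in ant clustering models (an assumption, not confirmed by the slide), the two probabilities could be written as below. The function names and the default values of a and b are hypothetical.

```python
def pick_probability(f: float, a: float = 0.1) -> float:
    """Assumed form (a / (a + f))^2: close to 1 when few objects are nearby."""
    return (a / (a + f)) ** 2


def drop_probability(f: float, b: float = 0.3) -> float:
    """Assumed form (f / (b + f))^2: close to 1 when many objects are nearby."""
    return (f / (b + f)) ** 2
```

Under this assumed form, an isolated object (f near 0) is almost always picked up and almost never put down, which is what drives objects toward existing piles.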

7 A Note on Perception f is the perceived fraction of objects nearby. For objects x and y, f(x) is now a measure of the similarity of object x to object y in the area around object x. It is defined piecewise, with one expression that applies when f > 0 and another otherwise. The slide treats two cases separately: when the objects are the same, and when the objects are different. α is a scale factor for dissimilarity.
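The expression for f(x) is missing from this transcript. A minimal sketch, assuming a neighborhood-average form in which each nearby object y contributes 1 - d(x, y)/α and negative totals are clipped to zero (this form, the Euclidean distance, and all names below are assumptions):

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, ...]


def perceived_similarity(x: Point, neighbors: Sequence[Point],
                         alpha: float, n_sites: int) -> float:
    """Assumed f(x): average of (1 - d(x, y) / alpha) over the neighborhood.

    n_sites is the number of grid sites in the neighborhood, so empty
    sites lower the result. Negative totals are clipped to 0.
    """
    total = sum(1.0 - math.dist(x, y) / alpha for y in neighbors)
    return max(0.0, total / n_sites)
```

In this sketch an identical neighbor contributes exactly 1, while a neighbor farther away than α contributes a negative amount, so α sets how dissimilar two objects may be before they stop counting as alike.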

8 The Basic Algorithm (steps 15-28 of the listing)
      end if
    else
      if (ant w/object) and (empty site) then
        compute f(x) and probability of dropping
        draw random real number R
        if (R ≤ Prob) then
          drop object
        end if
      end if
    end if
    move to randomly selected ant-free adjacent site
  end for
end for
Print location of objects
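The listing above shows only the second half of the loop. A minimal runnable sketch of the whole loop follows, under several assumptions: all objects are identical (so f is simply the fraction of the 8 neighboring sites that hold an object), the pick-up branch mirrors the drop branch shown in the listing, the grid wraps around at the edges, and the probabilities use an assumed squared-ratio form. All names and default values are hypothetical.

```python
import random


def neighborhood_fraction(grid, pos, size):
    """Fraction of the 8 surrounding sites (wrapping grid) holding an object."""
    r, c = pos
    count = 0
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if (dr, dc) != (0, 0) and grid[(r + dr) % size][(c + dc) % size]:
                count += 1
    return count / 8


def run_basic_aca(size=20, n_objects=40, n_ants=5, n_steps=20000,
                  a=0.1, b=0.3, seed=0):
    rng = random.Random(seed)
    sites = [(r, c) for r in range(size) for c in range(size)]
    grid = [[False] * size for _ in range(size)]
    for r, c in rng.sample(sites, n_objects):
        grid[r][c] = True
    ants = [{"pos": p, "carrying": False} for p in rng.sample(sites, n_ants)]

    for _ in range(n_steps):
        for ant in ants:
            r, c = ant["pos"]
            f = neighborhood_fraction(grid, ant["pos"], size)
            if not ant["carrying"] and grid[r][c]:
                # Assumed pick-up branch (not shown in the listing).
                if rng.random() <= (a / (a + f)) ** 2:
                    grid[r][c] = False
                    ant["carrying"] = True
            elif ant["carrying"] and not grid[r][c]:
                # Drop branch, as in the listing: drop if R <= Prob.
                if rng.random() <= (f / (b + f)) ** 2:
                    grid[r][c] = True
                    ant["carrying"] = False
            # Move to a randomly selected ant-free adjacent site.
            occupied = {other["pos"] for other in ants}
            options = [((r + dr) % size, (c + dc) % size)
                       for dr in (-1, 0, 1) for dc in (-1, 0, 1)
                       if (dr, dc) != (0, 0)]
            free = [p for p in options if p not in occupied]
            if free:
                ant["pos"] = rng.choice(free)
    return grid, ants
```

Calling `run_basic_aca()` returns the final grid and the ants; objects still being carried at the end are not on the grid, so the two together account for every object.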

9 Improvements-1 Granted ants “short-term memory”: the ants stored their last x locations, and after picking up data they proceed to their last remembered locations sequentially. Normalized the grid to enable efficient mining of a variety of data set sizes: grid size, step size, and number of iterations are each set in terms of N, where N is the maximum number of data items to be mined.
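The short-term memory can be sketched with a fixed-length queue. This is only an illustration: the class name, the capacity parameter, and the choice to revisit the most recent location first are assumptions, since the slide says only that remembered locations are visited sequentially.

```python
from collections import deque


class AntMemory:
    """Keeps the last `capacity` locations; older ones are forgotten."""

    def __init__(self, capacity: int):
        self.locations = deque(maxlen=capacity)

    def remember(self, location):
        self.locations.append(location)

    def recall_order(self):
        """Locations to visit after a pick-up, most recent first (assumed)."""
        return list(reversed(self.locations))
```

Using `deque(maxlen=...)` means the oldest location is discarded automatically once the memory is full.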

10 Improvements-2 α determines the percentage of items that are similar. If α is too small, clusters won't be formed. If α is too large, the clusters will combine to create one super cluster. Each ant is assigned its own value of α and is allowed to change it in the following way: the ant makes a set number of moves (100), during which it keeps track of F, the number of times it has failed to drop a data item. The rate of failure is F/100, and α is adapted according to this rate, with one update when rate > 0.99 and another when rate ≤ 0.99.
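The adaptation rule described above can be sketched as follows. The slide's update expressions are garbled in this transcript, so the step size of 0.01, the direction of each update (raise α when drops almost always fail, lower it otherwise), and the clamp to [0, 1] are all assumptions, as is the function name.

```python
def adapt_alpha(alpha: float, failed_drops: int, moves: int = 100,
                step: float = 0.01, threshold: float = 0.99) -> float:
    """Return the ant's new alpha after a window of `moves` moves."""
    rate = failed_drops / moves  # rate of failure, F / 100 on the slide
    if rate > threshold:
        alpha += step  # assumed: drops nearly always fail, so loosen alpha
    else:
        alpha -= step  # assumed: otherwise tighten alpha
    return min(1.0, max(0.0, alpha))  # assumed clamp to [0, 1]
```

For example, an ant that failed all 100 drop attempts would move from α = 0.50 to roughly 0.51 under these assumptions, while one that failed 50 would move to roughly 0.49.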

11 The Updated Algorithm (steps 11-22 of the listing)
    move_agent to new location
    I = carried_object
    compute f*(x) and prob of drop
    if drop = true then
      while pick = false do
        I = random_select_object
        compute f*(x) and prob of pick
        pick_up_object
      end while
    end if
  end for
end

12 Comparing Techniques
Iris 150       K-means   ACA
Clusters       3.000     2.960
Rand Index     0.824     0.785
F-measure      0.821     0.773
Dunn Index     2.866     2.120
Variance       0.861     4.213
Class. Err.    0.176     0.230

Best results   K-means   ACA
Clusters       3.000     (single value on the slide)
Rand Index     0.829     0.814
F-measure      0.830     0.811
Dunn Index     2.939     2.306
Variance       0.899     1.486
Class. Err.    0.167     0.187

Slide annotations on the table: "Maximize these values" and "Minimize this value".
Iris 150 is a data set from the Machine Learning repository. K-means is a standard technique for data mining, and is used here to benchmark the performance of the Ant Clustering Algorithm (ACA).
Important note: the ACA does not need to be given the correct number of clusters to proceed, whereas K-means does.
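To make one of the table's measures concrete, here is a minimal sketch of the Rand index in its standard pair-counting definition: the fraction of item pairs on which two labelings agree about whether the items belong together. Whether the cited paper computed it exactly this way is an assumption, and the function name is hypothetical.

```python
from itertools import combinations


def rand_index(labels_true, labels_pred):
    """Fraction of item pairs that both labelings treat the same way."""
    if len(labels_true) != len(labels_pred):
        raise ValueError("labelings must have the same length")
    pairs = list(combinations(range(len(labels_true)), 2))
    if not pairs:
        raise ValueError("need at least two items")
    agree = sum(
        (labels_true[i] == labels_true[j]) == (labels_pred[i] == labels_pred[j])
        for i, j in pairs
    )
    return agree / len(pairs)
```

The measure ignores the cluster names themselves, so `rand_index([0, 0, 1, 1], [1, 1, 0, 0])` is 1.0; values closer to 1 mean the clustering matches the known classes more closely.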

13 Summary Ant simulation offers a unique technique for data mining. This technique was developed using simple ideas about ant behavior. Ant Clustering Algorithms could use improvement, but as they stand they are fairly effective. As our understanding of ant behavior improves, perhaps ACA could be refined into an even more efficient tool.

14 Just to be Clear… None of the information presented, including data tables and code, is my personal work. All of the information was found in the paper below. Boryczka, Urszula. "Ant Colony Metaphor in a New Clustering Algorithm." Control and Cybernetics 39.2 (2010): 343-57. Print.

