Published by Darren Sims. Modified over 6 years ago.
1
Data Mining Workbenches: an overview & comparison focusing on open-source packages
CS240A notes by C. Zaniolo
2
Most Popular Data Mining Software
Rexer Analytics Survey (early 2007) asked about the tools used often and occasionally.
Clearly more popular than the rest were:
  SPSS or SPSS Clementine
  "Own Code"
  SAS or SAS Enterprise Miner
Followed by:
  R
  Weka
  C4.5 / C5.0
3
Critical Mass and Popularity
Top ten most used packages in the KDnuggets survey (May 2007):
  SPSS / SPSS Clementine
  Salford Systems CART/MARS/TreeNet/RF
  Yale (now RapidMiner)
  SAS / SAS Enterprise Miner
  Angoss Knowledge Studio / Knowledge Seeker
  KXEN
  Weka
  R
  Microsoft SQL Server?
  MATLAB?
Note: Microsoft Excel is omitted because it is not really "data mining" software, and I have merged the tools offered by a single vendor (SPSS and SAS).
You can see the full survey results
4
Comments from Gregory Piatetsky-Shapiro, KDnuggets Editor:
Votes from tool vendors were removed. Compared with the 2008 KDnuggets poll on data mining tools/software used, the big changes are growth in SPSS, RapidMiner, and R.
5
Popular Data Mining Software (cont.)
The Rexer Analytics Survey is conducted every year, and the summary report can be obtained for free.
2009 SURVEY HIGHLIGHTS:
  Open-source tools Weka and R moved up substantially in data miners' tool rankings this year, and are now used by large numbers of both academic and for-profit data miners.
  SAS Enterprise Miner dropped in data miners' tool rankings.
2010 SURVEY HIGHLIGHTS:
  R: After a steady rise across the past few years, R overtook other tools to become the tool used by the most data miners (43%).
  STATISTICA has also been climbing in the rankings.
  STATISTICA, IBM SPSS Modeler, and R received the strongest satisfaction ratings in both 2010 and 2009.