Machine Learning with WEKA

1 Machine Learning with WEKA

2 WEKA: the bird
Copyright: Martin Kramer (mkramer@wxs.nl)
11/18/2018 University of Waikato

3 WEKA: the software
- Machine learning/data mining software written in Java (distributed under the GNU General Public License)
- Used for research, education, and applications
- Complements “Data Mining” by Witten & Frank
- Main features:
  - Comprehensive set of data pre-processing tools, learning algorithms and evaluation methods
  - Graphical user interfaces (incl. data visualization)
  - Environment for comparing learning algorithms

4 WEKA: versions
There are several versions of WEKA:
- WEKA 3.0: “book version”, compatible with the description in the data mining book
- WEKA 3.2: “GUI version”, adds graphical user interfaces (the book version is command-line only)
- WEKA 3.3: “development version”, with lots of improvements
This talk is based on the latest snapshot of WEKA 3.3 (soon to be WEKA 3.4)

5 WEKA only deals with “flat” files
Flat file in ARFF format:

@relation heart-disease-simplified

@attribute age numeric
@attribute sex { female, male}
@attribute chest_pain_type { typ_angina, asympt, non_anginal, atyp_angina}
@attribute cholesterol numeric
@attribute exercise_induced_angina { no, yes}
@attribute class { present, not_present}

@data
63,male,typ_angina,233,no,not_present
67,male,asympt,286,yes,present
67,male,asympt,229,yes,present
38,female,non_anginal,?,no,not_present
...

6 WEKA only deals with “flat” files
The ARFF file from the previous slide, shown again with callouts for the two attribute types:
- numeric attribute: declared with the keyword numeric (age, cholesterol)
- nominal attribute: declared with a list of allowed values in braces (sex, chest_pain_type, exercise_induced_angina, class)
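
The ARFF layout above can be read with a few lines of code. The following is a minimal sketch in Python, not WEKA's own loader: the function name `parse_arff` is hypothetical, and it only handles the subset of ARFF used on the slide (numeric and nominal attributes, `?` for a missing value, `%` comments).

```python
# Sketch only: a tiny reader for the ARFF subset shown on the slide.
# parse_arff is a hypothetical helper, not part of WEKA.
ARFF_TEXT = """\
@relation heart-disease-simplified
@attribute age numeric
@attribute sex { female, male}
@attribute chest_pain_type { typ_angina, asympt, non_anginal, atyp_angina}
@attribute cholesterol numeric
@attribute exercise_induced_angina { no, yes}
@attribute class { present, not_present}
@data
63,male,typ_angina,233,no,not_present
67,male,asympt,286,yes,present
67,male,asympt,229,yes,present
38,female,non_anginal,?,no,not_present
"""


def parse_arff(text):
    """Return (relation, attributes, rows) for a simple ARFF string."""
    relation = None
    attributes = []  # (name, "numeric") or (name, [allowed nominal values])
    rows = []
    in_data = False
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("%"):
            continue  # skip blank lines and comments
        lower = line.lower()
        if in_data:
            values = [v.strip() for v in line.split(",")]
            row = {}
            for (name, kind), value in zip(attributes, values):
                if value == "?":
                    row[name] = None  # missing value
                elif kind == "numeric":
                    row[name] = float(value)
                else:
                    row[name] = value  # nominal value kept as a string
            rows.append(row)
        elif lower.startswith("@relation"):
            relation = line.split(None, 1)[1]
        elif lower.startswith("@attribute"):
            _, name, spec = line.split(None, 2)
            if spec.startswith("{"):
                kind = [v.strip() for v in spec.strip("{}").split(",")]
            else:
                kind = spec.lower()
            attributes.append((name, kind))
        elif lower.startswith("@data"):
            in_data = True
    return relation, attributes, rows


relation, attributes, rows = parse_arff(ARFF_TEXT)
```

After the call, `attributes` pairs each attribute name with either the string `"numeric"` or its list of nominal values, and the missing cholesterol value in the last row is stored as `None`.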

10 Explorer: pre-processing the data
- Data can be imported from a file in various formats: ARFF, CSV, C4.5, binary
- Data can also be read from a URL or from an SQL database (using JDBC)
- Pre-processing tools in WEKA are called “filters”
- WEKA contains filters for discretization, normalization, resampling, attribute selection, transforming and combining attributes, …
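
To make two of the filter types concrete, here is a minimal Python sketch of normalization and discretization on one numeric attribute. It is an illustration of the idea only, not WEKA's filter code: the function names are hypothetical, normalization is assumed to mean linear rescaling to [0, 1], and discretization is assumed to mean equal-width binning.

```python
# Sketch only: what a normalization filter and a discretization filter do
# to one numeric attribute. Not WEKA's implementation; names are hypothetical.

def normalize(values):
    """Rescale numeric values linearly to [0, 1]; None (missing) stays None."""
    present = [v for v in values if v is not None]
    lo, hi = min(present), max(present)
    span = hi - lo
    return [
        None if v is None else (0.0 if span == 0 else (v - lo) / span)
        for v in values
    ]


def discretize(values, bins):
    """Equal-width binning: map each value to a bin index in 0..bins-1."""
    present = [v for v in values if v is not None]
    lo, hi = min(present), max(present)
    width = (hi - lo) / bins
    result = []
    for v in values:
        if v is None:
            result.append(None)  # missing values are passed through
        elif width == 0:
            result.append(0)  # all values identical: a single bin
        else:
            # The maximum value would land in bin index `bins`; clamp it.
            result.append(min(int((v - lo) / width), bins - 1))
    return result


# Cholesterol column from the ARFF example, with its missing value.
cholesterol = [233.0, 286.0, 229.0, None]
normalized = normalize(cholesterol)
binned = discretize(cholesterol, bins=3)
```

Both functions leave missing values untouched, which is one possible design choice; a real filter may treat them differently.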

32 Explorer: building “classifiers”
- Classifiers in WEKA are models for predicting nominal or numeric quantities
- Implemented learning schemes include decision trees and lists, instance-based classifiers, support vector machines, multi-layer perceptrons, logistic regression, Bayes’ nets, …
- “Meta”-classifiers include bagging, boosting, stacking, error-correcting output codes, locally weighted learning, …
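
As an illustration of one of the listed scheme families, the sketch below shows an instance-based (k-nearest-neighbour) classifier predicting a nominal class. It is a plain Python example, not WEKA code: `knn_predict` is a hypothetical name, the training set is just the three complete rows of the ARFF example, and the features are deliberately left unscaled.

```python
# Sketch only: a k-nearest-neighbour classifier as an example of an
# instance-based learning scheme. Not WEKA's implementation.
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Return the majority class among the k training instances nearest to query.

    train is a list of (feature_tuple, label) pairs; distance is Euclidean.
    """
    by_distance = sorted(train, key=lambda item: math.dist(item[0], query))
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


# (age, cholesterol) -> class, taken from the complete rows of the ARFF example.
train = [
    ((63.0, 233.0), "not_present"),
    ((67.0, 286.0), "present"),
    ((67.0, 229.0), "present"),
]

prediction = knn_predict(train, (66.0, 280.0), k=1)
```

Because the features are not normalized here, cholesterol dominates the distance; in practice a normalization filter would usually be applied first.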

92 Explorer: clustering data
- WEKA contains “clusterers” for finding groups of similar instances in a dataset
- Implemented schemes are: k-Means, EM, Cobweb, X-means, FarthestFirst
- Clusters can be visualized and compared to “true” clusters (if given)
- Evaluation is based on log-likelihood if the clustering scheme produces a probability distribution
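
To show what finding groups of similar instances means for the first scheme in the list, here is a small k-means sketch in Python. It is not WEKA's clusterer: the function name, the toy points, and the deterministic initialisation (the first k points become the initial centroids) are all assumptions made for the example.

```python
# Sketch only: plain k-means on 2-d points. Not WEKA's implementation;
# initial centroids are simply the first k points so the run is repeatable.
import math


def k_means(points, k, iterations=10):
    """Return (centroids, assignments) after at most `iterations` rounds."""
    centroids = [points[i] for i in range(k)]
    for _ in range(iterations):
        # Assignment step: put every point in the cluster of its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: math.dist(p, centroids[c]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c, members in enumerate(clusters):
            if members:
                mean = tuple(sum(coord) / len(members) for coord in zip(*members))
                new_centroids.append(mean)
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster in place
        if new_centroids == centroids:
            break  # converged
        centroids = new_centroids
    assignments = [
        min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points
    ]
    return centroids, assignments


# Made-up data: two visually obvious groups.
points = [(1.0, 1.0), (9.0, 9.0), (1.0, 2.0), (2.0, 1.0), (8.0, 9.0), (9.0, 8.0)]
centroids, assignments = k_means(points, k=2)
```

On this toy data the points near (1, 1) end up in one cluster and the points near (9, 9) in the other.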

108 Explorer: finding associations
- WEKA contains an implementation of the Apriori algorithm for learning association rules
  - Works only with discrete data
- Can identify statistical dependencies between groups of attributes:
  - milk, butter ⇒ bread, eggs (with confidence 0.9 and support 2000)
- Apriori can compute all rules that have a given minimum support and exceed a given confidence
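
The two numbers attached to a rule can be computed directly from the data. The Python sketch below shows how support and confidence are obtained for the milk and butter rule on a made-up set of five transactions; it is not an implementation of Apriori itself, and the function names and transactions are assumptions for illustration.

```python
# Sketch only: support and confidence of one association rule.
# This is not Apriori; it only evaluates a rule that is already given.

def support(transactions, itemset):
    """Number of transactions that contain every item in itemset."""
    items = set(itemset)
    return sum(1 for t in transactions if items <= set(t))


def confidence(transactions, antecedent, consequent):
    """support(antecedent and consequent together) / support(antecedent)."""
    both = support(transactions, set(antecedent) | set(consequent))
    base = support(transactions, antecedent)
    return both / base if base else 0.0


# Made-up market-basket data.
transactions = [
    {"milk", "butter", "bread", "eggs"},
    {"milk", "butter", "bread", "eggs"},
    {"milk", "butter", "bread"},
    {"milk", "bread"},
    {"butter", "eggs"},
]

rule_support = support(transactions, {"milk", "butter", "bread", "eggs"})
rule_confidence = confidence(transactions, {"milk", "butter"}, {"bread", "eggs"})
```

Here three transactions contain milk and butter, and two of those also contain bread and eggs, so the rule has support 2 and confidence 2/3. Apriori's job is to find all rules whose support and confidence clear the chosen thresholds without checking every possible itemset.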

116 Explorer: attribute selection
- Panel that can be used to investigate which (subsets of) attributes are the most predictive ones
- Attribute selection methods contain two parts:
  - A search method: best-first, forward selection, random, exhaustive, genetic algorithm, ranking
  - An evaluation method: correlation-based, wrapper, information gain, chi-squared, …
- Very flexible: WEKA allows (almost) arbitrary combinations of these two
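
One concrete pairing of the two parts is information gain as the evaluation method with ranking as the search method. The Python sketch below applies that pairing to the nominal attributes of the four ARFF example rows; it is an illustration under those assumptions, not WEKA's code, and all function names are hypothetical.

```python
# Sketch only: attribute selection with information gain (evaluation)
# and ranking (search). Not WEKA's implementation.
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(rows, attribute, target):
    """Reduction in class entropy obtained by splitting on a nominal attribute."""
    gain = entropy([r[target] for r in rows])
    for value in set(r[attribute] for r in rows):
        subset = [r[target] for r in rows if r[attribute] == value]
        gain -= len(subset) / len(rows) * entropy(subset)
    return gain


def rank_attributes(rows, attributes, target):
    """Return (gain, attribute) pairs sorted from most to least predictive."""
    scored = [(information_gain(rows, a, target), a) for a in attributes]
    return sorted(scored, reverse=True)


# Nominal columns of the four rows in the ARFF example.
rows = [
    {"sex": "male", "chest_pain_type": "typ_angina",
     "exercise_induced_angina": "no", "class": "not_present"},
    {"sex": "male", "chest_pain_type": "asympt",
     "exercise_induced_angina": "yes", "class": "present"},
    {"sex": "male", "chest_pain_type": "asympt",
     "exercise_induced_angina": "yes", "class": "present"},
    {"sex": "female", "chest_pain_type": "non_anginal",
     "exercise_induced_angina": "no", "class": "not_present"},
]

ranking = rank_attributes(
    rows, ["sex", "chest_pain_type", "exercise_induced_angina"], "class"
)
```

With only four rows the scores mean little, but the ranking shows the mechanism: attributes that split the classes cleanly score highest, and `sex` comes last.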

125 Explorer: data visualization
- Visualization is very useful in practice: e.g. it helps to determine the difficulty of the learning problem
- WEKA can visualize single attributes (1-d) and pairs of attributes (2-d)
  - To do: rotating 3-d visualizations (Xgobi-style)
- Color-coded class values
- “Jitter” option to deal with nominal attributes (and to detect “hidden” data points)
- “Zoom-in” function
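
The idea behind jitter can be sketched in a few lines of Python: when several instances share the same pair of nominal values they are drawn on top of each other, and a small random offset makes each one visible. This is an illustration of the general technique, not WEKA's plotting code; the function name, the uniform noise, and the fixed seed are assumptions.

```python
# Sketch only: jitter for a 2-d scatter plot of nominal (integer-coded) values.
# Not WEKA's implementation; uniform noise and the seed are assumptions.
import random


def jitter(points, amount, seed=0):
    """Shift every (x, y) point by independent uniform noise in [-amount, amount]."""
    rng = random.Random(seed)  # fixed seed so the plot is reproducible
    return [
        (x + rng.uniform(-amount, amount), y + rng.uniform(-amount, amount))
        for x, y in points
    ]


# Three instances hide behind a single plotted position (0, 1).
points = [(0, 1), (0, 1), (0, 1), (1, 0)]
jittered = jitter(points, amount=0.1)

distinct_before = len(set(points))
distinct_after = len(set(jittered))
```

Before jittering only two distinct positions would be drawn for four instances; afterwards every instance has its own position while staying close to its original coordinates.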

138 Performing experiments
- Experimenter makes it easy to compare the performance of different learning schemes
- For classification and regression problems
- Results can be written into a file or database
- Evaluation options: cross-validation, learning curve, hold-out
- Can also iterate over different parameter settings
- Significance-testing built in!
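
Two of the ingredients named above, cross-validation and significance testing, can be sketched in Python as follows. This is not the Experimenter's code: the function names are hypothetical, the per-fold accuracies are made-up numbers, and a plain paired t statistic is assumed as the test, which may differ from the test WEKA actually applies.

```python
# Sketch only: k-fold index generation and a plain paired t statistic.
# Not the Experimenter's implementation; the accuracies below are made up.
import math
import statistics


def k_fold_indices(n, k):
    """Yield (train_indices, test_indices) for each of k folds over n instances."""
    folds = [list(range(i, n, k)) for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [j for fold in folds[:i] + folds[i + 1:] for j in fold]
        yield train, test


def paired_t_statistic(scores_a, scores_b):
    """t = mean(d) / (stdev(d) / sqrt(n)) for the per-fold differences d.

    Raises ZeroDivisionError if all differences are identical.
    """
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    mean = statistics.mean(diffs)
    sd = statistics.stdev(diffs)
    return mean / (sd / math.sqrt(len(diffs)))


splits = list(k_fold_indices(10, 5))

# Hypothetical per-fold accuracies of two learning schemes on the same folds.
scheme_a = [0.80, 0.82, 0.78, 0.81, 0.79]
scheme_b = [0.75, 0.78, 0.76, 0.77, 0.74]
t_value = paired_t_statistic(scheme_a, scheme_b)
```

Each instance appears in exactly one test fold, and both schemes must be scored on the same folds for the paired comparison to be meaningful. The resulting t value is then compared against a t distribution with n - 1 degrees of freedom.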

152 The Knowledge Flow GUI
- New graphical user interface for WEKA
- Java-Beans-based interface for setting up and running machine learning experiments
- Data sources, classifiers, etc. are beans and can be connected graphically
- Data “flows” through components: e.g., “data source” -> “filter” -> “classifier” -> “evaluator”
- Layouts can be saved and loaded again later
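
The data source -> filter -> classifier -> evaluator chain can be mimicked with ordinary functions. The Python sketch below is only an analogy for the flow of data between components, not the Knowledge Flow API: every name is hypothetical, the classifier is a trivial majority-class model, and the evaluator scores it on the training data.

```python
# Sketch only: a function pipeline mirroring
# "data source" -> "filter" -> "classifier" -> "evaluator".
# All names are hypothetical; this is not the Knowledge Flow API.
from collections import Counter


def data_source():
    """(cholesterol, class) pairs from the ARFF example; None marks a missing value."""
    return [(233, "not_present"), (286, "present"),
            (229, "present"), (None, "not_present")]


def drop_missing(instances):
    """Filter: remove instances whose attribute value is missing."""
    return [inst for inst in instances if inst[0] is not None]


def train_majority_classifier(instances):
    """Classifier: always predict the most frequent class seen in training."""
    majority = Counter(label for _, label in instances).most_common(1)[0][0]
    return (lambda value: majority), instances


def evaluate(model_and_data):
    """Evaluator: accuracy of the model on the instances it was trained on."""
    model, instances = model_and_data
    correct = sum(1 for value, label in instances if model(value) == label)
    return correct / len(instances)


def run_flow(source, *components):
    """Pass the output of each component to the next one, left to right."""
    data = source()
    for component in components:
        data = component(data)
    return data


accuracy = run_flow(data_source, drop_missing, train_majority_classifier, evaluate)
```

Swapping a component, for example a different filter, only changes one argument of `run_flow`, which is the property the graphical layout exposes. Accuracy measured on the training data is optimistic; a real flow would evaluate on held-out data.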

162 Can continue this...

173 Conclusion: try it yourself!
- WEKA is available at [URL missing]
  - The site also has a list of projects based on WEKA
- WEKA contributors: Abdelaziz Mahoui, Alexander K. Seewald, Ashraf M. Kibriya, Bernhard Pfahringer, Brent Martin, Peter Flach, Eibe Frank, Gabi Schmidberger, Ian H. Witten, J. Lindgren, Janice Boughton, Jason Wells, Len Trigg, Lucio de Souza Coelho, Malcolm Ware, Mark Hall, Remco Bouckaert, Richard Kirkby, Shane Butler, Shane Legg, Stuart Inglis, Sylvain Roy, Tony Voyle, Xin Xu, Yong Wang, Zhihai Wang

