In this lab you will learn to use Weka with the iris.arff dataset, which you will find at: ~/weka-3-6/data/Iris.arff.
This is a small, very famous dataset with 150 examples of flowers, each described by four continuous-valued attributes and belonging to one of three classes.
1>First, open the file Iris.arff with a text editor to discover the ARFF format (Attribute-Relation File Format):
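To make the layout of an ARFF file concrete, here is a minimal sketch in Python (not part of Weka). It assumes the usual structure: a `@RELATION` line, one `@ATTRIBUTE` line per column, then `@DATA` followed by comma-separated rows. The embedded sample shows only three of the 150 rows, and the attribute names are written as we assume they appear in the file. The function name `parse_arff` is our own, and the reader ignores quoting, sparse rows and other ARFF features.

```python
# Minimal ARFF reader: a sketch for simple files like iris, not a full parser.
# The sample mirrors the assumed layout of iris.arff (header, then data rows).
SAMPLE_ARFF = """\
% Lines starting with '%' are comments
@RELATION iris

@ATTRIBUTE sepallength REAL
@ATTRIBUTE sepalwidth REAL
@ATTRIBUTE petallength REAL
@ATTRIBUTE petalwidth REAL
@ATTRIBUTE class {Iris-setosa,Iris-versicolor,Iris-virginica}

@DATA
5.1,3.5,1.4,0.2,Iris-setosa
7.0,3.2,4.7,1.4,Iris-versicolor
6.3,3.3,6.0,2.5,Iris-virginica
"""


def parse_arff(text):
    """Return (relation, attributes, rows) for a simple, unquoted ARFF text."""
    relation, attributes, rows = None, [], []
    in_data = False
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("%"):
            continue  # skip blank lines and comments
        lower = line.lower()
        if in_data:
            rows.append(line.split(","))  # one example per line
        elif lower.startswith("@relation"):
            relation = line.split(None, 1)[1]
        elif lower.startswith("@attribute"):
            _, name, kind = line.split(None, 2)
            attributes.append((name, kind))
        elif lower.startswith("@data"):
            in_data = True
    return relation, attributes, rows


relation, attributes, rows = parse_arff(SAMPLE_ARFF)
print(relation)
for name, kind in attributes:
    print(name, kind)
print(len(rows), "data rows")
```

The header declares four numeric attributes and one nominal attribute (the class), whose allowed values are listed between braces.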
2>Click on the button "O" and choose the data file ./Data/Iris.arff:
Some information about the dataset will appear in the window.
In the "Selected attribute" pane, you can get basic statistics for the selected attribute: Name, Type, Missing, Unique, Distinct, and Min/Max values.
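As a rough illustration of what these statistics mean, the sketch below computes them for one numeric column in plain Python. It assumes that "Distinct" counts the different values present and that "Unique" counts the values that occur exactly once, which is how we read the pane. The function name `attribute_summary` and the sample column are hypothetical and are not taken from iris.arff.

```python
from collections import Counter


def attribute_summary(values):
    """Summarize one numeric column; None marks a missing value."""
    present = [v for v in values if v is not None]
    counts = Counter(present)
    return {
        "missing": len(values) - len(present),
        "distinct": len(counts),                              # different values seen
        "unique": sum(1 for c in counts.values() if c == 1),  # values seen exactly once
        "min": min(present),
        "max": max(present),
    }


# Hypothetical column for illustration only (not taken from iris.arff).
sample = [5.1, 4.9, 4.7, 5.1, None, 5.0]
summary = attribute_summary(sample)
print(summary)
```

In this sample, 5.1 appears twice, so it counts as a distinct value but not as a unique one.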
Select the SepalLength attribute:
Select the SepalWidth attribute:
Select the PetalLength attribute:
Select the PetalWidth attribute:
So we have:
>>>Blue color => Iris-setosa
>>>Red color => Iris-versicolor
>>>Last color => Iris-virginica
-Click on Visualize All:
==>We note that the overlap between classes is minimal when the classification is based on a petal attribute (either PetalLength or PetalWidth).
-Weka can also preprocess the data by applying a filter to the attributes:
- Click on the Choose button > supervised > attribute > Discretize:
==>This filter discretizes continuous values, that is, it replaces each numeric attribute with a small set of intervals.
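To give an intuition for supervised discretization, the sketch below finds a single cut point on one attribute by minimizing the weighted class entropy of the two resulting intervals. This is not Weka's implementation: to our knowledge, the supervised Discretize filter defaults to the MDL-based method of Fayyad and Irani, which applies this kind of split recursively with a stopping criterion. The function names and the sample values are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def best_cut(values, labels):
    """Return the threshold minimizing the weighted class entropy of the two sides."""
    pairs = sorted(zip(values, labels))
    best_threshold, best_score = None, float("inf")
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no boundary between equal values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [label for _, label in pairs[:i]]
        right = [label for _, label in pairs[i:]]
        score = (len(left) * entropy(left) + len(right) * entropy(right)) / len(pairs)
        if score < best_score:
            best_threshold, best_score = threshold, score
    return best_threshold


# Hypothetical petal widths and classes, for illustration only.
widths = [0.2, 0.3, 0.4, 1.3, 1.5, 1.4]
classes = ["setosa", "setosa", "setosa", "versicolor", "versicolor", "versicolor"]
cut = best_cut(widths, classes)
print(cut)
```

With these values the chosen threshold falls midway between 0.4 and 1.3, so the continuous attribute becomes two intervals, each containing a single class.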
2. Data visualization:
For a first approach to classification, go to the "Visualize" tab. You will see a set of 25 plots:
1. Change the axes to find a classification that yields the smallest number of decision rules.
-With X -> PetalWidth and Y -> SepalLength, we obtain a minimal number of decision rules:
Decision rules:
-if (X>=0.1) AND (X<=0.6) then Class <-- Iris-setosa.
-if (X>=1) AND (X<=1.7) AND (Y<=5.6) then Class <-- Iris-versicolor.
-if (X>=1.8) AND (X<=2.5) then Class <-- Iris-virginica.
-if (Y>=7.1) then Class <-- Iris-virginica.
=>This is the best classification compared with the other choices of axes: the overlap between classes is minimal.
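The decision rules above can be written as a small function, which makes them easy to try on individual flowers. This is only a sketch: the function name `classify` is our own, we assume the rules are tested in the order listed with the first match winning, and we return `None` when no rule applies.

```python
def classify(petal_width, sepal_length):
    """Apply the four rules in order; X is the petal width, Y is the sepal length."""
    x, y = petal_width, sepal_length
    if 0.1 <= x <= 0.6:
        return "Iris-setosa"
    if 1.0 <= x <= 1.7 and y <= 5.6:
        return "Iris-versicolor"
    if 1.8 <= x <= 2.5:
        return "Iris-virginica"
    if y >= 7.1:
        return "Iris-virginica"
    return None  # no rule fires: the rules do not cover this point


print(classify(0.2, 5.1))  # petal width 0.2, sepal length 5.1
print(classify(1.3, 5.5))
print(classify(2.5, 6.3))
print(classify(1.4, 7.0))
```

The last call returns `None`: a flower with petal width 1.4 and sepal length 7.0 matches none of the four rules, so the rules as written leave part of the plane unclassified.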