Classification for ML and AI

  • A classifier is a method for determining the most likely class of an unknown object or event, based on a number of example instances of each class (also known as the training set).

 

  • There are two main ways to determine this: either by using prior knowledge of what the output values for our samples should be (supervised learning), or by inferring the intrinsic structure present within a dataset (unsupervised learning).

 

  • The first step in the classification process is feature extraction, where each instance in the training set is expressed as a vector of measurements. Features may be integer, real-valued, or categorical.

 

  • The space spanned by all possible combinations of features is referred to as the feature space.

 

  • Once features have been extracted, there are three common cases for the output:

 

The first two cases use supervised learning.

1. Classification (categorical responses): in the first case, the actual class of each instance in the training set is made available to the classifier, and the output will be one of these classes (categories).

Sometimes an instance has been given an incorrect class, and this is called a labelling error.
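
To make the classification case concrete, here is a minimal sketch using a k-nearest-neighbours vote. This is only one of many possible classifiers and is chosen here for brevity; the training data, the feature meaning (height and weight), and the names TRAINING_SET and classify are all hypothetical.

```python
import math
from collections import Counter

# Hypothetical training set: (feature vector, class label) pairs.
# The two features are assumed to be [height_cm, weight_kg].
TRAINING_SET = [
    ([150.0, 50.0], "small"),
    ([155.0, 55.0], "small"),
    ([160.0, 58.0], "small"),
    ([180.0, 85.0], "large"),
    ([185.0, 90.0], "large"),
    ([190.0, 95.0], "large"),
]


def classify(features, training_set=TRAINING_SET, k=3):
    """Return the majority class among the k nearest training instances."""
    # Sort labelled instances by Euclidean distance in feature space.
    by_distance = sorted(
        training_set, key=lambda item: math.dist(features, item[0])
    )
    # Let the k closest instances vote on the class.
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    print(classify([152.0, 52.0]))
    print(classify([188.0, 92.0]))
```

Because the prediction is a majority vote over several neighbours, a single labelling error in the training set is less likely to change the result than it would be with k=1.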

2. Regression (continuous responses): in the second case, the training set contains continuous attribute values, which are used to find correlations among them. The output is then calculated from one of these correlations.
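
As an illustration of the regression case, the sketch below fits a straight line to a small, made-up training set using ordinary least squares and then predicts a continuous value for a new input. The function names fit_line and predict, and the data, are assumptions for this example only; real problems often need more features or a different model.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Spread of x, and how x and y vary together.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(x, slope, intercept):
    """Return the continuous response predicted for input x."""
    return slope * x + intercept


if __name__ == "__main__":
    # Hypothetical training set that follows y = 2x + 1 exactly.
    xs = [1.0, 2.0, 3.0, 4.0]
    ys = [3.0, 5.0, 7.0, 9.0]
    slope, intercept = fit_line(xs, ys)
    print(slope, intercept, predict(5.0, slope, intercept))
```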

 

The third case uses unsupervised learning.

3. Clustering (grouped responses): in the third case, the class information is not available (the data in the training set is unlabeled), and there is no straightforward way to evaluate the accuracy of the output. The method therefore tries to find patterns or intrinsic attributes in the training set and provides the output as a clustering based on these findings.
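
The clustering case can be sketched with a small k-means loop, one common clustering technique among several. The points, the function name k_means, and the choice of the first k points as initial centroids are all assumptions made to keep the example short and deterministic; practical implementations usually pick starting centroids more carefully.

```python
import math


def k_means(points, k, iterations=10):
    """Group unlabeled points into k clusters; return (assignments, centroids)."""
    # Assumption: the first k points serve as the initial centroids.
    centroids = [list(p) for p in points[:k]]
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Assign every point to its nearest centroid.
        assignments = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Move each centroid to the mean of the points assigned to it.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return assignments, centroids


if __name__ == "__main__":
    # Hypothetical unlabeled data forming two visible groups.
    points = [(1.0, 1.0), (8.0, 8.0), (1.5, 2.0),
              (9.0, 9.5), (1.0, 0.5), (8.5, 9.0)]
    assignments, centroids = k_means(points, k=2)
    print(assignments)
    print(centroids)
```

Note that the cluster indices carry no meaning of their own: without labels, the output only says which points belong together, not what each group represents.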

 

Finally, once the classifier has been trained on the training set, we can feed it our input data and receive the predicted outcome from whichever of the methods described above applies.