Preprocessing – Classifying a Supervised Image Dataset

Get the supervised dataset (containing the image features and their labels).
Validate the data: check its size and shape, erroneous records, premature EOF, null values, object types, values outside their enumerated range, and business-logic rules.
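
A few of these checks can be sketched with NumPy. This is only an illustration: the function name validate_dataset, the assumption that images arrive as a single numeric array with pixel values in 0..255, and the assumption that labels are integers in 0..num_classes-1 are not from the original text. Checks such as premature EOF and business logic depend on how and why the data is stored, so they are left out.

```python
import numpy as np

def validate_dataset(images, labels, num_classes):
    """Return a list of problems found; an empty list means all checks passed.

    Assumes (hypothetically) that `images` and `labels` are NumPy arrays,
    pixel values lie in 0..255, and labels are integers in 0..num_classes-1.
    """
    problems = []

    # Size checks: non-empty, and one label per image.
    if images.shape[0] == 0:
        problems.append("dataset is empty")
    if images.shape[0] != labels.shape[0]:
        problems.append(f"{images.shape[0]} images but {labels.shape[0]} labels")

    # Shape check: (N, H, W) or (N, H, W, C).
    if images.ndim not in (3, 4):
        problems.append(f"unexpected image array shape {images.shape}")

    # Type, null (NaN) and range checks on the pixel values.
    if not np.issubdtype(images.dtype, np.number):
        problems.append(f"non-numeric image dtype {images.dtype}")
    elif images.size > 0:
        if np.isnan(images).any():
            problems.append("images contain NaN values")
        elif images.min() < 0 or images.max() > 255:
            problems.append("pixel values fall outside 0..255")

    # Enumerated-range check on the labels.
    if not np.issubdtype(labels.dtype, np.integer):
        problems.append(f"non-integer label dtype {labels.dtype}")
    elif labels.size > 0 and (labels.min() < 0 or labels.max() >= num_classes):
        problems.append("labels fall outside 0..num_classes-1")

    return problems
```

Returning a list of messages rather than raising on the first failure lets you see every problem in one pass.
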
Visualize the data by plotting it.
Create a mapping function that sends each image through the following preprocessing functions:
1. Convert the image to grayscale (you can use OpenCV's cv2.cvtColor with cv2.COLOR_BGR2GRAY).
2. Equalize the image histogram so that pixel intensities are spread more evenly across the full range (0 to 255), which improves the contrast (you can use cv2.equalizeHist).
3. Normalize the image, for example by dividing the pixel values by 255 so that they fall between 0 and 1.
Split the data into training, validation, and test sets (it’s very important that the training set contains instances of every possible class; otherwise the ANN will not be able to learn them).
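
One way to guarantee that every class reaches the training set is to split each class separately (a stratified split). The sketch below uses only NumPy. The function name stratified_split and the 15% validation and test fractions are assumptions chosen for illustration, not values from the original text.

```python
import numpy as np

def stratified_split(labels, val_fraction=0.15, test_fraction=0.15, seed=0):
    """Return (train_idx, val_idx, test_idx) index arrays.

    Each class is shuffled and split on its own. As long as the two
    fractions sum to less than 1, every class keeps at least one sample
    in the training set.
    """
    labels = np.asarray(labels)
    rng = np.random.default_rng(seed)
    train_idx, val_idx, test_idx = [], [], []

    for cls in np.unique(labels):
        idx = rng.permutation(np.flatnonzero(labels == cls))
        n_val = int(len(idx) * val_fraction)
        n_test = int(len(idx) * test_fraction)
        val_idx.extend(idx[:n_val])
        test_idx.extend(idx[n_val:n_val + n_test])
        train_idx.extend(idx[n_val + n_test:])   # the remainder is never empty

    return (np.array(train_idx, dtype=int),
            np.array(val_idx, dtype=int),
            np.array(test_idx, dtype=int))
```

The returned indices can be applied to both arrays, for example images[train_idx] and labels[train_idx], so features and labels stay aligned.
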
Visualize a random preprocessed image by plotting it.
Reshape the images of each split dataset by adding a trailing depth (channel) dimension of size 1 for ANN processing.
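
For grayscale images stored as an (N, H, W) array, this reshape can be sketched as follows. The helper name add_channel_axis is illustrative.

```python
import numpy as np

def add_channel_axis(images):
    """Turn an (N, H, W) array into (N, H, W, 1) without copying the data."""
    return images.reshape(images.shape + (1,))

# Example: ten 32x32 grayscale images.
batch = np.zeros((10, 32, 32), dtype=np.float32)
print(add_channel_axis(batch).shape)  # (10, 32, 32, 1)
```

Apply the same call to the training, validation, and test arrays so that all three share the same layout.
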
One-hot encode the class labels of each split dataset (you can use keras.utils.to_categorical(label_set, number_of_classes_in_label_set)).
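
To show what the encoding produces without depending on Keras, here is a NumPy sketch of the same idea. The function one_hot is an illustrative stand-in, not the Keras implementation, and it assumes integer labels in 0..num_classes-1.

```python
import numpy as np

def one_hot(labels, num_classes):
    """Map integer labels of shape (N,) to a one-hot array of shape (N, num_classes)."""
    labels = np.asarray(labels, dtype=int)
    # Row i of the identity matrix is the one-hot vector for class i.
    return np.eye(num_classes, dtype=np.float32)[labels]

print(one_hot([0, 2, 1], 3))
# [[1. 0. 0.]
#  [0. 0. 1.]
#  [0. 1. 0.]]
```

Pass the same number of classes for the training, validation, and test labels, even if a class happens to be missing from one of the smaller sets, so that all three encodings have the same width.
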

יאיר שנער