The cross-entropy measure is computed by taking, for each data point, the natural log of the probability assigned to that point's given (or calculated) label, summing these logs over all points, dividing by the number of points, and multiplying by -1 (so that the result is a positive number).
Its formula, given N points, is:

CE = -\frac{1}{N} \sum_{n=1}^{N} \Bigl[ y^{(n)} \ln p^{(n)} + \bigl(1 - y^{(n)}\bigr) \ln\bigl(1 - p^{(n)}\bigr) \Bigr]

(When y^{(n)} is 1, the first term of the formula is calculated; when y^{(n)} is 0, the second term of the formula is calculated.)
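A minimal sketch of the formula above in Python (the function name binary_cross_entropy and the epsilon clamp are assumptions added here for numerical safety, not part of the original text):

```python
import numpy as np

def binary_cross_entropy(y, p, eps=1e-12):
    """Average binary cross-entropy for labels y (0 or 1) and predicted probabilities p."""
    y = np.asarray(y, dtype=float)
    p = np.clip(np.asarray(p, dtype=float), eps, 1 - eps)  # keep log() away from 0
    # Per point: ln(p) is used when y == 1, ln(1 - p) when y == 0.
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

# Example: score two sets of predicted probabilities against the same labels;
# the lower cross-entropy identifies the model that fits the labels better.
labels  = [1, 0, 1, 1]
model_a = [0.9, 0.1, 0.8, 0.7]
model_b = [0.6, 0.5, 0.4, 0.6]
print(binary_cross_entropy(labels, model_a))  # smaller (better fit)
print(binary_cross_entropy(labels, model_b))  # larger (worse fit)
```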
Cross-entropy can be used as an error measure when a network's outputs can be interpreted as probabilities, i.e., as a probability distribution over how likely each hypothesis is to be true. The error measure (cross-entropy) then indicates the distance between the distribution the network believes is correct and the distribution the actual labels imply.
The cross-entropy measure has been used as an alternative to the squared error (SE) / mean squared error (MSE), in order to determine which model better describes our labeled data points. It differs from MSE/SE in that it still allows errors to change the weights even when a node's derivative is asymptotically close to 0; the derivation sketched below makes this concrete.
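A short worked derivation, under the added assumption of a single sigmoid output unit p = σ(z) with target y (this setup is an illustration, not something stated above):

```latex
\documentclass{article}
\usepackage{amsmath}
\begin{document}
% Assume a single sigmoid output $p=\sigma(z)$ with $\sigma'(z)=p(1-p)$ and target $y$.
\begin{align*}
E_{\mathrm{SE}} &= \tfrac{1}{2}(p-y)^2,
& \frac{\partial E_{\mathrm{SE}}}{\partial z} &= (p-y)\,\sigma'(z) = (p-y)\,p(1-p),\\
E_{\mathrm{CE}} &= -\bigl[y\ln p + (1-y)\ln(1-p)\bigr],
& \frac{\partial E_{\mathrm{CE}}}{\partial z} &= \frac{p-y}{p(1-p)}\,p(1-p) = p-y.
\end{align*}
% With squared error the update is damped by $p(1-p)$, which is nearly 0 when the
% unit saturates; with cross-entropy that factor cancels, so the weight update
% stays proportional to the error $p-y$.
\end{document}
```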
More info:
Entropy is the amount of information in a transmitted message. The Shannon entropy of a distribution p is expressed as:

H(p) = -\sum_{i} p_i \log p_i
Cross-entropy formula, for a true distribution p and a model distribution q:

H(p, q) = -\sum_{i} p_i \log q_i
The Kullback–Leibler divergence is the difference between the cross-entropy and the entropy: D_{KL}(p \parallel q) = H(p, q) - H(p).
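A small numeric sketch of these three quantities and the identity above (the distributions p and q below are made-up examples, and natural logs are used throughout):

```python
import numpy as np

def entropy(p):
    """Shannon entropy H(p) = -sum_i p_i * log(p_i)."""
    p = np.asarray(p, dtype=float)
    return -np.sum(p * np.log(p))

def cross_entropy(p, q):
    """Cross-entropy H(p, q) = -sum_i p_i * log(q_i)."""
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    return -np.sum(p * np.log(q))

def kl_divergence(p, q):
    """Kullback-Leibler divergence D_KL(p || q) = sum_i p_i * log(p_i / q_i)."""
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    return np.sum(p * np.log(p / q))

# Made-up example distributions over three outcomes.
p = [0.5, 0.3, 0.2]   # "true" distribution
q = [0.4, 0.4, 0.2]   # model's distribution

# D_KL(p || q) equals the cross-entropy minus the entropy.
print(kl_divergence(p, q))               # ~0.025
print(cross_entropy(p, q) - entropy(p))  # same value
```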