Cross Entropy – ML

The cross-entropy measure is the sum of the natural logs of each data point’s predicted probability for its given (or calculated) label, divided by the number of points and multiplied by -1 (so it comes out as a positive number).

Its formula, given N points, is simple:

$$J(\mathbf{w}) \;=\; \frac{1}{N}\sum_{n=1}^{N} H(p_n, q_n) \;=\; -\frac{1}{N}\sum_{n=1}^{N}\Big[\, y_n \log \hat{y}_n + (1 - y_n)\log(1 - \hat{y}_n) \,\Big]$$

(When y_n is 1 only the first term of the formula contributes; when y_n is 0 only the second term contributes.)
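As a small sketch of the formula above (not tied to any particular library; the function name and the eps clipping are illustrative choices), it can be computed with NumPy like this:

```python
import numpy as np

def binary_cross_entropy(y_true, y_pred, eps=1e-12):
    """Average binary cross-entropy J(w) over N points.

    y_true: 0/1 labels; y_pred: predicted probabilities in (0, 1).
    eps clips predictions away from 0 and 1 so the logs stay finite.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.clip(np.asarray(y_pred, dtype=float), eps, 1 - eps)
    return -np.mean(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred))

# Two confident-correct predictions and one badly wrong one:
print(binary_cross_entropy([1, 0, 1], [0.9, 0.1, 0.2]))  # ≈ 0.61
```

Note how the single wrong, confident prediction (0.2 for a true label of 1) dominates the average, which is exactly the behaviour the log terms are there to produce.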

Cross-entropy can be used as an error measure when a network’s outputs can be interpreted as the probabilities that each hypothesis is true.

The error measure (cross-entropy) indicates the distance between what the network believes this distribution should be, and what the actual value says it should be.

The cross-entropy measure has been used as an alternative to squared error (SE) / mean squared error (MSE) in order to determine which model better describes our labeled data points.

It differs from MSE/SE in that it still allows the error to update the weights even when a node’s activation derivative is asymptotically close to 0, as the short derivation below shows.
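To make that concrete, here is the usual comparison for a single sigmoid output $\hat{y} = \sigma(z)$ (a standard derivation, stated here only as a sketch):

$$\frac{\partial L_{\mathrm{MSE}}}{\partial z} = (\hat{y} - y)\,\hat{y}(1-\hat{y}), \qquad \frac{\partial L_{\mathrm{CE}}}{\partial z} = \hat{y} - y$$

The $\sigma'(z) = \hat{y}(1-\hat{y})$ factor in the MSE gradient goes to 0 as the output saturates near 0 or 1, while the cross-entropy gradient stays proportional to the error itself.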

 

more info:

Entropy is the amount of information in a transmitted message. Shannon entropy is expressed as:
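$$H(p) = -\sum_{x} p(x)\,\log p(x)$$

(the standard form for a discrete distribution $p$; the base of the log only sets the unit, bits for base 2, nats for base $e$).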

Cross-entropy formula:
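$$H(p, q) = -\sum_{x} p(x)\,\log q(x)$$

where $p$ is the true distribution and $q$ is the estimated one.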

The Kullback–Leibler divergence is the difference between the cross entropy and the entropy.
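In symbols, with the definitions above:

$$D_{\mathrm{KL}}(p \,\|\, q) = H(p, q) - H(p) = \sum_{x} p(x)\,\log\frac{p(x)}{q(x)}$$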