Cross Entropy – ML

The cross-entropy measure is the sum of the natural logs of each data point’s predicted probability for its given (or calculated) label, divided by the number of points and multiplied by -1 (so it comes out as a positive number).

Its formula, given N points, is simple:

$$J(\mathbf{w}) \;=\; \frac{1}{N}\sum_{n=1}^{N} H(p_n, q_n) \;=\; -\frac{1}{N}\sum_{n=1}^{N}\Big[\, y_n \log \hat{y}_n + (1 - y_n)\log(1 - \hat{y}_n) \,\Big]$$

(When y_n is 1 only the first term of the formula contributes; when y_n is 0 only the second term contributes.)
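As a small sketch of the formula above (not tied to any particular library; the function name and the eps clipping are illustrative choices), it can be computed with NumPy like this:

```python
import numpy as np

def binary_cross_entropy(y_true, y_pred, eps=1e-12):
    """Average binary cross-entropy J(w) over N points.

    y_true: 0/1 labels; y_pred: predicted probabilities in (0, 1).
    eps clips predictions away from 0 and 1 so the logs stay finite.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.clip(np.asarray(y_pred, dtype=float), eps, 1 - eps)
    return -np.mean(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred))

# Two confident-correct predictions and one badly wrong one:
print(binary_cross_entropy([1, 0, 1], [0.9, 0.1, 0.2]))  # ≈ 0.61
```

Note how the single wrong, confident prediction (0.2 for a true label of 1) dominates the average, which is exactly the behaviour the log terms are there to produce.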

Cross-entropy can be used as an error measure when a network’s outputs can be interpreted as the probabilities that each hypothesis is true.

The error measure (cross-entropy) indicates the distance between what the network believes this distribution should be, and what the actual value says it should be.

The cross-entropy measure has been used as an alternative to squared error (SE) / mean squared error (MSE) in order to determine which model better describes our labeled data points.

It differs from MSE/SE in that it still allows the error to update the weights even when a node’s activation derivative is asymptotically close to 0, as the short derivation below shows.
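To make that concrete, here is the usual comparison for a single sigmoid output $\hat{y} = \sigma(z)$ (a standard derivation, stated here only as a sketch):

$$\frac{\partial L_{\mathrm{MSE}}}{\partial z} = (\hat{y} - y)\,\hat{y}(1-\hat{y}), \qquad \frac{\partial L_{\mathrm{CE}}}{\partial z} = \hat{y} - y$$

The $\sigma'(z) = \hat{y}(1-\hat{y})$ factor in the MSE gradient goes to 0 as the output saturates near 0 or 1, while the cross-entropy gradient stays proportional to the error itself.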

 

more info:

Entropy is the amount of information in a transmitted message. Shannon entropy is expressed as:
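$$H(p) = -\sum_{x} p(x)\,\log p(x)$$

(the standard form for a discrete distribution $p$; the base of the log only sets the unit, bits for base 2, nats for base $e$).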

Cross-entropy formula:
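$$H(p, q) = -\sum_{x} p(x)\,\log q(x)$$

where $p$ is the true distribution and $q$ is the estimated one.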

The Kullback–Leibler divergence is the difference between the cross entropy and the entropy.
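In symbols, with the definitions above:

$$D_{\mathrm{KL}}(p \,\|\, q) = H(p, q) - H(p) = \sum_{x} p(x)\,\log\frac{p(x)}{q(x)}$$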