# I = – log(p(x))

I : the amount of Information,
p(x) : the probability of observing an event x

I watched a lecture video by Ali Ghodsi, a professor in the computer science department at the University of Waterloo, explaining the variational autoencoder.

If an event x happens with high probability, observing it carries little information. If x happens with low probability, observing it carries a lot of information.
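To make this concrete, here is a minimal sketch that evaluates the formula for a few probabilities. The function name `self_information` and the choice of a base-2 logarithm (so the result is in bits) are my own assumptions; the lecture formula does not fix a base.

```python
import math


def self_information(p: float) -> float:
    """Return I = -log2(p), the information in bits of an event with probability p.

    Base 2 is an assumption made here for illustration; any base works
    and only changes the unit.
    """
    if not 0.0 < p <= 1.0:
        raise ValueError("p must be in the interval (0, 1]")
    return -math.log2(p)


# A likely event, a coin flip, and a rare event.
for p in (0.99, 0.5, 0.01):
    print(f"p = {p:<4} -> I = {self_information(p):.3f} bits")
```

A fair coin flip (p = 0.5) gives exactly 1 bit, and the value grows as p shrinks, which matches the intuition that rare events are the informative ones.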

He asked the class, “Is this intuitive?”

Then he said, “If something predictable happens, we do not have much interest in it. However, we will be very much interested in something unpredictable. For example, D. Trump is unpredictable, which keeps him in the news all the time. J. Trudeau (the prime minister of Canada), on the other hand, is predictable and is not in the news.”

What an easy explanation for students to grasp $I=-\log(p(x))$.

