Note: this post was slated to appear on May 31, 2014, but events outside of my control (such as grant submission deadlines, and parties at my house) delayed its issuance.
The word "entropy" is used a lot, isn't it? OK, not in your average conversation, but it is a staple of conversations between some scientists, but certainly all nerds and geeks. You have read my introduction to information theory I suppose (and if not, go ahead and start here, right away!) But in my explanations of Shannon's entropy concept, I only obliquely referred to another "entropy": that which came before Shannon: the thermodynamic entropy concept of Boltzmann and Gibbs. The concept was originally discussed by Clausius, but because he did not give a formula, I will just have to ignore him here.
Why do these seemingly disparate concepts have the same name? How are they related? And what does this tell us about the second law of thermodynamics?
This is the blog post (possibly a series) where I try to throw some light on that relationship. I suspect that what follows below isn't very original (otherwise I probably should have written it up in a paper), but I have to admit that I didn't really check. I did write about some of these issues in an article that was published in a Festschrift on the occasion of the 85th birthday of Gerry Brown, who was my Ph.D. co-advisor and a strong influence on my scientific career. He passed away a year ago to this day, and I have not yet found a way to remember him properly. Perhaps a blog post on the relationship between thermodynamics and information theory is appropriate, as it bridges a subject Gerry taught often (Thermodynamics) with a subject I have come to love: the concept of information. But face it: a book chapter doesn't get a lot of readership. Fortunately, you can read it on arxiv here, and I urge you to because it does talk about Gerry in the introduction.
Before we get to the relationship between Shannon's entropy and Boltzmann's, how did they end up being called by the same name? After all, one is a concept within the realm of physics, the other from electrical engineering. What gives?
The one to blame for this confluence is none other than John von Neumann, the mathematician, physicist, engineer, computer scientist (perhaps Artificial Life researcher, sometimes moonlighting as an economist). It is difficult to appreciate the genius that was John von Neumann, not the least because there aren't many people who are as broadly trained as he was. For me, the quote that fills me with awe comes from another genius who I've had the priviledge to know well, the physicist Hans Bethe. I should write a blog post about my recollections of our interactions, but there is already a write-up in the book memorializing Hans's life. While I have never asked Hans directly about his impressions of von Neumann (how I wish that I had!), he is quoted as saying (in the 1957 LIFE magazine article commemorating von Neumann's death: "I have sometimes wondered whether a brain like von Neumann's does not indicate a species superior to man".
The reason why I think that this quite a statement, is that I think Bethe's brain was in itself very unrepresentative of our species, and perhaps indicated an altogether different kind.
So, the story goes (as told by Myron Tribus in his 1971 article "Energy and Information")that when Claude Shannon had figured out his channel capacity theorem, he consulted von Neumann (both at Princeton at the time) about what he should call the "-p log p" value of the message to be sent over a channel. von Neumann supposedly replied:
"You should call it entropy, for two reasons. In the first place your uncertainty function has been used in statistical mechanics under that name. In the second place, and more importantly, no one knows what entropy really is, so in a debate you will always have the advantage.”
The quote is also reprinted in the fairly well-known book "Maxwell's Demon: Entropy, Information, and Computing", edited by Leff and Rex. Indeed, von Neumann had defined a quantity just like that as early as 1927 in the context of quantum mechanics (I'll get to that). So he knew exactly what he was talking about.
Let's assume that this is an authentic quote. I can see how it could be authentic, because the thermodynamic concept of entropy (due to the Austrian physicist Ludwig Boltzmann) can be quite, let's say, challenging. I'm perfectly happy to report that I did not understand it for the longest time, in fact not until I understood Shannon's entropy, and perhaps not until I understood quantum entropy.
Boltzmann defined entropy. In fact, his formula S=klogW is engraved on top of his tombstone, as shown here:
Google "Boltzmann tombstone" to see the entire marble edifice to Boltzmann
In this formula, S stands for entropy, k is now known as "Boltzmann's constant", and Wis the number of states (usually called "microstates" in statistical physics) a system can take on. But it is the logW that is the true entropy of the system. Entropy is actually a dimensionless quantity in thermodynamics. It takes on the form above (which has the dimensions of the constant k) if you fail to convert the energy units of temperature into more manageable units, such as the Kelvin. In fact, k just tells you how to do this translation:
where J (for Joule) is the SI unit for energy. If you define temperature in these units, then entropy is dimensionless
But this doesn't at all look like Shannon's formula, you say?
You're quite right. We still have a bit of work to do. We haven't yet exploited the fact that logW is the number of microstates consistent with a macrostate at energy E. Let us write down the probability distribution w(E) for the macrostate to be found with energy E. We can then see that
I'm sorry, that last derivation was censored. It would have bored the tears out of you. I know because I could barely stand it myself. I can tell you where to look it up in Landau & Lifshitz if you really want to see it.
The final result is this: Eq. (1) can be written as
implying that Boltzmann's entropy formula looks to be exactly the same as Shannon's.
Except, of course, that in the equation above the probabilities wi are all equal to each other. If some microstates are more likely than others, the entropy becomes simply
where the pi are the different probabilities to occupy the different microstate i.
Equation (3) was derived by the American theoretical physicist Willard Gibbs, who is generally credited for the development of statistical mechanics.
Now Eq. (3) does precisely look like Shannon's, which you can check by comparing to Eq. (1) in the post "What is Information? (Part 3: Everything is conditional)". Thus, it is Gibbs's entropy that is like Shannon's, not Boltzmann's. But before I discuss this subtlety, ponder this:
At first sight, this similarity between Boltzmann's and Shannon's entropy appears ludicrous. Boltzmann was concerned with the dynamics of gases (and many-particle systems in general). Shannon wanted to understand whether you can communicate accurately over noisy channels. These appear to be completely unrelated endeavors. Except they are not, if you move far enough away from the particulars. Both, in the end, have to do with measurement.
If you want to communicate over a noisy channel, the difficult part is on the receiving end (even though you quickly find out that in order to be able to receive the message in its pristine form, you also have to do some work at the sender's end). Retrieving a message from a noisy channel requires that you or I make accurate measurements that can distinguish the signal from the noise.
If you want to characterize the state of a many-particle system, you have to do something other than measure the state of every particle (because that would be impossible). You'll have to develop a theory that allows us to quantify the state given a handful of proxy variables, such as energy, temperature, and pressure. This is, fundamentally, what thermodynamics is all about. But before you can think about what to measure in order to know the state of your system, you have to define what it is you don't know. This is Boltzmann's entropy: how much you don't know about the many-particle system.
In Shannon's channel, a message is simply a set of symbols that can encode meaning (they can refer to something). But before it has any meaning, it is just a vessel that can carry information. How much information? This is what's given by Shannon's entropy. Thus, the Shannon entropy quantifies how much information you could possibly send across the channel (per use of the channel), that is, entropy is potential information.
Of course, Boltzmann entropy is also potential information: If you knew the state of the many-particle system precisely, then the Boltzmann entropy would vanish. You (being an ardent student of thermodynamics) already know what is required to make a thermodynamical entropy vanish: the temperature of the system must be zero. This, incidentally, is the content of the third law of thermodynamics.
"The third law?", I hear some of you exclaim. "What about the second?"
Yes, what about this so-called Second Law?
To be continued, with special emphasis on the Second Law, in Part 2.