Solving the 1D Ising Model

Today (Wed Week 2) we went through the solution to the 1D Ising model in detail.

Outline of this lecture

Big picture

What are we trying to do?

Our end goal is to find various thermodynamic properties of the 1D Ising model. Remember that thermodynamics means that

  • the system is in thermal equilibrum; where the probability of each configuration/microstate is e^{-beta E} / Z. (This also called the ‘‘canonical ensemble’’)

  • we take the N to infty limit – the system is big,

and we're trying to find various properties of these thermodynamic systems.

What sorts of properties?

  • we're interested in expectation values such as langle sigma_j rangle which tell you the average value of the j'th spin when the system reaches thermodynamic equilibrium.

    • For instance, you can imagine that if there's a very strong magnetic field that wants to align the spins to face downwards, then langle sigma_j rangle will be close to -1. Or that if you heat everything up to very hot, then all the spins are scrambled to be randomly up or down, so langle sigma_j rangle will be close to 0.

    • For the 1D Ising model, langle sigma_j rangle is the same for all values of j. Since the Hamiltonian is translationally invariant (see explanation of symmetry), all the sites are identical, and the average spin will be the same no matter which site you look at.

      • (Maybe for more complicated or realistic models, the sites would be distinct – as in a heterogenous material – and then perhaps langle sigma_j rangle neq langle sigma_i rangle in general)

  • we're also interested in correlation functions such as langle sigma_i sigma_j rangle, which tell you how much spin i and spin j tend to point in the same direction or in opposite directions at equilibrium.

    • For instance, you can imagine that nearby spins are more correlated than faraway spins, because they can interact with eachother more strongly. Or you can imagine that as T to infty, any pairs of spins will become less correlated, since the thermal excitations make all the spins jiggle around more.

    • Again, because of translational invariance (see here for explanation), we expect langle sigma_i sigma_j rangle to only depend on the distance between the spins i-j. For instance, the correlation between spins 2 and 5 is the same as between 100 and 103, since in both cases, the pairs of sites are separated by three lattice spacings.

How do we calculate properties?

Prof. Kivelson outlined the procedure for us on Monday.

  • Our system is defined by a Hamiltonian H, a function that tells us its energy.

  • From the Hamiltonian, we figure out the energy eigenstates, a.k.a. configurations, a.k.a. microstates (or sometimes even ‘‘states’’) of the system. These are the basic states, and we will be summing over these states pretty soon.

    • Yes, these concepts are the same thing! They're just called different words because we learn them in different contexts…

  • Our main goal is to calculate the partition function Z.

     Z = sum_{s} e^{-beta E(s)}.

Here the sum runs over all the states of the system s, E(s) means ‘‘the energy of the system when it's in state s’’, and beta = 1/T is the inverse temperature of our system.

We spend most of our effort trying to figure out how to compute the partition function, which begs the question…

Why is the partition function useful?

Once we've found the partition function Z, we can calculate pretty much everything else. For instance:

  • the free energy F is given by F = - T log Z

  • the thermal average spin langle sigma_j rangle is given by

     langle sigma_j rangle = frac 1 Z sum_{{sigma}} sigma_j e^{-beta H({sigma})}

To explain the notation: I'm summing over all states (this time I call the states {sigma} rather than s). Inside the sum I am multiplying the spin of the j'th site (which is pm 1) by the Boltzmann weight e^{-beta H({sigma})}. The number H({sigma}) is the energy of the system when it's in the state {sigma}, and we find this by plugging in each of the spins sigma_j into the Hamiltonian.

  • the spin-spin correlation langle sigma_i sigma_j rangle, which tells you whether spins i and j tend to align at equilibrium, is given by

     langle sigma_i sigma_j rangle = frac 1 Z sum_{{sigma}} sigma_i sigma_j e^{-beta H({sigma})}

Hopefully, this serves as a sort of useful roadmap for where we're going.

Solving the 1D Ising Model

Game Plan
  1. Rewrite the Hamiltonian as a sum over bonds (rather than sites AND bonds)

  2. Zoom in on a particular bond and write down a transfer matrix which represents the bond from site j to site j+1.

  3. Key step – Notice that summing over sigma_j = pm 1 looks an awful lot like contracting over a shared index, a.k.a. matrix multiplication.

  4. Rewrite Z as the trace of a bunch of transfer matrices multiplied together.

  5. Similarly, rewrite the average spin langle sigma_j rangle and the correlation function langle sigma_i sigma_j rangle in terms of transfer matrices.

These quantities are best calculated in the eigenbasis of the transfer matrix. So our next steps are to

  1. Express the partition function Z in terms of eigenvalues and eigenstates of t. (dificulty: easy)

  2. Express the average spin langle sigma_j rangle in terms of eigenvalues and eigenstates of t. (difficulty: medium)

  3. Express the correlation function langle sigma_i sigma_j rangle in terms of eigenvalues and eigenstates of t. (difficulty: Kivelson)

Afterwards, we will diagonalize the transfer matrix and explicitly calculate these quantities. Throughout these steps, we'll appeal to Pauli matrices and our intuition about the quantum mechanics of spin-half to help us calculate things.

  1. Find the eigenvalues and eigenvectors of the transfer matrix underline{t}

  2. Plug them in to find explicit expressions for Z, langle sigma_j rangle, and langle sigma_i sigma_j rangle.

  3. Profit!?

Writing the Hamiltonian as a sum over bonds

Our first step is to rewrite the Hamiltonian as a single sum over bonds, rather than two separate sums. Later on, we'll see that this form helps us neatly separate the partition function in a nice way.

The Hamiltonian was defined as

 H = - sum_{j} J sigma_j sigma_{j+1} - sum_{j} h sigma_{j}

where the first term is the energy of each of the bonds between neighboring sites, and the second is the energy of each of the sites. Since we want to rewrite the Hamiltonian as a sum over bonds, we re-assign the energy of j'th site to its two neighboring bonds (bond j-1 to j, and j to j+1). (We also introduce a factor one-half to compensate for the fact that each site will now be counted twice). Now the Hamiltonian is sum over bonds

 H = sum_{j} E(sigma_j, sigma_{j+1})

where E(sigma_j, sigma_{j+1}) = -J sigma_j sigma_{j+1} - frac h 2 (sigma_{j} + sigma_{j+1}) is the energy of the bond between sites j and j+1. (Note that E(sigma_j, sigma_{j+1}) takes on four possible values, since there's four combinations of what the spins on sites j and j+1: ++, +-, -+, and --.)

Defining the transfer matrix

With the Hamiltonian written in this form, we can calculate the partition function more easily.

Remember that the partition function is the sum over all states of the Boltzmann weight e^{-beta H}. Since the Hamiltonian H can be written as a sum, the Boltzmann weight e^{-beta H} can be written as a product

 e^{-beta H(sigma_1, sigma_2, ..., sigma_N)}
 = e^{-beta E(sigma_1, sigma_2) - beta E(sigma_2, sigma 3) - ... - beta E(sigma_N, sigma_1)}
 = underbrace{left[e^{-beta E(sigma_1, sigma_2)}right]}_{t_{sigma_1sigma_2}} left[e^{- beta E(sigma_2, sigma_3)}right]  ... left[e^{- beta E(sigma_N, sigma_1)}right]

At this point we introduce the transfer matrix t as a notational trick to make the expression much look nicer. If we rename each of the factors in the product by defining

 t_{sigma_j sigma_{j+1}} = e^{-beta E(sigma_j, sigma_{j+1})},

then e^{-beta H} looks like

 e^{-beta H} = t_{sigma_1 sigma_{2}} t_{sigma_2 sigma_3} ... t_{sigma_{N-1} sigma_{N}} t_{sigma_{N} sigma_1};

ie, there's just a factor of t for each of the bonds in the Hamiltonian.


As another notational trick (remember this whole business with transfer matrices is sort of a notational trick anyways!), we can also rewrite the matrix entries of t_{sigma sigma'} in a quantum-mechanics-esque manner as langle sigma | t | sigma' rangle. In this picture, we interpret the transfer matrix as an operator in the Hilbert space spanned by |+1rangle and |-1rangle. Since we're well versed with manipulating bras and kets from quantum mechanics, this alternative notation might be helpful for building our intution.

Using this bra-ket notation, our expression for e^{-beta H} now looks like

     e^{-beta H(sigma_1, sigma_2, ..., sigma_N)} = langle sigma_1 | t | sigma_2 rangle langle sigma_2 | t | sigma_3 rangle langle sigma_3 | t | sigma_4 rangle ... langle sigma_N | t | sigma_1 rangle.

This notation can be more enlightening, or more confusing, depending on your tastes! Personally, I like this bra-ket notation because my eyes are pretty bad and I find it annoying to squint at subscripts.

Interpreting the transfer matrix

The transfer matrix answers the question, if there's a bond between a spin sigma and another spin sigma', then what is the value of e^{-beta E(sigma, sigma')} for that bond? Since there's two possibilities for the first spin sigma (+1 or -1) and two possibilities for the second spin sigma' (+1 or -1), that means that t is a 2x2 matrix with four entries. (It's a funny sort of matrix where the indices take on values of pm 1.)

To be explicit, we can write the components of t as

 t = left[begin{array}{cc}t_{+1+1} & t_{+1-1}  t_{-1+1} & t_{-1-1}end{array}right] = left[begin{array}{cc}e^{-beta E(+1,+1)} & e^{-beta E(+1,-1)}  e^{-beta E(-1,+1)} & e^{-beta E(-1,-1)}end{array}right].

Rewriting the partition function as a trace

Now that we have a nice expression for e^{-beta H}, we turn back to our task of finding the partition function Z, which means we need to sum over all the possible configurations of the system. How do we do this?

We need to sum over all possible configurations of the N spins; each spin can either be up or down, so there's a total of 2^N configurations. Symbolically, we write this sum as

 sum_{{sigma}} = sum_{sigma_1=-1}^{+1}sum_{sigma_2=-1}^{+1} ...sum_{sigma_N=-1}^{+1};

that is, the first spin takes on values sigma_1 = +1, sigma_1 = -1, the second spin takes on values sigma_2 = +1, sigma_2 = -1, etc., and we're performing this sum for each of the N spins. In class, Prof. Kivelson wrote a confusing expression with a product sign; personally, I'm not a fan. I think that expicitly writing out each of the summations is much more understandable.

Anyways, to find the partition function, we need to calculate

 Z = sum_{{sigma}} e^{-beta H({sigma})}.

When we expand out the sum, we realize the key trick: the transfer matrices are matrix-multiplied with each other, because you sum over the repeated index.

 Z = sum_{sigma_1} sum_{sigma_2} ... sum_{sigma_N} t_{sigma_1 sigma_{2}} t_{sigma_2 sigma_3} ... t_{sigma_{N-1} sigma_{N}} t_{sigma_{N} sigma_1}

Rememeber that matrix multiplication is defined as (AB)_{ik} = sum_j A_{ij}B_{jk}. If we zoom in on the multiplication between the 1-2 transfer matrix and the 2-3 transfer matrix, we see that indeed, the transfer matrices are being multiplied by each other when we sum over their shared index sigma_2:

 ... = sum_{sigma_2}t_{sigma_1 sigma_{2}} t_{sigma_2 sigma_3}     = left(tcdot tright)_{sigma_1 sigma_3}

So when we sum over spin #2, those two transfer matrices ‘‘collapse’’ together and we're left with a squared transfer matrix between spin #1 and spin #3.

If we repeat this process of ‘‘collapsing’’ all the transfer matrices together, we end up with

     Z = sum_{sigma_1} (underbrace{tcdot t cdot t cdot ... cdot t}_{N textrm{times}})_{sigma_1 sigma_1},

which we recognize as the formula for the trace of t^N,

     Z = textrm{Tr} left[t^Nright]!

It's also possible to do this manipulation in the bra-ket notation. (Personally, I find this notation a bit more enlightening.) Remember that a matrix element M_{ij} in bra-ket notation is written as langle i | M | j rangle. If we use this notation, the partition function looks like

 Z = sum_{sigma_1} sum_{sigma_2} ... sum_{sigma_N} langle sigma_1 | t | sigma_2 rangle langle sigma_2 | t | sigma_3 rangle ... langle sigma_N | t | sigma_1 rangle.

If we suggestively scooch the sum signs next to their respective ket-bras, it looks like

 Z = sum_{sigma_1} langle sigma_1 | t underbrace{left[sum_{sigma_2}  | sigma_2 rangle langle sigma_2 | right]}_I t underbrace{left[sum_{sigma_3} | sigma_3 rangle langle sigma_3 | right ]}_I t ...  t underbrace{left[ sum_{sigma_N}| sigma_N rangle langle sigma_N | right]}_I t | sigma_1 rangle

and we realize that there's just a copy of the identity lying between neighboring transfer matrix operators. (Remember the resolution of the identity I = sum_a |arangle langle a|.)

So Z simplifies to

 Z = sum_{sigma_1} langle sigma_1 | t^N | sigma_1 rangle = textrm{Tr} left[t^Nright]

as advertised.

Also, I can't help but to notice how familiar the expression for Z looks! It's reminiscent of the propogator in the path integral formulation of quantum mechanics, where we split up the time evolution operator e^{-iHt} into N chunks, and then sandwich a resolution of the identity between each of the chunks ;)

Rewriting average spin as a trace

Now that we know how the partition function looks like, let's go on to compute the ensemble average of the spin, sigma_j. I'll be using the bra-ket notation since I find it hard to squint at all the puny subscripts.

As with all expectation values, we can calculate the expectation of sigma_j by summing up the value of sigma_j for each microstate, weighted by its probability. This has the form

 langle sigma_j rangle = sum_{{sigma}} sigma_j P({sigma}),

and if we plug in the Boltzmann weights for the probabilities P({sigma}) = e^{-beta H({sigma})} / Z, our expression looks like

 langle sigma_j rangle = frac 1 Z sum_{{sigma}} sigma_j e^{-beta H({sigma})}.

The sum looks quite similar to our expression for Z earlier, with the exception of an extra factor of sigma_j. To see the effect of what that factor does, let's zoom in on everything that involves sigma_j inside the expression:

 ... = sum_{sigma_j} sigma_j |sigma_j rangle langle sigma_j |

How do we make sense of this expression? If there wasn't the factor of sigma_j, this would just be the resolution of the identity, I = sum_a | a rangle langle a |, but it's not immediately obvious what the factor of sigma_j does. Let me try explicitly writing out the sum:

 ... = (+1) bigg[|+1 rangle langle +1 |bigg] + (-1) bigg[|-1 rangle langle -1 |bigg]

This is a matrix with +1 in the (+1,+1) entry and a -1 in the (-1,-1) entry. In other words, it's the Pauli z matrix tau_z defined by

 tau_z |pm1 rangle = pm | pm1 rangle.

Now when we write out the transfer matrices in our sum, all the ket-bras resolve to the identity expect for the j'th one, which instead becomes a Pauli z matrix. So we end up with a tau_z trapped between a bunch of t's in our expression.

     langle sigma_j rangle = sum_{sigma_1} langle sigma_1 | underbrace{t cdot ... cdot t}_{j-1} cdot tau_z cdot underbrace{t cdot ... cdot t}_{N-(j-1)} | sigma_1 rangle = textrm{Tr} left[t^{j-1} tau_z t^{N-(j-1)}right]

Rewriting the spin-spin correlation as a trace

When we find the spin-spin correlation function langle sigma_i sigma_j rangle, a very similar trick happens; you can work out the details for yourself.

The end result is that

 begin{array}{rl} Z &= textrm{Tr} left[t^Nright],  langle sigma_j rangle &= Z^{-1} textrm{Tr} left[t^{j-1} tau_z t^{N-(j-1)}right],  langle sigma_i sigma_j rangle &= Z^{-1} textrm{Tr} left[t^{i-1} tau_z t^{j-i} tau_z t^{N-(j-1)}right],  end{array}

Calculating these traces

We still need to calculate these traces in terms of some basis. Remember from linear algebra that we can calculate traces in terms of any basis we want.

In our case, the best basis to use is the eigenbasis of the transfer matrix t. Since the transfer matrix is a real symmetric 2x2 matrix, it has two orthogonal eigenvectors that span its whole Hilbert space. We'll write these two eigenvectors and their corresponding eigenvalues as

 t|psi_+ rangle = lambda_+ | psi_+ rangle,
 t|psi_- rangle = lambda_- | psi_- rangle.

We won't actually explicitly find these eigenvalues and eigenvectors just yet; I'll leave that for the next section because there's some neat tricks you can do with Pauli matrices by drawing an analogy to a spin-half particle. For now I want to pretend we know the eigenstates already, and just focus on the algebraic manipulations.

(Throughout this section I'll slip into quantum mechanics lingo, and say ‘‘operator’’ in place of ‘‘matrix’’ or ‘‘state’’ in place of ‘‘vector’’)

Simplifying the partition function

What's so nice about the eigenstates of t is that they make it much easier to find t^N. When you apply an operator multiple times to an eigenstate, you just pull out an eigenvalue each time:

 t^N | psi_+ rangle = lambda^N | psi_+ rangle.

(Remember that t is an operator, but lambda_pm are just numbers.)

So if we use this basis to evaluate the partition function, it simplifies very nicely:

 begin{array}{rl}     Z &= langle psi_+ | t^N | psi_+ rangle + langle psi_- | t^N | psi_- rangle           &      &= lambda_+^N langle psi_+ | psi_+ rangle + lambda_-^N langle psi_- | psi_- rangle           &      &= lambda_+^N + lambda_-^N. end{array}

Since eventually we'll be taking the thermodynamic limit N to infty, it's nice to rewrite this expression a bit to understand what happens when we take that limit. We find that

     Z = lambda_+^N left[1 + left(frac {lambda_-}{lambda_+}right)^N right] longrightarrow lambda_+^N quad textrm{as}quad  N to infty

In the expression above, (lambda_- / lambda_+) is less than 1, so when we raise it to a huge power like N, it goes to 0.

Simplifying the average spin

We can do something quite similar for finding the average spin

 langle sigma_j rangle = frac 1 Z textrm{Tr} left[t^{j-1} tau_z t^{N-(j-1)}right],

which can be written explicitly in the eigenbasis of t as

 langle sigma_j rangle = frac 1 Z Bigg[bigglangle psi_+ bigg| t^{j-1} tau_z t^{N-(j-1)} bigg| psi_+ biggrangle + bigglangle psi_- bigg| t^{j-1} tau_z t^{N-(j-1)} bigg| psi_- biggrangleBigg].

I'll describe in words what happens when we sandwich our eigenstates around the operator. The j-1 copies of t to the left of the tau_z can act backwards on the bra-eigenstate to pull out j-1 powers of the eigenvalue (no complex conjugate since t is Hermitian with real eigenvalues!); the other N-(j-1) copies of t to the right of tau_z act forwards on the ket-eigenstate to pull out another N-(j-1) powers of the eigenvalue. We're left with the tau_z trapped inside the sandwich:

     langle sigma_j rangle = frac 1 Z bigg[lambda_+^N langle psi_+ | tau_z | psi_+ rangle + lambda_-^N langle psi_- | tau_z | psi_- rangle bigg]

Again, when we take N to infty, only the first term survives, because the ratio of the terms (lambda_- / lambda_+)^N goes to 0. So we find that

     lim_{Ntoinfty} langle sigma_j rangle = frac 1 Z lambda_+^N langle psi_+ | tau_z | psi_+ rangle.

Diagonalizing the Transfer Matrix

Check out my answer on piazza, where I explain how to find the eigenvalues of t by decomposing it in terms of the Pauli matrices as t = t_0 I + t_x tau_x + t_y tau_y + t_z tau_z.

Check out the picture here for a graphical explanation of why langle psi_+ | tau_z | psi_+ rangle = t_z / sqrt{t_z^2 + t_x^2}. A picture is worth a thousand words!

Work in progress, check back soon…

Our first problem set

To be honest, we pretty much did the problem set during class today.

On the problem set, Prof. Kivelson first asks us to calculate the magnetization density m, which we did in class (!). Later he asks us to express the transfer matrix in terms of Pauli matrices (which we also did in class (!?)) and to discuss the correspondence between the 1D Ising Model and a spin-half quantum system (again, we also did this in class!!!).

It looks like the main purpose of the problem set is

  • to review everything from class today,

  • to gain a better intuition by interpreting the various limits of the magnetization density m, and

  • to apply our techniques from class to solve other systems like the J_1 - J_2 antiferromagnet (with both first and second neighbor interactions) or the X-Y ferromagnet (where each spin takes on an angle theta_j rather than a discrete pm 1).

Anyways, I'm not sure how much I can discuss here without violating the honor code. But I did have an interesting remark on the definition of magnetization density.


tl;dr: m = langle sigma_j rangle is a subtle statement.

On the problem set, the magnetization density is defined as

     m equiv N^{-1} sum_{j = 1}^{N}langle sigma_j rangle,

which seems sort of bizarre since we add up N terms and then divide by N again. What's going on here?

The answer has to do with intensive vs extensive quantities, and experimental observables vs mathematical expressions.

To start off, notice that the average of any particular microscopic spin sigma_j is impossible to measure! (Spin j would have a puny magnetic field, and besides, there's a whole bunch of other spins nearby that would mess up the measurement.) Our actual experimental observable is not the magnetization of one particular spin, but rather the magnetization of the whole magnet, which is given by

     M equiv sum_{j=1}^{N} langle sigma_j rangle,

That is, when we perform an experiment, we measure the total magnetization of the magnet, which we get by adding up the contributions from each of the spins. (Notice this is capital M, not lowercase m!)

The problem with total magnetization M is that it's an extensive rather than intensive property – that is, it scales with the system size N. If we doubled the size of the magnet, we would also double the total magnetiation M. Now, we don't want to use an extensive property that grows with N, since it'll blow up when we take the thermodynamic limit N to infty. Rather, we want to use an intensive property independent of system size, a density rather than a total quantity.

To solve this conundrum, just divide the total magnetization by N. We define the magnetization density m as

     m = frac M N = N^{-1} sum_{j = 1}^{N}langle sigma_j rangle,

and voila, m no longer blows up as N to infty. So we've achieved our goal of finding an experimental quantity that's intensive.

Finally, just to make this whole affair even more ridiculous, it turns out that langle sigma_j rangle is actually the same value for all j! (Remember, the Ising model Hamiltonian is translationally invariant.) So when we find m, we're just adding up the same number N times, and then dividing by N…and at the end of the day, the magnetization density is just the same as the average spin m = langle sigma_j rangle.

Leave a Comment Below!

Comment Box is loading comments...