Schur-Weyl Duality

In this post, I discuss Schur-Weyl duality, its proof and some of its applications to the field of quantum information. With all the background, this post ended up being longer than I anticipated. I might stick to shorter posts in the future. There is some material that may be new, such as the dual version for the Schur transform and the description of immanants. I have a feeling that this may be known to some experts and just not written up anywhere.

1. Statement of Schur-Weyl duality

Schur-Weyl duality (in one form) says that the diagonal action of the general linear group commutes with the action of the symmetric group on the tensor product of finite dimensional vector spaces. Consider the ${n}$-fold tensor product ${V^{\otimes n}}$ of a finite dimensional vector space ${V}$ of dimension ${d}$. We have the diagonal action of the general linear group ${\text{GL}(V)}$, which applies an invertible operator ${A}$ on each tensor copy. For this action of the general linear group, we are interested in the centralizer algebra i.e., ${\text{End}_{\text{GL}(V)}V^{\otimes n}}$. It is easy to see that the set of operators that permute the tensor copies are in this algebra and the question is – are there more? Schur-Weyl duality is the statement that there are no more operators and so, this centralizer algebra is isomorphic to the symmetric group algebra. We will actually consider the action of the so-called universal enveloping algebra ${U(\mathfrak{gl}(V))}$ of the Lie algebra ${\mathfrak{gl}(V)}$ instead of the general linear group. Schur-Weyl duality can then be stated as

Theorem 1 The image of ${\mathbb{C}[S_n]}$ and the image of ${U(\mathfrak{gl}(V))}$ in ${\text{End}_{\mathbb{C}}(V^{\otimes n})}$ are centralizers of each other.

Before we prove this theorem, let us recall some facts about universal enveloping algebras of Lie algebras.

2. Universal enveloping algebras

A universal enveloping algebra of ${\mathfrak{g}}$ is a unital associative algebra such that its Lie algebra is ${\mathfrak{g}}$. In order to motivate this, first recall that a Lie algebra is not associative. In fact, the Jacobi identity expresses this non-associativity. But, why do we care about having an associative algebra? The main reason is that we are usually interested in representations of Lie algebras rather than the algebras themselves. This means that if ${\rho}$ is a representation of the Lie algebra, then one would like to say that ${\rho([[g_1, g_2]\dots])=\rho(g_1)\rho(g_2)\dots}$, where ${[\cdot,\cdot]}$ is the multiplication in the Lie algebra. But this is not the case with Lie algebra representations. We would like to have an associative algebra, so that we have this property for its representations.

Now, from any associative algebra ${A}$, one can construct a Lie algebra ${A_L}$ by simply defining the Lie bracket to be the commutator i.e., ${[x,y]=xy-yx}$. It is easy to check that this satisfies the axioms of a Lie algebra. So, a natural question is the converse, that is, given a Lie algebra ${\mathfrak{g}}$, what is the associative algebra ${A}$ such that ${A_L=\mathfrak{g}}$? We do not know if such an object exists and even if it exists, it may not be unique. For some problems, existence can often be shown by a direct construction, but uniqueness can become problematic. In order to solve the uniqueness problem, we require that the object we are searching for, satisfy a universal property. This universal property says that if we find any other algebra ${B}$ whose Lie algebra ${B_L}$ is also ${\mathfrak{g}}$, then there exists a unique algebra isomorphism ${f}$ from ${A}$ to ${B}$ such that ${B_L=(f(A))_L}$. This guarantees that the algebra we find is the only algebra (up to a unique isomorphism) whose Lie algebra is ${\mathfrak{g}}$. This also ensures that the representations of the Lie algebra (which are essentially homomorphisms from ${\mathfrak{g}}$ to an associative algebra) are in one to one correspondence with the representations of ${U(\mathfrak{g})}$. This also means that Schur-Weyl duality, which we will prove for the action of ${U(\mathfrak{g})}$, is true for the action of ${\mathfrak{g}}$ (and also the Lie group of ${\mathfrak{g}}$). Now for the formal definition of ${U(\mathfrak{g})}$.

Definition 2 Given a Lie algebra ${\mathfrak{g}}$, a unital associative algebra ${U}$ (with a Lie algebra homomorphism ${f:\mathfrak{g}\rightarrow U_L}$) is said to be a universal enveloping algebra of ${\mathfrak{g}}$ if it satisfies the following universal property: for any unital associative algebra ${B}$ and a Lie algebra homomorphism ${g:\mathfrak{g}\rightarrow B_L}$, there exists a unique algebra homomorphism ${h:U\rightarrow B}$ such that ${g=h_L(f)}$.

Here we will construct the universal enveloping algebra, but we will not prove the universal property. This is proved, for instance, in chapter 3 of Knapp’s book. From any vector space (such as ${\mathfrak{g}}$), one can construct an associative algebra called the tensor algebra ${T(\mathfrak{g})}$, which satisfies the universal property with respect to ${\mathfrak{g}}$. The tensor algebra is defined as

$\displaystyle T(\mathfrak{g})=\bigoplus_{n=0}^{\infty} T^n(\mathfrak{g})\,, \ \ \ \ \ (1)$

where ${T^n(\mathfrak{g})}$ is the ${n}$ fold tensor product of ${\mathfrak{g}}$ with itself. So, for instance, ${T^0}$ is the ground field and ${T^1}$ is ${\mathfrak{g}}$. The tensor algebra is a graded algebra which means that it has a filtration ${T^n}$ such that ${T^nT^m\subset T^{n+m}}$ and ${T}$ is a direct sum of the ${T^n}$. A filtered algebra is more general; it satisfies the first condition and instead of the second condition, we would have ${T=\cup_n T^n}$ i.e., ${T}$ need not be the direct sum of ${T^n}$. The tensor algebra ${T(\mathfrak{g})}$ has two interesting algebras as its quotients. The symmetric algebra ${S(\mathfrak{g})}$ (which we will encounter again later in the proof of Schur-Weyl duality) and the exterior algebra ${\Lambda (\mathfrak{g})}$ are quotients of the tensor algebra by the ideals ${I_1=a\otimes a}$ and ${I_2=a\otimes b-b\otimes a}$ respectively (for ${a,b\in T^1(\mathfrak{g})}$). It is easy to see that the symmetric and alternating algebras are graded with ${S^n=T^n/I_1}$ and ${\Lambda^n=T^n/I_2}$. It turns out that one can construct the universal enveloping algebra of ${\mathfrak{g}}$ as a quotient of ${T(\mathfrak{g})}$ as well. To include the Lie structure into this algebra, we impose the Lie bracket by modding out by the ideal generated by ${I=a\otimes b - b\otimes a - [a,b]}$ for ${a,b\in T^1(\mathfrak{g}))}$. So, the universal enveloping algebra is ${U(\mathfrak{g})=T(\mathfrak{g})/I}$. Since ${I}$ is not generated by homogeneous elements as in symmetric and alternating algebras, the quotient ${U(\mathfrak{g})}$ is filtered but not graded. The filtration is given by ${U_n(\mathfrak{g})=(T^n(\mathfrak{g})+I)/I}$. For any filtered algebra, one can construct the so-called associated graded algebra. In the case of ${U(\mathfrak{g})}$, the associated graded algebra turns out to be isomorphic to the symmetric algebra ${S(\mathfrak{g})}$. This is part of the content of the Poincaré-Birkhoff-Witt (PBW) theorem. The PBW theorem also gives a basis for ${U(\mathfrak{g})}$ in terms of an ordered basis for ${\mathfrak{g}}$. To construct the associated graded algebra of ${U(\mathfrak{g})}$, we define the grading as ${U^n=U_n/U_{n-1}}$. This can be shown to be isomorphic to the grading of the symmetric algebra. Using the PBW theorem, one can show that ${U_n}$ is generated by the elements ${g_1^{i_1}\dots g_n^{i_n}}$, where ${\sum_k i_k=n}$ and ${g_1,\dots g_n}$ is an ordered basis for ${\mathfrak{g}}$. Now, we are ready to prove Schur-Weyl duality.

3. Proof of Schur-Weyl duality

As mentioned earlier, we will prove this duality between ${U(\mathfrak{gl}(n,\mathbb{C}))}$ and ${S_n}$. But this can be extended to the one between ${\text{GL}(n,\mathbb{C})}$ and ${S_n}$ as well as the one between ${\text{U}(n,\mathbb{C})}$ and ${S_n}$.

Proof: Denote the image of ${\mathbb{C}[S_n]}$ in ${\text{End}_{\mathbb{C}}(V^{\otimes n})}$ by ${A}$, the image of ${U(\mathfrak{gl}(V))}$ by ${B}$ and ${V^{\otimes n}}$ by ${W}$. Now, observe that from Maschke’s theorem, it follows that ${A}$ is semisimple. This means that one can decompose ${W}$ into irreducible invariant spaces of ${S_n}$ in the following way.

$\displaystyle W=\bigoplus_{i}(M_i\otimes V_i)\,, \ \ \ \ \ (2)$

where ${V_i}$ are (spaces corresponding to) irreducible representations of ${S_n}$ and ${M_i=\text{Hom}_A(V_i,W)}$, i.e., the set of maps embedding ${V_i}$ into ${W}$ or the multiplicity spaces. Applying Schur’s lemma, we see that one can identify ${\text{End}_AW}$ with the direct sum ${\oplus_i \text{End}(M_i)}$. In particular, this means that ${\text{End}_AW}$ is semisimple.

The next step is to observe that since we are dealing with the commutator of the symmetric group, we have that ${\text{End}_AW}$ is the symmetric space ${S^n(\text{End}(V))}$. Now, let us look at the image of the universal enveloping algebra ${U(\mathfrak{gl}(V))}$ on ${W}$. The image is generated by elements of the form

$\displaystyle u=g\otimes \dots\otimes I + \dots +I\otimes \dots\otimes g\,, \ \ \ \ \ (3)$

where ${g}$ lies in ${\mathfrak{gl}(V)}$. It is clear that this element ${u}$ lies in ${U_n}$ and that all elements that lie in ${U_{n-1}}$ are not in this image. If we show that the kernel does not contain any more elements of ${U_n}$, then we have that the image is ${U_n/U_{n-1}}$, which is isomorphic to ${S^n(\mathfrak{gl}(V))}$ by the PBW theorem and we would be done. In order to show that there are no more elements in the kernel, we can examine the elements in ${U_n/U_{n-1}}$. These are generated by elements of the type ${g_1^{i_1}\dots g_k^{i_k}}$, where ${\sum_j i_j =n}$. It can be seen that these are not in the kernel. $\Box$

4. Irreducible representations of Lie algebras

For any simple Lie algebra ${\mathfrak{g}}$, there exist a set of operators ${H_i}$ in the algebra such that ${[H_i,H_j]=0}$ (this is sometimes confusingly stated as commutation of ${H_i}$ and ${H_j}$. The bracket here is multiplication in the Lie algebra and not commutation. But if one considers a matrix representation of ${H_i}$, then one could think of it as commutation. Perhaps, a better way is to think of the ${H_i}$ is as commuting operators in the adjoint action as we will see below). The maximal such set is called the Cartan subalgebra ${\mathfrak{h}}$, whose dimension ${r}$ is called the rank of the Lie algebra. This plays a central role in the structure of a Lie algebra. Since ${\mathfrak{g}}$ is a vector space, there also exists an action of the algebra on itself via the Lie bracket. This action is called the adjoint representation and can be written as ${ad_X|Y\rangle=|[X,Y]\rangle}$, for any ${X,Y}$ in the Lie algebra thought of as states. Now, in terms of the adjoint representation, we can see that the ${ad_{H_i}}$ commute with each other (we need to use the Jacobi identity to see this). This means that they are simultaneously diagonalizable. Moreover, it can be seen that all the ${|H_i\rangle}$ are in the null space of each of the ${ad_{H_j}}$. If we call the eigenvectors of the rest of the eigenspaces ${|E^\alpha\rangle}$ (where ${\alpha}$ ranges over some set), we have that ${ad_{H_i}|E^\alpha\rangle=\alpha^i|E^\alpha\rangle}$ for some eigenvalue ${\alpha^i}$. It turns out that these eigenspaces are one-dimensional (if they are not, then it means that we have not chosen a maximal set of commuting operators for ${\mathfrak{h}}$). It is easy to show that ${|[E^\alpha,E^\beta]\rangle}$ are eigenvectors with eigenvalues ${\alpha^i+\beta^i}$ and the ${|E^{\alpha\dag}\rangle}$ has eigenvalue ${-\alpha^i}$. If we define ${H^\alpha}$ as ${\sum_i\alpha^iH_i}$, then ${E^\alpha, E^{-\alpha}}$ and ${H^\alpha}$ form an ${\mathfrak{sl}_2(\mathbb{C})}$ subalgebra inside this Lie algebra. This is useful to determine the structure of finite dimensional representations of ${\mathfrak{g}}$. A last thing to note here before we move on to the dual space is that there exists a symmetric, bilinear form on the Lie algebra, called the Killing form, defined as ${B(X,Y)=(1/2g)\text{Tr}(ad_Xad_Y)}$ (here ${g}$ is the dual Coxeter number of ${\mathfrak{g}}$, but let’s not worry about that now).

[It is interesting that the existence of SICs or a complete set of MUBs can be related to existence of certain structures in Lie algebras. Boykin et al., show that the number of Cartan subalgebras of ${\mathfrak{sl}_n(\mathbb{C})}$ which are pairwise orthogonal with respect to the Killing form corresponds to the number of mutually unbiased bases in ${\mathbb{C}^n}$ (if the Cartan subalgebra is closed under Hermitian conjugation). Appleby et al., showed that the existence problem of SIC-POVMs can be related to the existence of a particular basis of ${\mathfrak{gl}_n(\mathbb{C})}$.]

One can construct the dual space ${\mathfrak{h}^\ast}$ of ${\mathfrak{h}}$ as the space of linear superpositions of maps ${\alpha}$, where ${\alpha(H_i)=\alpha^i}$. This dual space is called the root space and the ${\alpha}$ are called roots. It turns out that the root space can be split into positive roots and negative roots of equal cardinality. One way to do this is to simply pick an arbitrary ordered basis for the root space and call positive roots those whose first non-zero coordinate is positive. For an arbitrary finite dimensional representation ${\pi}$ of ${\mathfrak{g}}$, one can, as before, simultaneously diagonalize ${\pi(H_i)}$ and consider the eigenvalues of the common eigenspaces. This gives us elements ${\lambda}$ of the dual space, called weights, such that ${\lambda(H_i)=\lambda^i}$, where ${\lambda^i}$ is the eigenvalue. Unlike in the adjoint representation, the common eigenspaces of the ${H_i}$ in an arbitrary finite dimensional representation need not be one-dimensional. In other words, the weights can have multiplicity. However, if ${\pi}$ is irreducible (like the adjoint representation), the highest and the lowest weights are unique. Finally, one can define an inner product on the root space by using the Killing form i.e., ${(\alpha,\beta)=B(H^\alpha,H^\beta)}$. This is clear when ${\alpha}$ and ${\beta}$ are roots and by linearity, one can extend this to all of ${\mathfrak{h}^\ast}$.

Let us now discuss how to describe the weights of a given finite dimensional irreducible representation. Using the definition of positive roots, one can define a special class of roots called simple roots as those positive roots that cannot be written as a sum of positive roots. It turns out that simple roots span the root space and essentially encode the structure of the Lie algebra. However, they are not always a convenient basis. Often one uses a different basis called the basis of fundamental weights. First, define the so called coroots as ${\check{\alpha}_i=(2\alpha_i)/|\alpha_i|^2}$ for each simple root ${\alpha_i}$ (not to be confused with ${\alpha^i}$, which is the eigenvalue ${\alpha(H_i)}$). Now, fundamental weights ${\omega_i}$ are defined as the duals of coroots i.e., ${(\omega_i,\check{\alpha}_j)=\delta_{i,j}}$. In the basis of fundamental weights, the components of any weight are called Dynkin labels and there is a correspondence between the Dynkin labels of the highest weight of a finite dimensional irreducible representation and Young diagrams (which are partitions of some integer). Dynkin labels of highest weights are positive integers ${(\lambda_1,\dots ,\lambda_r)}$ and one can create a partition ${(l_1,\dots,l_r)}$, where ${l_i=\lambda_i+\dots+\lambda_r}$. As you can see, the partitions need to be of the same integer. For weights that are not the highest weight, the Dynkin labels can be negative and need not correspond to valid partitions. Corresponding to each weight, there are weight vectors and all the weight vectors span the space of the irreducible representation. For a given weight ${w}$, we label the weight vector as ${|w,i\rangle}$, where ${i}$ label any multiplicity it may have.

Now, we are ready to discuss the Clebsch-Gordan decomposition of two irreducible representations. When we take the tensor product of two irreducible representations, we have a representation of the group by what is called the diagonal action i.e., if ${\lambda}$ and ${\rho}$ are two representations, then ${\lambda(g)\otimes\rho(g)}$ is the diagonal action. This is in general a reducible representation and its decomposition into irreducibles is called the Clebsch-Gordan (CG) decomposition. For the unitary group, there is a rule that tells us which irreducible representations appear in the decomposition. If we use Young diagrams to label the irreps, then the Littlewood-Richardson rule gives the different irreps ${\nu}$ and the multiplicity ${c^\nu_{\lambda\mu}}$ in the decomposition. (In using Young diagrams, one can remove any column of ${N}$ rows.) The weights occurring in the decomposition are linear combinations of the form

$\displaystyle |v\rangle_{\lambda\otimes\mu}=\sum_{i,j,w,w^\prime}c(i,j,w,w^\prime) |w,i\rangle\otimes|w^\prime,j\rangle\,, \ \ \ \ \ (4)$

where ${i,j}$ label the multiplicity of the weight and the coefficient ${c}$ is non-zero only if ${w+w^\prime=v}$. There are systematic ways to calculate the CG coefficients for the unitary group using the Wigner-Eckart theorem and reduced Wigner coefficients.

[The story of Littlewood-Richardson (LR) rule is quite interesting. In their paper, where Littlewood and Richardson introduce this famous rule, they have a proof for a special case (when one of the partitions has only two rows). However, it is stated as a theorem in general but with a note saying something like – no simple proof has been found for the general case. It seems unlikely that they had a proof (simple or otherwise), given that the first correct proof was found about 40 years later. Shortly after the LR paper, Robinson published a claimed proof. Even though it had several gaps, which were discovered after a long time, his technique eventually led to a correct proof. His method was rediscovered by Schensted and explored further by Knuth and has become the famous RSK correspondence. It appears that the first complete proof of the LR rule was given by Schützenberger using his jeu de taquin. Other correct proofs individually by Thomas, Macdonald and Zelevinsky followed. This interesting story is recounted at the end of this paper.]

5. Applications

We are primarily interested in applications to quantum information theory. Here we will look at three applications.

5.1. Schur transform

The space ${W=V^{\otimes n}}$ has the basis ${|i_1,\dots i_n\rangle}$, where ${i_j\in\{1,\dots d\}}$ (here ${V}$ has the basis ${|1\rangle,\dots |d\rangle}$). The action of the unitary and symmetric groups can be block diagonalized and this basis can be labeled ${|\lambda,a,b\rangle}$, where ${\lambda}$ labels an irreducible representation of ${S_n}$ (or ${U(d)}$) and ${a}$ and ${b}$ are states in the irreducible spaces of ${S_n}$ and ${U(d)}$ respectively. The Schur transform, then, is the unitary matrix that rotates the computational basis to the block diagonal basis. In this paper, Bacon, Chuang and Harrow construct an efficient quantum circuit for the Schur transform. In order to do this, they first construct the Clebsch-Gordan (CG) transform for the unitary group and use it to decompose the ${U^{\otimes n}}$ action. To see how the CG transform can be used to construct the Schur transform, suppose that we have decomposed into ${U^{\otimes k}}$ into irreducible spaces and want to go to the action of ${U^{\otimes (k+1)}}$, then given any irreducible representation label ${\lambda^{(k)}}$ (at the ${k^{\text{th}}}$ level) and a state ${|j \rangle}$ in that irreducible space, we can use the CG transform to decompose the tensor product ${|\lambda^{(k)},j\rangle\otimes |i\rangle}$, where ${|i\rangle}$ stands for a state in the fundamental representation (i.e., the ${k+1^{\text{th}}}$ tensor copy). The CG transform gives us ${|\lambda^{(k)}\rangle\sum_\mu |\mu^{(k+1)},m_\mu\rangle}$. Here the sum is over all unitary group irreducible representations ${\mu}$ that appear in the CG decomposition and ${|m_\mu\rangle}$ is a state in the irreducible space ${\mu}$. It turns out that keeping track of the labels for each ${k}$ automatically gives us a basis of the irreducible space of the symmetric group.

A dual version of the Schur transform

A natural question is whether there is a way to approach the Schur transform using the symmetric group, rather than the unitary group. In order to do this, let us look at the representations in terms of the symmetric group. Recall that as the computational basis for ${W}$, we picked a basis for each ${V}$ as the set ${\{|1\rangle,\dots,|d\rangle\}}$ and using this we construct a basis for ${V^{\otimes n}}$. Given any basis vector of ${V^{\otimes n}}$, define its type ${T}$ as the ordered set ${(i_1,\dots i_d)}$, where ${i_j}$ is the number of times ${|j\rangle}$ appears in the tensor product. Clearly, the action of the symmetric group permutes all the vectors of a given type in a transitive fashion. This means that the subspace ${V_T}$ spanned by these vectors of type ${T}$ is an induced representation of ${S_n}$ — induced from the trivial representation of the Young subgroup ${S_T=S_{i_1}\times\dots\times S_{i_d}}$ (note that ${S_T}$ can be isomorphic for two different types). If we can block diagonalize these induced representations, we can use it to construct the Schur transform. Let us look at this in more detail. First, we want to convert the basis from ${|j_1,\dots,j_n\rangle}$ to ${|T,t\rangle}$, where ${T}$ is the type of the vector and ${t}$ is the transversal of ${S_T}$ in ${S_n}$. This transform can be done efficiently since the type and the transversal can be efficiently computed from the basis vector and vice versa. Now, we need to block diagonalize each of these induced representations. Fortunately, we can use the algorithm for the Fourier transform over ${S_n}$ by Beals (see here). There is still a problem since the Fourier transform decomposes the regular representation of the symmetric group, which is the induced representation from the trivial subgroup, not induced from ${S_T}$. However, by using the QFTs over ${S_n}$ and ${S_T}$, we can construct the transform for induced representations.

We describe here an algorithm to block diagonalize the induced representation from a subgroup ${H}$ to ${G}$ of an irreducible representation ${\rho}$ of ${H}$, if one can perform a QFT over ${G}$ and ${H}$ (and a couple of other conditions, which we explain below). Take a state in the induced representation i.e., a state of the form ${|t,v\rangle}$ where ${t}$ is an element of the transversal and ${|v\rangle}$ is a state in the representation ${\rho}$. Append it with an ancilla of dimension ${|H|/d_\rho}$, where ${d_\rho}$ is the dimension of ${\rho}$. Now apply the inverse QFT over ${H}$ to rotate this to the computational basis of ${H}$. The register now contains states of the form ${|t,h\rangle}$, where ${h}$ is an element of ${H}$. The first of the two conditions mentioned above is that we need the QFT over ${G}$ to rotate from this basis (which can be thought of as the computational basis) to the block diagonal basis. Next we apply the QFT over ${G}$ to get a basis ${|\lambda,m,i\rangle}$, where ${\lambda}$ is an irreducible representation of ${G}$ and ${m}$ indexes its multiplicity and ${i}$ is a state in the irreducible space. Now, we need the second condition. We need to be able to re-index the irrep labels so that the label register is of the form ${|0\rangle\otimes |\lambda\rangle}$. The first register is, of course, of dimension ${|H|/d_\rho}$. Now, we can return the ancilla register and we have our transform. Let us check that the two conditions are satisfied in our case i.e., when ${G=S_n}$ and ${H=S_T}$. In Beals’ algorithm for the QFT over ${S_n}$, the computational basis is assumed to be a word in terms of the generating set of ${S_n}$ consisting of adjacent transpositions i.e., ${\{(1,2),\dots, (n-1,n)\}}$. So, writing our ${(t,h)}$ pair as a sequence using this set should be easy. The second condition is also satisfied since the irreducible representations of ${S_n}$ that appear in the induced representations from ${S_T}$ to ${S_n}$ are those that are greater than the partition corresponding to ${T}$ (here greater is with respect to the dominance order of partitions). In fact, the multiplicities in that induced representation is given by the so called Kostka number. Therefore, if we use a labeling of partitions that respects dominance order, then we automatically satisfy the second condition. (We don’t really have to worry about the second condition since it would just lead to a slightly inefficient encoding.)

Now, the Schur transform is the following sequence of transformations. First, take the basis from ${|j_1,\dots,j_n\rangle}$ to ${|T,t\rangle}$ (as discussed above). Then to ${|T,\lambda,m,i\rangle}$, where ${\lambda}$ is a partition labeling an irreducible representation of ${S_n}$, ${m}$ labels the multiplicity space and ${i}$ the representation space. We have our transform by rearranging the registers to ${|\lambda,i,(T,m)\rangle}$, where ${|T,m\rangle}$ is an orthonormal basis for the irreducible space of the unitary group.

5.2. Quantum de Finetti theorem

A version of the quantum de Finetti theorem states that given a joint density operator on ${n}$ systems with permutational symmetry, the partial trace over ${n-k}$ systems can be approximated as a convex sum of product states. Quantum de Finetti theorems have been applied to several areas of quantum information science such as security proof of quantum key distribution schemes against coherent attacks. Here, we discuss a proof of the quantum de Finetti theorem, which uses representation theory of Lie algebras and Schur-Weyl duality. This is from this paper of Koenig and Mitchison.

Suppose that we have a tensor product of two irreducible representations ${\lambda\otimes\mu}$ of a Lie algebra and we are interested in states in an irreducible representation ${\nu}$ inside the tensor product. Here ${\lambda,\mu}$ and ${\nu}$ are the highest weights. We are specifically interested in the partial trace over the ${\mu}$ system of any state in ${\nu}$ and how we can write this as a convex sum over certain states in ${\lambda}$. Suppose that we have a state ${|\Psi\rangle}$ in ${\nu}$, which is a weight vector. Since ${\nu}$ appears in the decomposition of ${\lambda\otimes\mu}$, the weight vectors of ${\nu}$ can be written as linear combinations of the type

$\displaystyle |\Psi\rangle=\sum_{w,w^\prime} c(w,w^\prime)|w\rangle\otimes|w^\prime\rangle\,, \ \ \ \ \ (5)$

where the sum of the weights of ${w}$ and ${w^\prime}$ is the weight of ${|\Psi\rangle}$. Now, consider the partial trace

$\displaystyle \text{Tr}_\mu(|\Psi\rangle\langle\Psi|)=d_\mu\int\text{Tr}_\mu\left[(P_\lambda\otimes g|\mu\rangle\langle\mu|g^\dagger)(|\Psi\rangle\langle\Psi|)\right]dg\,, \ \ \ \ \ (6)$

where the integration is with respect to the Haar measure. The above equation follows from Schur’s lemma on the ${\mu}$ system. Notice that the integrand is a pure state i.e.,

$\displaystyle \text{Tr}_\mu\left[(P_\lambda\otimes g|\mu\rangle\langle\mu|g^\dagger)(|\Psi\rangle\langle\Psi|)\right]dg=d_\mu|\chi_g\rangle\langle\chi_g|dm(g)\,. \ \ \ \ \ (7)$

This means that the partial trace is

$\displaystyle \text{Tr}_\mu(|\Psi\rangle\langle\Psi|)=d_\mu\int|\chi_g\rangle\langle\chi_g|dm(g)\,, \ \ \ \ \ (8)$

where ${m}$ is the induced probability measure.

Now consider the set ${\Lambda=\{w|w+\mu\in W\}}$, where ${W}$ is the set of weights occurring in ${\nu}$ and let ${P}$ be the projector onto the space spanned by these weight vectors. We have that

$\displaystyle |\chi_g\rangle\langle\chi_g|= \text{Tr}_\mu\left[(P_\lambda \otimes |\mu\rangle\langle\mu|)(I\otimes g^\dagger)|\Psi\rangle\langle\Psi|(I\otimes g))\right]\,. \ \ \ \ \ (9)$

But since ${\mu}$ is the highest weight, we can replace ${P_\lambda}$ by ${P}$ i.e., ${P_\lambda\otimes |\mu\rangle\langle\mu|=P\otimes |\mu\rangle\langle\mu|}$. This means that we can “approximate” the partial trace of ${\Psi}$ exactly with states from ${\Lambda}$. In general, suppose we have another set of states (instead of the ones in ${\Lambda}$) and we start from another state ${|\phi\rangle}$ (instead of ${|\mu\rangle}$). Let the projector onto this new set of states be ${P^\prime}$, then we can define the states ${|\chi^\prime_g\rangle=P^\prime_g|\chi_g\rangle}$ (normalized appropriately), where ${P^\prime_g=gP^\prime g^\dagger}$. The integral ${\int|\chi^\prime_g\rangle\langle\chi_g^\prime|dm(g)}$ can be used to approximate the partial trace ${\text{Tr}_\mu(|\Psi\rangle\langle\Psi|)}$ and the error (i.e., the trace norm of the difference) in approximation is upper bounded by ${2\sqrt{1-\delta}}$. The error is first related (via the gentle measurement lemma and triangle inequality) to

$\displaystyle \delta(|\phi\rangle)=\int\langle\chi_g |P^\prime |\chi_g\rangle dm(g)\,. \ \ \ \ \ (10)$

By plugging back the expression for ${|\chi_g\rangle}$ and applying Schur’s lemma, this can be calculated to be

$\displaystyle \delta(|\phi\rangle)=\frac{d_\mu}{d_\lambda}\text{Tr}\left[P_\nu(P^\prime\otimes |\mu\rangle\langle\mu|)\right]\,. \ \ \ \ \ (11)$

Defining ${\delta}$ as the supremum of ${\delta(|\phi\rangle)}$ over all states ${|\phi\rangle}$, we get the upper bound. This expression for the error gives us the exponential de Finetti theorem. For this, we take the representation with highest weight ${\lambda}$ to be ${\text{Sym}^k(\mathbb{C}^d)}$, ${\mu}$ to be ${\text{Sym}^{n-k}(\mathbb{C}^d)}$ and ${\nu}$ to be ${\text{Sym}^n(\mathbb{C}^d)}$.

5.3. Spectrum of a density operator

The spectrum of many copies of a density operator turns out to be related to Schur-Weyl duality. There are many consequences of this and several related problems. But here, we discuss only one aspect of it. It was originally proved by Keyl and Werner. Hayashi and Matsumoto gave a proof using Schur-Weyl duality. The following is from Christandl and Mitchison.

If ${\rho}$ is a density operator (i.e., a positive self-adjoint operator with unit trace) in a Hilbert space ${V}$ of dimension ${d}$, then the spectrum of ${\rho^{\otimes n}}$ is closely related to Schur-Weyl duality. Since the operator ${\rho^{\otimes n}}$ commutes with permutations of tensor copies, it “lives” in the irreducible spaces of the unitary group. It turns out that as ${n}$ increases it mostly lives in only one particular space. That irreducible representation ${\lambda}$ (or rather its normalized version) tends to the spectrum of ${\rho}$. If ${\lambda}$ is a partition of ${n}$, then the normalized version of ${\lambda}$ is ${(\lambda_1/n,\dots,\lambda_d/n)}$. To state this result more precisely,

Theorem 3 Suppose the spectrum of ${\rho}$ is ${r}$ and ${P_\lambda}$ is the projection onto the isotypic space corresponding to the Young frame ${\lambda}$, then

$\displaystyle \text{Tr}(P_\lambda\rho^{\otimes n})\leq (n+1)^{d(d-1)/2}\exp(-nD(\bar\lambda||r))\,, \ \ \ \ \ (12)$

where ${D}$ is the usual Kullback-Leibler distance between probability distributions.

Proof: First pick the eigenbasis ${|v_i\rangle}$ of ${\rho}$ as the basis of each tensor copy of ${V}$. As basis vectors of ${V^{\otimes n}}$, we have ${|v_{i_1}\dots,v_{i_n}\rangle}$. Each such basis vector corresponds to a partition ${\mu}$ by counting the number of times a certain ${v_k}$ occurs in it. Now, consider the projector onto the isotypic space of some ${\lambda}$ in Schur-Weyl duality. This can be given by picking a standard Young tableau ${T}$ of shape ${\lambda}$.

$\displaystyle P_T=(\sum_{\pi\in C(T)}\text{sign}(\pi) \pi)(\sum_{\pi\in R(T)}\pi)\,, \ \ \ \ \ (13)$

where ${C(T)}$ and ${R(T)}$ are subsets of permutations of ${S_n}$ which permute the integers in each row and column respectively. It can be seen that this projector has non-zero overlap with only those basis vectors whose corresponding partition ${\mu}$ is dominated by ${\lambda}$ (in the dominance order mentioned earlier). If not, then there would be two boxes in the same column which have the same integer and the projector would be zero. Therefore, we have that if ${\rho}$ is dominated by ${\bar\lambda}$, then Tr${(P_\lambda\rho^{\otimes n})}$ is non-zero. This is easy to bound by using standard bounds on the dimensions of these irreducible representations. $\Box$

5.4. Permanents, determinants and immanants

The last thing I want to mention is a way to describe immanants using Schur-Weyl duality. The permanent of an ${n\times n}$ matrix ${A}$ is defined as

$\displaystyle \text{Perm}(A)=\sum_{\sigma\in S_n}\prod_{i}A_{i\sigma(i)}\,. \ \ \ \ \ (14)$

In order to relate this to Schur-Weyl duality, let us rewrite in terms of an ${n}$ fold tensor power.

$\displaystyle \text{Perm}(A)=\sum_{\sigma\in S_n}\langle 1,\dots,n |A^{\otimes n}|\sigma(1),\dots,\sigma(n)\rangle\,. \ \ \ \ \ (15)$

Now recall that the projector onto the isotypic space of the trivial representation of the symmetric group is given by

$\displaystyle \Pi_{\text{tr}}=\frac{1}{n!}\sum_{\sigma\in S_n}\sigma\,, \ \ \ \ \ (16)$

where ${\sigma}$ is the representation of the symmetric group on ${V^{\otimes n}}$ (i.e., the action that permutes the tensor factors). Using this, we have

$\displaystyle \text{Perm}(A)=n!\langle 1,\dots,n|A^{\otimes n}\Pi_{\text{tr}}|1,\dots,n\rangle\,. \ \ \ \ \ (17)$

We can generalize this to other isotypic spaces of the symmetric group to get the determinant and the immanents.

$\displaystyle \text{Det}(A)=n!\langle 1,\dots,n|A^{\otimes n}\Pi_{\text{alt}}|1,\dots,n\rangle\,, \ \ \ \ \ (18)$

where ${\Pi_{\text{alt}}}$ is the projector into the isotypic space of the alternating irrep of ${S_n}$. For the immanants, we have

$\displaystyle \text{Imm}_\lambda(A)=\frac{n!}{d_\lambda}\langle 1,\dots,n|A^{\otimes n}\Pi_\lambda |1,\dots,n\rangle\,, \ \ \ \ \ (19)$

where ${\Pi_\lambda}$ is the projector onto the irrep space of ${\lambda}$ and ${d_\lambda}$ is the dimension of the symmetric group irrep ${\lambda}$.

Let us now work only with immanants (since they are more general). The last equation can be written as

$\displaystyle \text{Imm}_\lambda(A)=\frac{n!}{d_\lambda}\text{Tr}(A^{\otimes n}\Pi_\lambda |1,\dots,n\rangle\langle 1,\dots,n|)\,. \ \ \ \ \ (20)$

First, we can rewrite this by as

$\displaystyle \text{Imm}_\lambda(A)=\frac{1}{d_\lambda}\text{Tr}(A^{\otimes n}\Pi_\lambda \Pi_{1^n})\,, \ \ \ \ \ (21)$

where ${\Pi_{1^n}}$ is the permutation module corresponding to the partition ${1^n}$. It is the induced representation from the trivial representation of the Young subgroup corresponding to the partition ${1^n}$. To obtain the above equation, we have used the fact that

$\displaystyle \sum_\sigma \sigma|1,\dots,n\rangle\langle 1,\dots,n|\sigma^{-1}=\Pi_{1^n}\,. \ \ \ \ \ (22)$

Now we can also block diagonalize ${A^{\otimes n}}$ to obtain

$\displaystyle \text{Imm}_\lambda(A)=\frac{1}{d_\lambda}\text{Tr}((I_\lambda\otimes A_\lambda)\Pi_\lambda \Pi_{1^n})\,, \ \ \ \ \ (23)$

where ${I_\lambda}$ is the identity on the symmetric group space and ${A_\lambda}$ is the part in the unitary group irreducible space. This can finally be written as

$\displaystyle \text{Imm}_\lambda(A)=\text{Tr}(A_\lambda\Pi_{1^n,\lambda})\,, \ \ \ \ \ (24)$

where ${\Pi_{1^n,\lambda}}$ is the projection on to the multiplicity space of the ${S_n}$ irrep ${\lambda}$ in the permutation module ${\Pi_{1^n}}$. This space has dimension equal to the Kostka number ${K_{1^n,\lambda}}$.

If we replace the permutation module ${\Pi_{1^n}}$ by another one, then we get immanants of replicated matrices. First, define, for any type ${T=(i_1,\dots,i_n)}$ (where ${\sum_k i_k=n}$) the matrix ${A_T}$ obtained by repeating the first row and column ${i_1}$ times, repeating the second row and column ${i_2}$ times etc. Then the immanant of this matrix is

$\displaystyle \text{Imm}_\lambda(A_T)=\text{Tr}(A_\lambda\Pi_{T,\lambda})\,, \ \ \ \ \ (25)$

where ${\Pi_{T,\lambda}}$ is the projection onto the isotypic space of ${\lambda}$ in the permutation module corresponding to the type ${T}$. To see this, note that the immanant of this replicated matrix is

$\displaystyle \text{Imm}_\lambda(A_T)=\frac{n!}{d_\lambda}\langle i_1,\dots,i_n |A^{\otimes n}\Pi_T|i_1,\dots,i_n\rangle\,. \ \ \ \ \ (26)$

The rest of it follows in the same way as before.