\section{Brief primer on tensors}\label{sec:tensor}
The material here is based on \cite{lee-book-2000,dullemond-1991-tensor}.
Let \(V\) be an \(n\)-dimensional vector space over the reals. A {\em covector} on \(V\) is a real-valued linear functional \(\omega:V\rightarrow \Re\). The space of all covectors is itself a real vector space under pointwise addition and scalar multiplication. It is written as \(V^{\star}\) and called the {\em dual space} to \(V\).
\begin{proposition}[\cite{lee-book-2000} Proposition 4.1] For \(V\) an \(n\)-dimensional vector space and \(E_1, \ldots, E_n\) a basis for \(V\), the covectors \(\epsilon^1, \ldots, \epsilon^n\), defined by:
\[
\epsilon^i(E_j) = \delta^i_j = \left\{
\begin{array}{ll}
1 & \mbox{if \(i =j\)} \\
0 & \mbox{if \(i\not=j\)}
\end{array}\right.
\]
\noindent form a basis for \(V^{\star}\), called the {\em dual basis} to \((E_i)\).
\end{proposition}
%% TODO: Define canonical isomorphism.
Note that it follows that the dimensionality of \(V^{\star}\) is the same as that of \(V\). It is also the case that \(V^{\star\star}\) is canonically isomorphic to \(V\).
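As an illustrative aside, the dual basis can be computed concretely: if the basis vectors \(E_j\) are taken as the columns of an invertible matrix \(B\), then the covector \(\epsilon^i\) is represented by the \(i\)th row of \(B^{-1}\), since \((B^{-1}B)^i_j=\delta^i_j\). The following NumPy sketch (ours, not from the cited sources) checks this for a basis of \(\Re^2\):
\begin{verbatim}
import numpy as np

# Columns of B are the basis vectors E_1, E_2 of R^2.
B = np.array([[1.0, 1.0],
              [0.0, 2.0]])

# The dual basis covectors eps^1, eps^2 are the rows of B^{-1}.
eps = np.linalg.inv(B)

# Check eps^i(E_j) = delta^i_j.
print(np.allclose(eps @ B, np.eye(2)))  # True
print(eps[0] @ B[:, 1])                 # eps^1(E_2) = 0.0
\end{verbatim}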
\begin{definition}[Tensor] For $V$ a finite-dimensional real vector space, a {\em tensor} of rank \((k,l)\) over $V$ is a real-valued multilinear function of $k$ elements of \(V^{\star}\) and \(l\) elements of \(V\). \(k\) is said to be its {\em contravariant} rank, and \(l\) its {\em covariant} rank. The {\em rank} of the tensor is \(k+l\).
\end{definition}
A matrix \(M\) is a \((1,1)\)-tensor, a vector \(x\) a \((1,0)\)-tensor, and a covector \(a\) a \((0,1)\)-tensor.
Let \(\{{\bf e}_j\}\) be a basis for \(V\), with canonical cobasis \(\{\epsilon^i\}\) for \(V^{\star}\).
Such a tensor \(T\) can be associated with an array with \(k+l\) dimensions, whose \((i_1, \ldots, i_k, j_1, \ldots, j_l)\)'th element, written \(T^{i_1\ldots i_k}_{j_1\ldots j_l}\), is given by \(T(\epsilon^{i_1}, \ldots, \epsilon^{i_k}, {\bf e}_{j_1}, \ldots, {\bf e}_{j_l})\).
The meaning of the indices of a tensor is usually decided beforehand; it is conventional to write a tensor with its contravariant (upper) indices first and its covariant (lower) indices last, e.g.{} \(T^{i,j}_k\). For instance, a matrix \(M\) is a \((1,1)\)-tensor, and we can write \(M_i^j\) to make its indices explicit. This is the same as \(M_j^k\); the particular letters chosen for the indices do not matter. Also, basis vectors will be written with a lower index, and components of a vector with respect to this basis with an upper index; similarly, basis covectors will be written with an upper index and components of a covector with a lower index.
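To make the correspondence between a matrix and a \((1,1)\)-tensor concrete, \(M\) can be viewed as the bilinear map \((\omega,v)\mapsto\omega(Mv)\), and its components \(M^i_j=M(\epsilon^i,{\bf e}_j)\) recovered by evaluating it on the standard basis and its dual basis. A minimal NumPy sketch (the helper name is ours):
\begin{verbatim}
import numpy as np

M = np.array([[2.0, 0.0],
              [1.0, 3.0]])

# M as a (1,1)-tensor: it takes one covector w and one vector v.
def M_as_tensor(w, v):
    return w @ (M @ v)

e = np.eye(2)    # e[:, j] is the basis vector e_j
eps = np.eye(2)  # eps[i] is the dual basis covector eps^i

# Components M^i_j = M(eps^i, e_j) are exactly the matrix entries.
components = np.array([[M_as_tensor(eps[i], e[:, j]) for j in range(2)]
                       for i in range(2)])
print(np.allclose(components, M))  # True
\end{verbatim}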
Denote the set of all \((k,l)\)-tensors over \(V\) by \(T^k_l(V)\).
\(T^k_l(V)\) is a vector space under pointwise addition and scalar multiplication. Specifically:
\[
\begin{array}{l}
(\alpha T)(X_1, \ldots, X_k, Y_1, \ldots, Y_l) = \alpha(T(X_1, \ldots, X_k, Y_1, \ldots, Y_l))\\
(S+T)(X_1, \ldots, X_k, Y_1, \ldots, Y_l)=S(X_1, \ldots, X_k, Y_1, \ldots, Y_l) + T(X_1, \ldots, X_k, Y_1, \ldots, Y_l)
\end{array}
\]
\subsection{Tensor Contraction}\label{sec:summation-convention}
In the following we shall adopt the {\em Einstein summation convention}:
\begin{quotation}
If the same index name appears twice in any term, once as an upper index and once as a lower index, that term is understood to be summed over all possible values of that index, generally from \(1\) to the dimension of the space in question. If the indices belong to different tensors, their tensor product is taken first.
\end{quotation}
Thus for instance we can rewrite:
\[
\begin{array}{lcl}
\Sigma_{\nu=1}^n A_{\mu\nu}v^{\nu} & \rightarrow & A_{\mu\nu}v^{\nu}\\
\Sigma_{\beta=1}^n \Sigma_{\gamma=1}^n A_{\alpha \beta}B^{\beta \gamma}C_{\gamma \delta} & \rightarrow & A_{\alpha \beta}B^{\beta \gamma}C_{\gamma \delta}
\end{array}
\]
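NumPy's \texttt{einsum} implements essentially this convention (it does not distinguish upper from lower indices, but repeated labels are summed in the same way); the two rewrites above can be sketched as follows:
\begin{verbatim}
import numpy as np

n = 3
A, B, C = (np.random.rand(n, n) for _ in range(3))
v = np.random.rand(n)

# A_{mu nu} v^{nu}: the repeated index nu is summed.
print(np.allclose(np.einsum('mn,n->m', A, v), A @ v))               # True

# A_{alpha beta} B^{beta gamma} C_{gamma delta}: beta, gamma summed.
print(np.allclose(np.einsum('ab,bc,cd->ad', A, B, C), A @ B @ C))   # True
\end{verbatim}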
The {\em contraction} of a tensor is obtained by setting one upper and one lower index equal, thus indicating a summation per the convention above. The result of contracting a \((k,l)\)-tensor is a \((k-1,l-1)\) tensor.
The contraction operation is invariant under coordinate changes.
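For example, contracting the single upper and lower index of a \((1,1)\)-tensor leaves a \((0,0)\)-tensor, i.e.\ a scalar: the familiar trace. A small illustrative sketch:
\begin{verbatim}
import numpy as np

M = np.random.rand(4, 4)       # a (1,1)-tensor
S = np.random.rand(4, 4, 4)    # e.g. a (1,2)-tensor

# Setting the upper and lower index of M equal and summing gives the trace.
print(np.isclose(np.einsum('ii->', M), np.trace(M)))  # True

# Contracting the upper index of S against its first lower index
# leaves a (0,1)-tensor, i.e. a covector with 4 components.
print(np.einsum('iij->j', S).shape)                    # (4,)
\end{verbatim}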
\subsection{Working with tensors}
Tensors can be ``partially'' evaluated. For \(T\) a tensor of rank \(k\) representing a predicate \(p\), and \(a\) a vector representing the value of argument \(i\), \(T_{1\ldots k}a^i\) represents the predicate \(\lambda x_1, \ldots, x_{i-1}, x_{i+1}, \ldots, x_k.\, p(x_1, \ldots, x_{i-1}, a, x_{i+1}, \ldots, x_k)\). It can be thought of as {\em contracting} \(T\) and \(a\) on index \(i\).
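In code, such partial evaluation is again a contraction: summing the tensor against the argument's encoding on the chosen index leaves a tensor of one lower rank over the remaining arguments. A hedged sketch (the ternary predicate and the encodings are invented for illustration):
\begin{verbatim}
import numpy as np

# A rank-3 tensor over U = R^2 representing some ternary boolean predicate p.
T = np.random.randint(0, 2, size=(2, 2, 2)).astype(float)

# One-hot encodings of the boolean values 0 and 1.
e = {0: np.array([1.0, 0.0]), 1: np.array([0.0, 1.0])}

# Fix the second argument to the value 1: contract T with e[1] on index 2.
# The result represents lambda x, z. p(x, 1, z).
T_fixed = np.einsum('ijk,j->ik', T, e[1])
print(T_fixed.shape)  # (2, 2)
\end{verbatim}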
\begin{example}\label{ex:tc-1}
Consider the tensor representation of the boolean polynomial \(p(x,y) = x(1-y)\). Because \(x,y\in \{0,1\}\), with the two values independent of each other, we embed them in the two-dimensional real vector space \(U=\Re^2\), with ``one hot'' basis vectors \({\bf e}_1=(1,0)\) and \({\bf e}_2=(0,1)\) representing \(0\) and \(1\) respectively. Now the tensor \(T_{1,2}\) representing the binary predicate \(\lambda x,y.\, p(x,y)\) is given by the table:
\[
\begin{array}{l}
T({\bf e}_1,{\bf e}_1)=(0)(1-0)=0\\
T({\bf e}_1,{\bf e}_2)=(0)(1-1)=0\\
T({\bf e}_2,{\bf e}_1)=(1)(1-0)=1\\
T({\bf e}_2,{\bf e}_2)=(1)(0)=0
\end{array}
\]
We can use this representation to evaluate the predicate at different points through tensor contraction. For instance, \(p(0,1)\) is given, for \(a={\bf e}_1, b={\bf e}_2\), by \(T_{1,2}a^1 b^2\), which through the Einstein convention expands out to:\footnote{Recall that for a vector \(x\), \(x^j\) denotes its \(j\)th coordinate. Note that
in the term \(T_{1,2}a^1b^2\) the superscripts of \(a\) and \(b\) are indices, whereas in
\(T({\bf e}_i,{\bf e}_j)a^i b^j\) the superscripts of \(a\) and \(b\) are component selections.}
\[\begin{array}{ll}
T_{1,2}a^1 b^2 &= \Sigma_{j=1}^2 \Sigma_{i=1}^2 T({\bf e}_i,{\bf e}_j)a^i b^j \\
&= T({\bf e}_1,{\bf e}_1)(1)(0) + T({\bf e}_1,{\bf e}_2)(1)(1) +
T({\bf e}_2,{\bf e}_1)(0)(0) + T({\bf e}_2,{\bf e}_2)(0)(1) \\
& =T({\bf e}_1,{\bf e}_2)\\
& = 0
\end{array}
\]
\end{example}
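The table and the contraction in Example~\ref{ex:tc-1} can also be checked mechanically; a short NumPy sketch:
\begin{verbatim}
import numpy as np

# T[i, j] = T(e_{i+1}, e_{j+1}) for p(x, y) = x * (1 - y).
T = np.array([[0.0, 0.0],
              [1.0, 0.0]])

a = np.array([1.0, 0.0])  # e_1, encoding x = 0
b = np.array([0.0, 1.0])  # e_2, encoding y = 1

# T_{1,2} a^1 b^2 = sum_{i,j} T(e_i, e_j) a^i b^j
print(np.einsum('ij,i,j->', T, a, b))  # 0.0, i.e. p(0, 1) = 0
\end{verbatim}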
More generally, we can compute tensor contraction symbolically. Let \(x,y\) be unknown vectors in \(U\), and let \(S\) be a rank-2 tensor over \(U\). Then if \(S\) represents the predicate \(q\), the predication \(q(x,y)\) is represented by \(S_{1,2}x^1 y^2\), which expands out to:
\[\begin{array}{ll}
[S(x,y)]&\defeq S_{1,2}x^1 y^2\\
& = \Sigma_{j=1}^2 \Sigma_{i=1}^2 S({\bf e}_i,{\bf e}_j)x^iy^j \\
& =S({\bf e}_1,{\bf e}_1)x^1y^1 + S({\bf e}_1,{\bf e}_2)x^1y^2 +
S({\bf e}_2,{\bf e}_1)x^2y^1 + S({\bf e}_2,{\bf e}_2)x^2y^2 \\
\end{array}
\]
\begin{example}[Example~\ref{ex:tc-1} contd]
Taking the value of \(S\) above to be \(T\), we get:
\[\begin{array}{ll}
[T(x,y)] & = x^2y^1\\
& = [x=1][y=0]
\end{array}
\]
\end{example}
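The symbolic expansion can likewise be reproduced mechanically; a sketch using SymPy (the variable names are ours):
\begin{verbatim}
import sympy as sp

x1, x2, y1, y2 = sp.symbols('x1 x2 y1 y2')
x, y = [x1, x2], [y1, y2]

# Components T(e_i, e_j) of the tensor for p(x, y) = x * (1 - y).
T = [[0, 0],
     [1, 0]]

# [T(x, y)] = sum_{i,j} T(e_i, e_j) x^i y^j
expr = sum(T[i][j] * x[i] * y[j] for i in range(2) for j in range(2))
print(sp.simplify(expr))  # x2*y1
\end{verbatim}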
The representation of an arbitrary \(k\)-ary predicate is similar: a rank-\(k\) tensor \(T\), represented by \(n^k\) numbers, for \(U=\Re^n\).
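Concretely, the entries of such a tensor can be filled in by evaluating the predicate on every tuple of basis vectors (equivalently, of domain values); a sketch for the boolean case, with a made-up helper and example predicate:
\begin{verbatim}
import numpy as np
from itertools import product

def predicate_to_tensor(p, k, n=2):
    """Rank-k tensor with T[i1, ..., ik] = p(i1, ..., ik), domain {0, ..., n-1}."""
    T = np.zeros((n,) * k)
    for idx in product(range(n), repeat=k):
        T[idx] = p(*idx)
    return T

# Example: the ternary predicate p(x, y, z) = x * (1 - y) * z.
T3 = predicate_to_tensor(lambda x, y, z: x * (1 - y) * z, k=3)
print(T3.size)        # 2**3 = 8 numbers
print(T3[1, 0, 1])    # 1.0
\end{verbatim}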