A Gentle Introduction to Tensors (2014) [pdf]
soVeryTired 2021-08-18 10:06:15 +0000 UTC [ - ]
For machine-learning types, a tensor is a multidimensional array of numbers that admits certain operations (addition, subtraction, Hadamard product, etc.) [0]
For mathematicians, a tensor is a function F. We pass the function an ordered tuple of N vectors and M covectors (or, to keep things simple, N column vectors and M row vectors), and the function returns a scalar. The function F needs to be linear in each of the N vectors and M covectors. In this view, a matrix is a tensor with N=1, M=1. The operations used by machine learning types arise naturally once you crank the mathematical handle a little bit.
For physicists, a tensor is what mathematicians would refer to as a tensor field. So they take a space X (think R^N), and with each point of the space they associate a mathematician's tensor F as defined above. The properties of the function F are permitted to change from point to point. The Riemannian metric tensor is a classic example of 'tensor' according to this usage.
[0] https://machinelearningmastery.com/introduction-to-tensors-f...
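A minimal NumPy sketch of the N=1, M=1 case (the names F and M are just illustrative, not from the linked article): a matrix defines a function of one row vector and one column vector that is linear in each slot separately.

    import numpy as np

    M = np.array([[1.0, 2.0],
                  [3.0, 4.0]])

    def F(row, col):
        # one covector (row vector) and one vector (column vector) in, scalar out
        return row @ M @ col

    u, v, w = np.array([1.0, 0.0]), np.array([0.0, 1.0]), np.array([2.0, -1.0])
    a, b = 3.0, -2.0

    # linearity in the covector slot (and likewise in the vector slot)
    assert np.isclose(F(a * u + b * v, w), a * F(u, w) + b * F(v, w))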
ktpsns 2021-08-18 11:43:31 +0000 UTC [ - ]
For machine learning and data science, tensors are largely what is called an n-dimensional array. NumPy is a popular Python library built basically around this data type.
Mathematical tensors in a particular basis (sorry for the sloppy language, theoretical physicist here) can be displayed with such a data type, even though the n-dimensional array data type (as available in NumPy) lacks the (co)vector properties.
Funnily enough, in theoretical physics there are two groups of people: those preferring to write out complicated objects (i.e. tensors) with indices, and those who do not (the latter style is typically called abstract tensor notation in general relativity). The latter are much closer to mathematical physics and prefer to understand tensors as mappings from (co)vector spaces to scalars, similar to what you already wrote.
Most general relativity computer codes are written by the index people, not by the mathematical physicists ;-)
joppy 2021-08-18 11:57:34 +0000 UTC [ - ]
Tensor products in this generality don’t need to have an (n, m) rank in the sense you’re describing, since they might be put together from completely different spaces. For example it’s perfectly fine to form the tensor product of a 2-dimensional space with a 5-dimensional space, yielding a 10-dimensional space.
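In coordinates this is easy to see; a quick NumPy sketch (the arrays are just illustrative):

    import numpy as np

    v = np.array([1.0, 2.0])                  # lives in a 2-dimensional space
    w = np.array([1.0, 0.0, -1.0, 2.0, 3.0])  # lives in a 5-dimensional space

    pure = np.outer(v, w)          # the pure tensor v (x) w, shape (2, 5)
    print(pure.reshape(-1).shape)  # (10,) -- the tensor product is 10-dimensional

    # a general element of the tensor product is a sum of pure tensors
    # and need not factor as a single outer product
    general = pure + np.outer(np.array([0.0, 1.0]),
                              np.array([1.0, 1.0, 0.0, 0.0, 0.0]))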
soVeryTired 2021-08-18 12:06:30 +0000 UTC [ - ]
I'm not familiar with the infinite-dimensional case. Do you have an example of a tensor that isn't a multilinear function on a product of vector spaces? I'd be interested to refine my understanding here.
joppy 2021-08-18 12:20:33 +0000 UTC [ - ]
A familiar example of a tensor product of countably infinite dimensional vector spaces would be polynomials in multiple variables. Say F[x] is the vector space of polynomials in x with coefficients in the field F, and F[x,y] is the space of polynomials in the variables x, y. For example x^2 - x is an element of F[x], and yx^2 - y^2 is an element of F[x,y]. Then it’s not hard to see that as vector spaces, F[x,y] is isomorphic to (F[x] tensor F[y]). A tensor in this new space is precisely a polynomial in two variables.
In the above it’s important to note that a polynomial has finitely many terms, so a power series like 1 + x + x^2 + … is not a polynomial. The space of power series is isomorphic to the space of linear functions on F[x], and is uncountably-infinite dimensional.
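In coordinates (a NumPy sketch, truncating to finitely many monomials): store a polynomial as its coefficient vector in the monomial basis; then the pure tensor p(x) ⊗ q(y) corresponds to the product p(x)q(y), whose coefficient grid is the outer product of the two coefficient vectors.

    import numpy as np

    p = np.array([0.0, -1.0, 1.0])  # x^2 - x   (coefficients of 1, x, x^2)
    q = np.array([0.0, 1.0, 0.0])   # y         (coefficients of 1, y, y^2)

    grid = np.outer(p, q)           # grid[i, j] = coefficient of x^i * y^j
    # grid represents y*x^2 - y*x, an element of F[x, y]

    # yx^2 - y^2 from above is not a pure tensor: it is the sum
    # x^2 (x) y  +  1 (x) (-y^2)
    yx2_minus_y2 = (np.outer(np.array([0.0, 0.0, 1.0]), np.array([0.0, 1.0, 0.0]))
                    + np.outer(np.array([1.0, 0.0, 0.0]), np.array([0.0, 0.0, -1.0])))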
soVeryTired 2021-08-18 12:41:31 +0000 UTC [ - ]
I think I'm missing part of the argument. So we have a polynomial p in F[x,y]. The claim in my previous post basically says that there's always a way to associate p with a linear map that takes two elements of the dual space of F[x] and produces a real number. I don't see why that's impossible here.
joppy 2021-08-18 12:48:36 +0000 UTC [ - ]
So every polynomial could be represented as a linear function on polynomials, but not every linear function on polynomials is itself a polynomial.
joppy 2021-08-18 13:39:23 +0000 UTC [ - ]
In order to have V* = V for an infinite-dimensional vector space V, you need to redefine V^* to some kind of restricted dual, rather than defining it as the set of all linear functions. In the polynomial example, if we take the space of all linear maps g: F[x] -> F such that g(x^n) = g(x^(n+1)) = ... = 0 for some n >> 0, then this restricted dual is isomorphic to F[x] again. But the evaluation map g(f) = f(1) is not in this restricted dual.
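A small Python sketch of the restricted dual (representing a functional by its list of values on the monomial basis is just for illustration):

    # a linear functional g on F[x] is determined by its values g(1), g(x), g(x^2), ...
    # the restricted dual keeps only the g with finitely many nonzero values,
    # and such a g can be identified with the polynomial sum_n g(x^n) x^n
    def apply_functional(g_values, poly_coeffs):
        return sum(g * c for g, c in zip(g_values, poly_coeffs))

    g = [0.0, 1.0]                                  # picks out the coefficient of x
    print(apply_functional(g, [0.0, -1.0, 1.0]))    # applied to x^2 - x: -1.0

    # the evaluation map f -> f(1) would need g(x^n) = 1 for *every* n,
    # so it is a perfectly good linear functional but not in the restricted dual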
There are more reasons why confusing a vector space with its dual is a bad idea. For example, you cannot cook up a map V -> V* without extra knowledge, such as a choice of basis of V. There are many examples in abstract algebra where there is a perfectly good vector space V but absolutely no good choice of basis for V; we may still be able to speak perfectly well of vectors in V or V*, but trying to identify V with V* is unnatural. A good example is V = (functions R -> R). I can speak easily of elements of V (for example, x + sin(x)), and of elements of V* (for example, f -> integral of xf(x)), but trying to figure out which element in the dual either of these corresponds to is hopeless. We're better off just accepting at some point that there is a real difference between a vector space and its dual.
soVeryTired 2021-08-18 13:13:14 +0000 UTC [ - ]
So we're talking about the case when tensors don’t need to be multilinear functions on a product of vector spaces.
Each element of V* is (trivially) a tensor and a linear function on V. Each element of V is (trivially) a tensor and a linear function on V*. However, not all linear functions on V* are in V. So V** is bigger than V. No problem so far.
But all elements of V** are functions on (trivial) products of vector spaces, and by definition all functions in V** are linear. So how have we misunderstood each other here?
JadeNB 2021-08-18 11:51:29 +0000 UTC [ - ]
As with almost all non-technical statements one can make about math (including my amendment—recursive nerd-sniping away), this is neither quite true nor quite false.
A tensor is an element of a tensor product—that's it. (But what tensor product? There are a lot of notions. I'll use the bare tensor product of non-topologised vector spaces.)
One kind of thing you can tensor is copies of a single vector space V, and/or its dual. This is the sort of tensor product you likely have in mind. Tensoring copies of the dual allows you to feed in elements of the original vector space; tensoring copies of the vector space allows you to feed in elements of the dual vector space. (There is mathematically no intrinsic difference between a vector and a co-vector: having a favourite vector space in mind, you can talk about the elements of that vector space or of its dual, and—which is where the vector vs. co-vector terminology in physics comes from—about how coefficient vectors change when you change the basis of V.) Importantly, though, if your original vector space is infinite dimensional then the natural embedding into its double dual is not an isomorphism.
Which is to say, yes, the objects you describe all arise as mathematical tensors; but mathematical tensors can also describe much more.
soVeryTired 2021-08-18 12:48:57 +0000 UTC [ - ]
OK, so this feels like the crux of the matter. How does the fact that V** is not isomorphic to V make the tensor product construction a more general concept than the linear-function construction?
JadeNB 2021-08-18 17:01:08 +0000 UTC [ - ]
This doesn't really have anything to do with tensor products per se; it can already be seen with tensor products involving only a single factor, which are just vector spaces. Thinking of tensors only as functions means that, for example, one can never think of the original vector space V itself, only of its image in the double dual V^{**} (a fancy way of saying that you can evaluate a vector v \in V on an element v^* of the dual vector space V^* by evaluating v^* at v: in confusing but suggestive notation, v(v^*) = v^*(v)).
It's certainly true that you can do this, and, given the axiom of choice, you don't lose any information; you know everything about a vector v \in V by knowing its value on elements of V^* (which is to say, by knowing the values of elements of V^* on V). However, if V is infinite dimensional, then you are forcing yourself to carry around extra, possibly unwanted information: if you are taking bare algebraic duals, not topological duals, then V^{**} is inconceivably larger than V, which is to say that there are way more linear functionals on V^* than just those coming from evaluation at a fixed element of V.
You can fix some of this inconceivable largeness by knowing a little more structure carried by V, and by forcing your dual to reflect that structure—usually you know the topology, and ask that the dual consist of continuous functionals; and once there's topology, you start asking things of the tensor product, too. (For example, you probably don't want to take the vector-space tensor product of Hilbert spaces, but rather its completion in some suitable sense.) But, even with a more refined notion of duality, it's still only the nice spaces V that are identified with their double duals via the canonical map V \to V^{**}; the terminology is 'reflexive'.
(I think I caught all the asterisks that the Markdown parser ate the first time through.)
ksd482 2021-08-18 04:17:49 +0000 UTC [ - ]
There is ONE main thing I find lacking in all of these sources: computational examples/exercises.
My idea of a "gentle introduction to tensors" would be: motivation and definition, immediately followed by computational problems (lots of them). Only then would I be comfortable with abstract definitions and proofs (which is ultimately my goal).
Edit: https://www.youtube.com/watch?v=5oeWX3NUhMA&list=PLFeEvEPtX_... is the most amazing lecture on Tensors I have watched so far, by far!
miles7 2021-08-18 03:40:03 +0000 UTC [ - ]
He takes the more geometrical perspective of a tensor as a multilinear function of vectors, from which all other statements about tensors (e.g. how the components transform) follow straightforwardly. There is lots of other great material in this book and, best of all, there are loads of examples.
lisper 2021-08-18 02:34:02 +0000 UTC [ - ]
A tensor of rank N is a vector in a space whose basis is a set of N-tuples of ordinary vectors. That's it. So a rank-1 tensor is just a regular vector. A rank-2 tensor is a vector in a space whose basis is ordered pairs of regular vectors, a rank-3 tensor is a vector in a space whose basis is ordered triples of regular vectors, and so on.
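A NumPy sketch of the rank-2 case over R^2 (purely illustrative): the basis elements are ordered pairs of basis vectors, realised here as outer products, and a rank-2 tensor is any linear combination of them, i.e. a 2x2 array of coefficients.

    import numpy as np

    e = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]      # basis of R^2
    pairs = [np.outer(e[i], e[j]) for i in range(2) for j in range(2)]

    coeffs = [1.0, 2.0, 3.0, 4.0]                         # one coefficient per pair
    t = sum(c * b for c, b in zip(coeffs, pairs))
    print(t)   # [[1. 2.] [3. 4.]] -- the familiar matrix of components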
dreamcompiler 2021-08-18 04:09:25 +0000 UTC [ - ]
The dimensionality of a vector space is the number of scalars needed to build a vector in that space. That's easy.
I also think of "rank" as synonymous with "dimensionality" but it's not. Or at least it implies a different connotation of the word "dimensionality." It's the number of dimensions in the notation rather than in the space. Or something. And now I'm off the rails. And this is before I start trying to think about a basis being "ordered pairs of vectors" or tensor fields or about how tensors transform.
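In NumPy terms, a sketch of the distinction (just an illustration): for a rank-3 tensor over R^2, the rank is the number of indices, the underlying space is 2-dimensional, and the space of such tensors is 2^3 = 8-dimensional.

    import numpy as np

    t = np.zeros((2, 2, 2))   # components of a rank-3 tensor over R^2
    print(t.ndim)             # 3 -- the rank: how many indices you need
    print(t.shape)            # (2, 2, 2) -- each index runs over dim(V) = 2
    print(t.size)             # 8 -- the dimension of the space of rank-3 tensors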
soVeryTired 2021-08-18 11:18:30 +0000 UTC [ - ]
To move from the mathematician's definition to the ML definition, pick a basis for your row and column vectors. Now if you want the (i, j, ...) element of the multidimensional array, gather the vectors with a one in the i-th position and zero elsewhere, a one in the j-th position and zero elsewhere, etc. Then feed them to the tensor in that order.
It's easiest to see how it works with row vectors (rank (0,1)), column vectors (rank (1, 0)) and matrices (rank (1, 1)) and work from there.
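A NumPy sketch of the (1, 1) case (F and M are just illustrative names): start from the multilinear function and recover the array by feeding in basis (co)vectors.

    import numpy as np

    M = np.array([[1.0, 2.0],
                  [3.0, 4.0]])

    def F(row, col):          # the mathematician's tensor
        return row @ M @ col

    e = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]

    # the ML-style array: entry (i, j) is F fed with the i-th basis covector
    # and the j-th basis vector
    components = np.array([[F(e[i], e[j]) for j in range(2)] for i in range(2)])
    assert np.array_equal(components, M)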
joppy 2021-08-18 12:43:30 +0000 UTC [ - ]
If t is zero, then its rank is zero. If the tensor product is V ⊗ (dual V), i.e. of type (n, m) = (1, 1), then a tensor t can be considered as a matrix, and its tensor rank is the same thing as its matrix rank. You're basically looking for the smallest way of writing the matrix as a sum of outer products of row and column vectors.
If you’re into quantum physics, then tensors of rank 0 or 1 are non-entangled, and tensors of rank 2 or more are entangled states.
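A NumPy sketch of the (1, 1) case (the states are just the usual illustrative ones): a single outer product has matrix rank 1, while a sum of two independent outer products, e.g. |00> + |11>, has rank 2.

    import numpy as np

    u, v = np.array([1.0, 0.0]), np.array([0.0, 1.0])

    product_state = np.outer(u, u)                 # |00>: a single outer product
    entangled = np.outer(u, u) + np.outer(v, v)    # |00> + |11> (unnormalised)

    print(np.linalg.matrix_rank(product_state))    # 1 -- tensor rank 1, not entangled
    print(np.linalg.matrix_rank(entangled))        # 2 -- tensor rank 2, entangled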
725686 2021-08-17 23:53:49 +0000 UTC [ - ]
https://www.youtube.com/watch?v=f5liqUk0ZTw&t=3s
He also has some great "Student's Guide" books about tensors and other subjects.
gerdesj 2021-08-17 22:48:49 +0000 UTC [ - ]
The author wavers between I and we. Am I your friend, guiding you through the maze, or are we lecturing you on something? The author needs to settle on one identity or spell out when they switch from one to the other.
Sometimes you might wish to appear as a friend hovering over the shoulder, providing hints as to the right direction to follow; at other times you might deploy something that will give LaTeX a headache, pull out the Vox Dei stop on the 250-ton organ, and destroy nearby eardrums.
I love the paper and it is saved locally.
kowlo 2021-08-17 23:53:19 +0000 UTC [ - ]
Perhaps I'm finding camaraderie where there is none. How disappointing.
harry8 2021-08-17 23:55:09 +0000 UTC [ - ]
Is that bad luck on my part? (I've given up on thinking I'm particularly or especially dense - while making no claims to be some species of super-genius) Or is that the generally accepted genre?
Koshkin 2021-08-18 00:10:52 +0000 UTC [ - ]
Cornelius Lanczos
harry8 2021-08-18 04:01:09 +0000 UTC [ - ]
I haven't seen or read any that have the words "A Gentle Introduction..." in their title. It's those words I'm talking about, as an indicator of being anything but what they claim in the general case. I'm very open to counterexamples.
harry8 2021-08-18 04:07:26 +0000 UTC [ - ]
I mention this only because I don't think the comment should be flagged or dead. Ymmv.
SantalBlush 2021-08-18 00:16:54 +0000 UTC [ - ]
edit: For anyone who wants examples of good introductions, check out 3Blue1Brown's YouTube channel.
anon_tor_12345 2021-08-18 00:31:20 +0000 UTC [ - ]
I'm sorry that you're not mathematically literate, but that is no reason to cast aspersions on expositors who
1) have zero obligation to cater to your needs, and
2) receive zero compensation for producing such expository pieces (since they don't count towards publication records).
>the generally accepted genre?
Indeed, "gentle introduction" does not mean spoon-feeding; it means exposition that includes niceties like motivation, examples, and diagrams. It is gentle relative to a research monograph (at which I'm sure your indignation would be astronomical).
Here's my recommendation to you on how to learn mathematics if you're serious about it but don't have the patience to struggle through "gentle introductions" like absolutely every single other practicing mathematician did: head down to your local university math department and ask about grad students willing to tutor. The going rate is ~$100/hr for the kind of spoon-feeding you seem to be looking for. Quite steep, I know, but it's highly skilled labor after all (I'm sure you make about that as a software dev, for whom this is above your skill set). But you'll be interested to know that it's a sliding scale, inversely proportional to just how much spoon-feeding is necessary (I personally go as low as $25 for very highly motivated students).
harry8 2021-08-18 08:01:28 +0000 UTC [ - ]
You may know better for many reasons. I note our concerns and goals are not necessarily aligned. YMMV.
Koshkin 2021-08-17 22:25:28 +0000 UTC [ - ]
But if you want to learn about tensors from how their coordinates transform, here’s a treat:
https://www.youtube.com/watch?v=CliW7kSxxWU
abdullahkhalids 2021-08-18 07:54:48 +0000 UTC [ - ]
The main problem with most introductions to the topic is that they deal with coordinate systems where the basis vectors are orthonormal, so the covariant and contravariant components are the same. You need to deal with non-orthogonal basis vectors, because then you realize that there are naturally two ways of defining basis vectors at a given point in space, and these two different sets of basis vectors have different transformation properties under geometric transformations. This book [https://danfleisch.com/sgvt/] does a decent job of getting you started on this approach; of course, it has no rigor.
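A NumPy sketch with a non-orthogonal basis of R^2 (the particular basis is just for illustration): the contravariant components are the expansion coefficients in the basis, the covariant components are the dot products with the basis vectors, and the metric (Gram matrix) converts between them. With an orthonormal basis the two coincide.

    import numpy as np

    e1, e2 = np.array([1.0, 0.0]), np.array([1.0, 1.0])  # non-orthogonal basis
    E = np.column_stack([e1, e2])
    g = E.T @ E                                           # metric (Gram matrix)

    v = np.array([2.0, 3.0])                              # a vector in Cartesian coordinates

    contravariant = np.linalg.solve(E, v)                 # v = v^1 e1 + v^2 e2
    covariant = np.array([v @ e1, v @ e2])                # v_i = v . e_i

    print(contravariant)      # [-1.  3.]
    print(covariant)          # [2. 5.]
    print(g @ contravariant)  # lowering the index with g recovers the covariant components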
thraxil 2021-08-18 08:31:46 +0000 UTC [ - ]
(Fleisch's "Student's Guide to Waves" is also highly recommended as a book I wish I'd had when I was a student).
JadeNB 2021-08-18 11:54:00 +0000 UTC [ - ]
Or, preferably, not with basis vectors at all, but that notion seems to do violence to a physicist's way of thinking about linear algebra—I've never quite understood why the physical approach to the subject is so bound up in co-ordinates, when it seems like physicists would be one of the groups most likely to benefit from fully grokking a co-ordinate-free approach.
JadeNB 2021-08-18 12:44:50 +0000 UTC [ - ]
Of course, at some point, you need numbers! But there's no reason that those numbers need to infest the whole computation; you can de-coordinatise as soon as possible and re-coordinatise as late as possible, and in between you not only can but must think in a coordinate-free fashion.
(I am a mathematician, not a physicist, and am not presuming to tell physicists how to do their job—this isn't an argument about whether they should do things this way. I'm just pointing out that they could, and that it seems not only possible but advantageous. But of course long-established knowledge of what actually works in physics beats this tyro's guess at what could.)
Koshkin 2021-08-18 02:13:17 +0000 UTC [ - ]
https://www.youtube.com/watch?v=4l-qzZOZt50
(You might get more mileage if you also watch the preceding lecture(s) of this excellent series.)