## Monday, September 29, 2008

### Re-examining the Inner Product (Euclidean Space)

I'm auditing Kip Thorne's "Applications of Classical Physics" this year. Today was the first day of class, and Kip gave a preamble which, if I may be allowed to paraphrase in a wild, inaccurate, and completely unfair manner (which is basically to say every word of this existed only in my head), went something approximately like this:
"Welcome to class. Here is the website. Please call me "Kip". I'm 68 years old and my signature lucky ponytail is a thing of the past. There's not enough time left for ridiculous formalities. I am about to retire. I'm going to start writing books and making movies and various other things that famous retired physicists can do. I'm too old to waterski, so this is the next best option. This will be the last class I ever teach. There will be no grades. I would prefer you actually learn some stuff. Now, let's suppose the laws of physics are frame-independent, and see what restrictions this places on the force law dealing with a classical field in Minkowski space..."

Awesome.

One thing that interested me was our class's definition of the inner product between two vectors. Since we wanted to do everything in a frame-independent manner, we couldn't simply define the inner product by

$\mathbf{A} \cdot \mathbf{B} = A_x B_x + A_y B_y + A_z B_z$

because that presumes the existence of basis vectors. The other common definition is

$\mathbf{A} \cdot \mathbf{B} = A B \cos{\theta}$

which is good, but we wanted something that could generalize to the Minkowski space of special relativity. So the definition we used came in two parts. First, we defined the inner product of a vector with itself. Then we used that definition to bridge to the inner product of arbitrary vectors.

For a vector $\mathbf{A}$, define

$\mathbf{A} \cdot \mathbf{A} = - \Delta s^2$

where $\Delta s^2$ is the square of the physical invariant interval between two events. To measure this interval for timelike-separated events, have an unaccelerated clock move from the origin to the event and read off the elapsed time. ($\Delta s$ should be real, meaning the square of the interval is positive, and $-\Delta s^2$ is negative. So the inner product of the vector between timelike events with itself is negative.) In the case of spacelike events, you should instead find the inertial reference frame in which the events are simultaneous, and lay down a measuring stick between them to get the interval.

Now we know how to take the inner product of a vector with itself. Define the inner product of two vectors $\mathbf{A}$ and $\mathbf{B}$ by

$\mathbf{A} \cdot \mathbf{B} = \frac{1}{4}\left( (\mathbf{A} + \mathbf{B})^2 - (\mathbf{A} - \mathbf{B})^2 \right)$

This is more subtle than it might initially appear. You can't just go willy-nilly with the algebra and start simplifying that right hand side out. You don't know any properties of this inner product, so you cannot, for example, write

$(\mathbf{A} + \mathbf{B}) \cdot (\mathbf{A} + \mathbf{B}) = A^2 + \mathbf{A} \cdot \mathbf{B} + \mathbf{B} \cdot \mathbf{A} + B^2$

Instead you have to work with the actual sums. It was claimed in class that this definition of the inner product is bilinear in the arguments, which wasn't obvious to me. So I asked about it, and Kip suggested I try to work it out for myself in Euclidean space using the Pythagorean theorem. So here goes.

Break down the vector $\mathbf{B}$ into a component parallel to $\mathbf{A}$ and a component perpendicular to $\mathbf{A}$

$\mathbf{B} = B_\parallel \mathbf{\hat{\parallel}} + B_\perp \mathbf{\hat{\perp}}$

where $\mathbf{\hat{\parallel}}$ and $\mathbf{\hat{\perp}}$ are unit vectors in the directions parallel and perpendicular to $\mathbf{A}$. The vector $\mathbf{A} + \mathbf{B}$ now becomes $(A + B_\parallel) \mathbf{\hat{\parallel}} + B_\perp \mathbf{\hat{\perp}}$.

Because the parallel and perpendicular pieces make a right angle, the Pythagorean theorem applies.

$\left| \mathbf{A} + \mathbf{B} \right|^2 = (A + B_\parallel)^2 + B_\perp^2$

similarly,

$\left| \mathbf{A} - \mathbf{B} \right|^2 = (A - B_\parallel)^2 + B_\perp^2$

This is great, because the right hand sides of those equations are just numbers. Expand that out, subtract the two equations, and divide by four to obtain

$\frac{1}{4} \left( \left| \mathbf{A} + \mathbf{B} \right|^2 - \left| \mathbf{A} - \mathbf{B} \right|^2 \right) = A B_\parallel$
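
Spelling out that step, using nothing beyond the two Pythagorean expressions above: the $A^2$, $B_\parallel^2$, and $B_\perp^2$ terms cancel in the difference, and only the cross terms survive.

$\left[ (A + B_\parallel)^2 + B_\perp^2 \right] - \left[ (A - B_\parallel)^2 + B_\perp^2 \right] = 2 A B_\parallel - (-2 A B_\parallel) = 4 A B_\parallel$

Dividing by four leaves $A B_\parallel$, and since $B_\parallel = B \cos{\theta}$, that is $A B \cos{\theta}$.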

That's just the normal definition of the dot product, and is obviously linear in both vectors. Now, how about Minkowski space?
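
For the skeptical, here is a quick numerical sanity check of the Euclidean argument. It is only a sketch, not a proof: the helper names (`norm_sq`, `polar_dot`) and the sample vectors are made up for illustration, and lengths are computed from components purely as a stand-in for a ruler.

```python
import math

def norm_sq(v):
    """Squared Euclidean length (stand-in for measuring with a ruler)."""
    return sum(x * x for x in v)

def add(u, v):
    return [a + b for a, b in zip(u, v)]

def sub(u, v):
    return [a - b for a, b in zip(u, v)]

def scale(k, v):
    return [k * a for a in v]

def polar_dot(u, v):
    """Inner product built only from lengths: (|u+v|^2 - |u-v|^2) / 4."""
    return 0.25 * (norm_sq(add(u, v)) - norm_sq(sub(u, v)))

# Arbitrary sample vectors (hypothetical values, chosen for easy arithmetic).
A = [1.0, 2.0, 3.0]
B = [-2.0, 0.5, 4.0]
C = [0.0, -1.0, 2.0]

# The length-based definition should agree with the component formula.
component_dot = sum(a * b for a, b in zip(A, B))
print(polar_dot(A, B), component_dot)

# Linearity in the second argument: A.(2B + 3C) versus 2(A.B) + 3(A.C).
lhs = polar_dot(A, add(scale(2.0, B), scale(3.0, C)))
rhs = 2.0 * polar_dot(A, B) + 3.0 * polar_dot(A, C)
print(lhs, rhs)
```

Both printed pairs should agree up to floating-point rounding. Passing on a few sample vectors doesn't establish bilinearity, of course, but it's reassuring to see the length-only definition behave.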

### Why Am I Posting This Crap?

Working with physical models inevitably leads to mathematical problems. In physics education, many professors build up enough tools and techniques to shove these problems aside, and then quickly move on to the real physics. That's simply a reflection of the fact that they care deeply about the physics itself, and I appreciate their efforts to keep students from getting bogged down in details. But taken to the extreme, this attitude produces calculating-maniacs with tons of knowledge and little mathematical intuition. Take a look at the canonical book on math methods for physics (Arfken), and you'll see that it's simply an encyclopedic dictionary of methods.

I'm only now becoming mature enough to go back and re-examine many of the things I learned earlier with an eye towards appreciation rather than computation. This practice intrigues me. The purpose of a blog is ostensibly to create content for public consumption. That's not what I'm doing. I'm working through things for myself. Not because I feel a dire need to understand. Because I can get enraptured by the absurd power of a little solid thought to unearth something I never saw before, and which later becomes laughably obvious.

According to Google Analytics, in the last month, 58 total people have visited the site in 172 separate visits, and viewed 268 pages. They've spent a total of 5 hours, 41 minutes reading the posts. That's only a small fraction of the cumulative time it took me to write the posts. Also, a lot of the statistics I quoted are just me, logging in from various computers, trying to crank up the scores so I'll feel more important.

Nikita, Julia, and Kangway have actually been participating and giving me feedback, which is pretty much incredible, so thank you for doing that. But looking at the recent posts I've written, I realized a few things:
• not that many people care about things such as the geometric interpretation of a vector product rule
• people who do care were better students the first time around than I was, and mostly already know
• people who would care, but don't already know this stuff, most likely don't have the assumed background to learn much from my posts.
• even if the content of the posts is "just right" for you, I skip a bunch of steps and gloss over the parts of the arguments that are less interesting to me, because they're too tedious to draw/$\LaTeX$ for the blog. Hence, following the posts from scratch would be a rough road. It's not the sort of reading people are looking for when browsing through blog posts. They want quick, to-the-point, easily-comprehensible and entertaining content. This blog is the antithesis.
My goal is simply to explore nuances that interest me, however mundane. That explains a few of the points above. When I start writing a post, I have to think about how much background knowledge to assume the reader has. I want to write about the parts that I think are interesting, which means essentially the parts (and only the parts) that I didn't completely understand before I started working on the post. I generally assume that everything I am totally comfortable with, the reader is as well. That means people will frequently be flabbergasted by the strange logical jumps I take and the way I write certain details off as trivial. On the other hand, everything that I had to work out and think about for a while, I feel the need to explain in more detail than many readers would be interested in wading through.

Today I finished a post about vector products and determinants. I was aware of the geometric interpretations from a few sources, but I had never put the pieces together for myself. By the end, I was amazed. Not only had I built up the idea of a determinant, motivated geometrically, but I had also shown how to use the determinant to solve a system of equations. I didn't even know that would pop out of the argument. If you had asked me to demonstrate how a determinant works before I started writing, I probably wouldn't even have known. Additionally, the path that led me there was remarkably short, and since I had worked through it myself, it walks through a logical (to my mind) problem-solving process.

Math books tend to define and develop the determinant as a number characterizing a linear operator on an n-dimensional vector space. The treatment is more abstract. The proofs showing how to calculate it and how it solves systems of equations frequently rely on manipulating sums of indices and leave me feeling unsatisfied. I believe them, and see that they're correct, but I have difficulty visualizing them as well as the author of the book. I don't knock their approach. I think it's probably the best way. But this supplementary cogitation on the blog has bolstered my appreciation.

Physics books, on the other hand, will normally skip over the entire journey and roll the answer out right away, then tell you to start inverting some matrices. Save the math questions for your linear algebra class!

Writing this blog is my way of taking control of my own educational process. I think about whatever I want, at my own pace, in my own way. Why then post it on the internet? Anything I write has probably been written somewhere else, and probably with more insight and clarity than I'm providing. If my goals are personal education, why paste it all over this blog, advertising it for people to see?

If I were to keep these developments to myself, I wouldn't think them through fully. I'd work until I saw the gist of the argument, then say to myself, "the rest is just mopping up details from here. better move on to something more important."

If I don't take the time to mop up the details, the bits of understanding I did build flit away. So by assuming the responsibility to clean things up to the point where someone could potentially read through the post and work out the results for themself, I force myself to clean up the ideas in my own head. That's what I'm after.

### What does a determinant have to do with a cross product? pt. 2

In the previous post, I showed that the quantity

$\mathbf{A} \cdot \mathbf{B} \times \mathbf{C}$

and the determinant

$\left| \begin{array}{ccc} A_x & A_y & A_z \\ B_x & B_y & B_z \\ C_x & C_y & C_z \end{array} \right|$

are both tests that the three vectors $\mathbf{A}$, $\mathbf{B}$, and $\mathbf{C}$ are independent. Now we'll go about showing that they're actually the same test - that they're the same number.

You may be familiar with the geometric interpretation of $\mathbf{A} \cdot \mathbf{B} \times \mathbf{C}$ as the volume of a parallelepiped (box) with sides $\mathbf{A}$, $\mathbf{B}$, and $\mathbf{C}$.

Just a quick explanation - $\mathbf{B} \times \mathbf{C}$ is a vector perpendicular to both, whose length is the area of the parallelogram they form. See stolen image below:

If I take that and dot it with $\mathbf{A}$, I get the projection of $\mathbf{A}$ onto the perpendicular direction, times the area of the parallelogram. That is, I get the height of the box times the area of its base. That's the volume. Wikipedia visualization:

There's one subtlety, which is that the volume of the box records its "handedness" in the sign of $\mathbf{A} \cdot \mathbf{B} \times \mathbf{C}$. By handedness, I mean that if we looked at a mirror image of the box, its volume would be multiplied by negative one. If you do that funny thing with your right hand, and find that $\mathbf{B} \times \mathbf{C}$ points the same sort of direction that $\mathbf{A}$ does, you have a right-handed box. Otherwise it's left-handed.

This interpretation of $\mathbf{A} \cdot \mathbf{B} \times \mathbf{C}$ as the volume turns out to be the same as the determinant. To see why, we'll make a list of properties that the volume of the box has, and then show that those properties imply the normal calculation of the determinant.
• If the sides of the box are orthonormal (all perpendicular to each other, and all unit length), its volume is one (or minus one for a left-handed box).
• The volume of the box is linear in each vector. By this I mean that if we multiplied any side of the box by some constant, we'd multiply the volume of the entire box by that constant. Also, if we make two boxes that share two vectors but differ in the third, the sum of the volumes is the same as the volume of the box whose third side is the sum of the vectors. Notationally:
$( \alpha \mathbf{A} + \beta \mathbf{A'}) \cdot (\mathbf{B} \times \mathbf{C}) = \alpha \mathbf{A} \cdot \mathbf{B} \times \mathbf{C} + \beta \mathbf{A'} \cdot \mathbf{B} \times \mathbf{C}$
and similarly for $\mathbf{B}$ and $\mathbf{C}$
• If you switch the roles of any two vectors, you'll multiply the volume of the box by minus one. This is obvious if you flip $\mathbf{B}$ and $\mathbf{C}$, because then the angle between them gets measured the opposite way, so its sine is the opposite of what it used to be. The other possibilities you can visualize if you just play around with it a little. It works.
With these properties, we can uniquely determine the value of the volume of the box from its components, and show that it's the determinant. It's a little long for three dimensions, so you can work through that on your own if you want, after seeing it in two dimensions.
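
If you'd rather poke at the three properties than take my word for them, here is a small sketch that checks each one on sample vectors. The function names (`cross`, `dot`, `box_volume`) and the numbers are invented for this illustration, and I use integers so the comparisons are exact.

```python
def cross(u, v):
    """Cross product of two 3-component vectors."""
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def box_volume(a, b, c):
    """Signed volume a . (b x c) of the box with sides a, b, c."""
    return dot(a, cross(b, c))

# Property 1: an orthonormal right-handed box has volume one,
# and its mirror image (two sides swapped) has volume minus one.
x, y, z = [1, 0, 0], [0, 1, 0], [0, 0, 1]
print(box_volume(x, y, z), box_volume(x, z, y))

# Property 2: linearity in the first side (sample values are arbitrary).
A, A2 = [1, 2, 3], [0, -1, 4]
B, C = [2, 0, 1], [-1, 3, 2]
alpha, beta = 2, -3
combo = [alpha * p + beta * q for p, q in zip(A, A2)]
lhs = box_volume(combo, B, C)
rhs = alpha * box_volume(A, B, C) + beta * box_volume(A2, B, C)
print(lhs, rhs)

# Property 3: swapping any two sides flips the sign.
v = box_volume(A, B, C)
print(v, box_volume(B, A, C), box_volume(A, C, B), box_volume(C, B, A))
```

A check on one set of numbers is not an argument, but it is a cheap way to catch a sign error before building the determinant on top of these properties.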

Imagine we have two 2-D vectors in component form, and we put them in a matrix. Then we ask for the determinant. The first property says the determinant of the identity is one.

$\left| \begin{array}{cc} 1 & 0 \\ 0 & 1 \end{array} \right| = 1$

Using linearity in the vectors, we find

$\left| \begin{array}{cc} A_x & 0 \\ 0 & B_y \end{array} \right| = A_x B_y \left| \begin{array}{cc} 1 & 0 \\ 0 & 1 \end{array} \right| = A_x B_y$

Now we switch the vectors in the identity and multiply the determinant by -1.

$\left| \begin{array}{cc} 0 & 1 \\ 1 & 0 \end{array} \right| = -1$

and

$\left| \begin{array}{cc} 0 & A_y \\ B_x & 0 \end{array} \right| = -A_y B_x$

Finally use the linearity within a vector, which states
$\left| \begin{array}{cc} A_x & A_y \\ B_x & B_y \end{array} \right| =
\left| \begin{array}{cc} A_x & 0 \\ B_x & B_y \end{array} \right| + \left| \begin{array}{cc} 0 & A_y \\ B_x & B_y \end{array} \right| =
\left| \begin{array}{cc} A_x & 0 \\ B_x & 0 \end{array} \right| + \left| \begin{array}{cc} A_x & 0 \\ 0 & B_y \end{array} \right| + \left| \begin{array}{cc} 0 & A_y \\ B_x & 0 \end{array} \right| + \left| \begin{array}{cc} 0 & A_y \\ 0 & B_y \end{array} \right|
= A_x B_y - A_y B_x$

where we could drop the determinants of matrices with an entire column zero because they are dependent vectors.

That is the determinant for a 2-by-2 matrix. If you want to do it for 3-by-3, you can use the exact same process of breaking it down into 27 matrices with three nonzero entries each, and then see by permutations of the identity which ones are zero, and where the plus and minus signs go. Out will fall the normal expression for a 3-by-3 determinant.
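
Here is a sketch of that bookkeeping in code, in case you don't feel like writing out 27 matrices by hand. It assumes the rows of the matrix are the vectors, picks one entry from each row, drops every choice that reuses a column, and signs the survivors by counting swaps. The names (`permutation_sign`, `det_by_permutations`) and the sample vectors are mine, not anything standard.

```python
from itertools import permutations, product

def permutation_sign(p):
    """+1 if p is an even rearrangement of the identity, -1 if odd."""
    inversions = sum(1 for i in range(len(p))
                     for j in range(i + 1, len(p)) if p[i] > p[j])
    return -1 if inversions % 2 else 1

def det_by_permutations(rows):
    """Sum over the surviving one-entry-per-row, one-per-column terms."""
    n = len(rows)
    total = 0
    for p in permutations(range(n)):
        term = permutation_sign(p)
        for r in range(n):
            term *= rows[r][p[r]]
        total += term
    return total

def cross(u, v):
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]

def triple(a, b, c):
    """The box volume a . (b x c)."""
    return sum(p * q for p, q in zip(a, cross(b, c)))

# One column choice per row gives 3**3 = 27 pieces; only those
# with three different columns are not automatically zero.
choices = list(product(range(3), repeat=3))
survivors = [c for c in choices if len(set(c)) == 3]
print(len(choices), len(survivors))

# Arbitrary sample vectors (hypothetical values).
A, B, C = [1, 2, 3], [2, 0, 1], [-1, 3, 2]
print(det_by_permutations([A, B, C]), triple(A, B, C))
```

Of the 27 pieces, 6 survive, and those are exactly the six terms of the usual 3-by-3 formula. The same function applied to a 2-by-2 matrix reproduces $A_x B_y - A_y B_x$.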

So that's where that comes from.

#### Why it Matters

In the previous post, we were trying to solve the problem of taking three vectors and finding a linear combination of them that yielded a fourth vector. That is, find $a$, $b$, $c$ such that

$a \mathbf{A} + b \mathbf{B} + c \mathbf{C} = \mathbf{D}$

where $\mathbf{A}$, $\mathbf{B}$, $\mathbf{C}$, and $\mathbf{D}$ are all arbitrary vectors.

We worked far enough just to test if the problem had a solution, but we found this result:

$a = \frac{D_\perp B C \sin{\theta}}{\mathbf{A} \cdot \mathbf{B} \times \mathbf{C}}$

From there, we need to find $D_\perp$ by dotting $\mathbf{D}$ with a unit vector perpendicular to both $\mathbf{B}$ and $\mathbf{C}$.

$D_\perp = \mathbf{D} \cdot \frac{\mathbf{B} \times \mathbf{C}}{B C \sin{\theta}}$

Plug this into the equation for $a$, and the stuff about $B C \sin{ \theta }$ cancels, leaving

$a = \frac{\mathbf{D} \cdot \mathbf{B} \times \mathbf{C}}{\mathbf{A} \cdot \mathbf{B} \times \mathbf{C}}$

Those are determinants. That's why they matter. They let you solve systems of linear equations, such as the system of linear equations that results from the problem we just solved.
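
As a rough sketch of how that plays out in practice: the formula for $a$ comes from the argument above, while the matching formulas for $b$ and $c$ (put $\mathbf{D}$ in the slot of $\mathbf{B}$ or $\mathbf{C}$ instead) are my assumption that the same reasoning goes through with the roles of the vectors swapped. That is the usual Cramer's rule. The function names are invented, and the code assumes integer components so the fractions stay exact.

```python
from fractions import Fraction

def cross(u, v):
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]

def triple(a, b, c):
    """The box volume a . (b x c), i.e. the 3-by-3 determinant."""
    return sum(p * q for p, q in zip(a, cross(b, c)))

def solve_combination(A, B, C, D):
    """Find (a, b, c) with a*A + b*B + c*C = D, for integer vectors."""
    volume = triple(A, B, C)
    if volume == 0:
        raise ValueError("A, B, C are dependent; no unique solution")
    a = Fraction(triple(D, B, C), volume)  # the formula derived above
    b = Fraction(triple(A, D, C), volume)  # assumed: D replaces B
    c = Fraction(triple(A, B, D), volume)  # assumed: D replaces C
    return a, b, c

# Sample problem (hypothetical values): D is built as 2A - B + 3C,
# so the solver should hand back exactly those coefficients.
A, B, C = [1, 2, 3], [2, 0, 1], [-1, 3, 2]
D = [2 * p - q + 3 * r for p, q, r in zip(A, B, C)]
a, b, c = solve_combination(A, B, C, D)
print(a, b, c)

# Rebuild D from the answer as a check.
rebuilt = [a * p + b * q + c * r for p, q, r in zip(A, B, C)]
print(rebuilt == D)
```

The zero-volume check is the independence test from the previous post: if the box is flat, the denominators vanish and there is no unique answer.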