Vandermonde matrix

In this post, we will compute a classical determinant called the Vandermonde determinant. Though the computation looks intractable at first, there turns out to be a beautiful formula for it, with a very neat proof. Perhaps surprisingly, both the matrices involved and this result are closely related to Lagrange interpolating polynomials.

Main result

Take some coefficients x_1,  \dots, x_n (in \mathbb{R} or \mathbb{C}, or more generally in any field). The associated Vandermonde matrix is the n \times n matrix given by

\displaystyle V_n = \begin{pmatrix} 1 & x_1 & \cdots & x_1^{n-1} \\ 1 & x_2 & \cdots & x_2^{n-1} \\ \vdots & \vdots & \ddots & \vdots \\ 1 & x_n & \cdots & x_n^{n-1} \end{pmatrix}

As we will see below, it naturally appears when we talk about polynomial interpolation, but let us just take it as a pretty object for now. There are plenty of things that we’d like to know about a matrix; one of them is its determinant. It is interesting to try and figure this out by hand first.

Exercise I.1. Use your favorite techniques to compute the determinant of V_n when n = 2 and n = 3. Do it for n = 4 if you are extremely motivated.
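
For readers who want to check their computations numerically, here is a quick sketch in Python; np.vander with increasing=True builds exactly the matrix above, and the sample coefficients are made up for illustration.

```python
import numpy as np

def vandermonde(xs):
    """Build V_n: row i is (1, x_i, x_i^2, ..., x_i^{n-1})."""
    return np.vander(xs, increasing=True)

# Made-up coefficients for n = 2, 3, 4.
for xs in ([2.0, 5.0], [1.0, 2.0, 4.0], [1.0, 2.0, 3.0, 5.0]):
    V = vandermonde(np.array(xs))
    print(f"n = {len(xs)}: det V_n = {np.linalg.det(V):.6f}")
```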

Did I say interesting? I meant painful. One would expect that there might be a pattern for the determinant of such a matrix, but it seems quite hard to see when computing things by hand. Yet, there is indeed a very nice formula. The trick is that it is only nice when written in factorized form, whereas the usual techniques for computing determinants produce it expanded. It is worth noting in passing that there is no technique that gives a factorized determinant in general, since such a technique would in particular allow us to compute eigenvalues (and thus solve polynomial equations) explicitly. Regardless, here is the formula.

Theorem I.1. The determinant of V_n is

\displaystyle \det V_n = \prod_{1 \leq i < j \leq n} (x_j - x_i).

A few sanity checks are de rigueur here.

  • If two of the coefficients x_i, x_j are equal, then the matrix has two equal rows, so it is not invertible, and the determinant is 0. On the other hand, the product has a factor x_j - x_i = 0, so it is 0.
  • If n = 1, the matrix is just the constant 1, so \det V_n = 1. On the other hand, the product is empty, so it is equal to 1. It would be fair to argue that this is just a convention though.
Check that the formula of Theorem I.1. agrees with what you found in Exercise I.1. for n = 2 and n = 3.
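
Beyond small cases, the formula can also be tested numerically. Here is a sketch comparing the product above with NumPy's determinant on random coefficients (distinct with probability one):

```python
import numpy as np

rng = np.random.default_rng(0)
xs = rng.standard_normal(6)              # random coefficients x_1, ..., x_6
V = np.vander(xs, increasing=True)       # the Vandermonde matrix V_6

# The claimed formula: the product of (x_j - x_i) over all pairs i < j.
formula = 1.0
for j in range(len(xs)):
    for i in range(j):
        formula *= xs[j] - xs[i]

print(np.isclose(np.linalg.det(V), formula))  # True, up to rounding
```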

Proof

There are many proofs of the formula; one of them uses classical row reduction, but it is painful. The shortest and most pleasing one, in my opinion, uses polynomials. Consider indeed the matrix V_n, but call the last coefficient X instead of x_n:

\displaystyle V_n = \begin{pmatrix} 1 & x_1 & \cdots & x_1^{n-1} \\ 1 & x_2 & \cdots & x_2^{n-1} \\ \vdots & \vdots & \ddots & \vdots \\ 1 & X & \cdots & X^{n-1} \end{pmatrix}

It changes absolutely nothing, but it changes everything! Indeed, we now see that the determinant is a function of X, and even more, that it is a polynomial in X: to see this, it suffices to expand along the last row. So let us call this polynomial P(X). Note the following points.

  • P(X) is a polynomial of degree at most n-1.
  • If X = x_1, \dots, x_{n-1}, then the matrix has two equal rows, so its determinant is 0, which means P(x_1) = \cdots = P(x_{n-1}) = 0.
  • By expanding along the last row, we see that the coefficient of X^{n-1} is the minor

\displaystyle \begin{vmatrix} 1 & x_1 & \cdots & x_1^{n-2} \\ 1 & x_2 & \cdots & x_2^{n-2} \\ \vdots & \vdots & \ddots & \vdots \\ 1 & x_{n-1} & \cdots & x_{n-1}^{n-2} \end{vmatrix}

that is, it is the Vandermonde determinant of order n-1, namely \det V_{n-1}.

Deduce the value of the polynomial P and finish the proof.

But this tells us everything about the polynomial! It has degree at most n-1 and roots x_1, \dots, x_{n-1}, so it can be written

\displaystyle P(X) = c(X-x_1) \cdots (X-x_{n-1})

for some constant c. But expanding shows that c is also the coefficient of X^{n-1}, so c = \det V_{n-1}. We conclude that

\displaystyle \det V_n = P(x_n) = \det V_{n-1} (x_n-x_1) \cdots (x_n-x_{n-1})

and the formula follows by induction. Isn’t this neat? The amount of computation that we had to do is essentially zero, unlike when trying “by hand”.
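
To see the argument play out concretely, here is a symbolic sketch of the n = 3 case using SymPy: replace x_3 by the indeterminate X and factor the determinant, which recovers P(X) = \det V_2 \cdot (X - x_1)(X - x_2).

```python
import sympy as sp

x1, x2, X = sp.symbols('x1 x2 X')

# V_3 with the last coefficient renamed X, as in the proof.
V = sp.Matrix([[1, x1, x1**2],
               [1, x2, x2**2],
               [1, X,  X**2]])

# Factoring det V gives (x2 - x1)*(X - x1)*(X - x2), i.e. det V_2 times
# the expected root factors (up to SymPy's ordering of the factors).
print(sp.factor(V.det()))
```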

Recover the value of the constant c by considering the constant coefficient of the polynomial P instead of the coefficient of X^{n-1}.

Application

On top of being a beautiful result, the formula of Theorem I.1. shows that the Vandermonde determinant is not zero when the coefficients x_1, \dots, x_n are distinct. In particular, the matrix is then invertible. This means that the linear mapping that it corresponds to (in some given bases) is a bijection. Yet, if we think of the matrix as describing a map from \mathbb{R}^n to \mathbb{R}^n in its canonical basis, then there is nothing too exciting happening.

Let us instead think of polynomials. We all know that through two distinct points there passes one and only one straight line (which is not vertical if the x-coordinates are different). More generally, through three points passes one and only one parabola (or a straight line), once again as long as the x-coordinates of these points are different. Even more generally, we would hope that if we take n points (x_1, y_1), \dots, (x_n, y_n) with all x_i distinct, then there is a unique polynomial P of degree n-1 (or less) that goes through all these points, that is

\displaystyle P(x_i) = y_i, \quad i = 1, \dots, n.

If we look for P in the form P(X) = a_0 + a_1 X + \cdots + a_{n-1} X^{n-1}, then we want

\displaystyle a_0 + a_1 x_i + \cdots + a_{n-1} x_i^{n-1} = y_i, \quad i = 1, \dots, n.

But this is a system of linear equations which, in matrix form, is exactly

\displaystyle \begin{pmatrix} 1 & x_1 & \cdots & x_1^{n-1} \\ 1 & x_2 & \cdots & x_2^{n-1} \\ \vdots & \vdots & \ddots & \vdots \\ 1 & x_n & \cdots & x_n^{n-1} \end{pmatrix} \begin{pmatrix} a_0 \\ a_1 \\ \vdots \\ a_{n-1} \end{pmatrix} = \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{pmatrix}

that is

\displaystyle V_n \begin{pmatrix} a_0 \\ a_1 \\ \vdots \\ a_{n-1} \end{pmatrix} = \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{pmatrix}

But since we know that V_n is invertible, this equation has a unique solution. This translates to the following result.

Assume that x_1, \dots, x_n are all distinct. Then for any y_1, \dots, y_n, there exists a unique polynomial P of degree \leq n-1 such that P(x_i) = y_i for all i = 1, \dots, n. This polynomial is called the Lagrange interpolation polynomial at these points.
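
In code, the theorem says the system V_n a = y can always be solved when the x_i are distinct. Here is a direct translation in Python, with made-up sample points:

```python
import numpy as np

# Made-up interpolation data; the x_i must be distinct.
xs = np.array([0.0, 1.0, 2.0, 4.0])
ys = np.array([1.0, 3.0, 2.0, 5.0])

V = np.vander(xs, increasing=True)   # the Vandermonde matrix V_n
coeffs = np.linalg.solve(V, ys)      # the coefficients a_0, ..., a_{n-1}

# Check that P(x_i) = y_i for every i.
P = np.polynomial.Polynomial(coeffs)
print(np.allclose(P(xs), ys))        # True
```

Be warned that Vandermonde matrices become badly conditioned as n grows, which is one reason numerical libraries offer dedicated interpolation routines rather than solving this system directly.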

Interpolation polynomials are used in many different areas of math and its applications, for example for approximating shapes, in prediction, in signal processing, and so on. The Lagrange polynomials are the most basic ones, but they suffer from some issues such as Runge’s phenomenon.

Of course, the previous result is purely theoretical and does not tell us what such a polynomial actually looks like. As it turns out, it is not too hard to write the interpolating polynomial explicitly. We summarize this in the following exercise.

Consider n points (x_1, y_1), \dots, (x_n, y_n) with all the x_i distinct.

  1. Argue that there exists at most one polynomial P of degree \leq n-1 such that P(x_i) = y_i for all i = 1, \dots, n.
  2. Find this polynomial if all the y_i are zero.
  3. Find this polynomial if y_i = 1 and y_j = 0 for j \neq i.
  4. Finally, find this polynomial for general y_i; one possible answer is sketched in code below.
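
If you get stuck on the last two parts, here is one possible answer in code: the polynomial L_i(X) = \prod_{j \neq i} (X - x_j)/(x_i - x_j) equals 1 at x_i and 0 at every other node, so P = \sum_i y_i L_i solves the general problem. The sample data is again made up.

```python
def lagrange_eval(xs, ys, t):
    """Evaluate the Lagrange interpolation polynomial at t.

    L_i(t) = prod over j != i of (t - x_j) / (x_i - x_j) is 1 at x_i and
    0 at the other nodes, so P = sum of y_i * L_i matches all the data.
    """
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        Li = 1.0
        for j, xj in enumerate(xs):
            if j != i:
                Li *= (t - xj) / (xi - xj)
        total += yi * Li
    return total

xs = [0.0, 1.0, 2.0, 4.0]
ys = [1.0, 3.0, 2.0, 5.0]
print([lagrange_eval(xs, ys, x) for x in xs])  # recovers ys
```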