How Changing Your Coordinate System Simplifies Complex Matrix Operations

1 Introduction
Linear transformations and change of basis are deceptively simple concepts that tempt us to rush past them. But this shortcut is a sure path to frustration: the gaps in our understanding will backfire and cost us valuable time later. There is no free lunch; we have to take the time to build a clear understanding of how the two differ and how they relate to each other. This understanding is absolutely fundamental for more advanced topics like diagonalization, SVD, and PCA. Let’s clarify this critical duality once and for all.
In school or university we may have learned that a matrix is a rectangular array of objects of the same type, usually numbers: a spreadsheet without the headers. While this is true, it’s not very helpful in actually understanding matrices. Depending on the context, a matrix A can represent very different things. It can be a dataset, a system of equations, or, most importantly for this article, a geometric operator. Specifically, we are going to look at the two ways to interpret the matrix-vector product Ax:
1. As a Linear Transformation (The Active View): The matrix acts as a function that does something to the vector x. For example, it can rotate the vector, stretch it, and so on. But it can also map a vector to a different vector space; we will learn what that means soon. A linear transformation does not need to be invertible.
2. As a Change of Basis (The Passive View): The vector x itself is not changed; instead, the matrix translates the vector’s coordinates into a new reference frame (grid), giving us the same vector expressed differently. The vector always stays in the same vector space, and the matrix must be invertible.
An analogy (slightly imperfect) would be this: Imagine you have a couch in your living room with its center at position (x, y), measured from the lower-left corner of the room. Now, let’s say you stretch the couch out. The couch itself has changed: it was stretched. That’s a linear transformation.
Next, you want to measure the couch’s center again, but this time you use the lower-right corner of the room as your reference point instead of the lower-left. The couch has not moved; only its position is expressed differently. That corresponds to a change of basis.
Just a short note on notation before we start:
• Bold capitalized letters denote matrices, such as A.
• Bold lower case letters denote vectors, such as x.
2 Vector Space and Basis
Before we can understand either linear transformation or change of basis, we must clarify the terms vector space and basis.
2.1 What is a Vector Space?
A vector space (V) is a set of objects (vectors) that is closed under addition and scalar multiplication. This means that you can take any number of elements from the space, add them together, or multiply them by any scalar (a single number), and the result will always remain inside that same space. For example, let’s assume our vector space V is ℝ² (the 2-D plane containing all vectors with real x and y components): we can add two 2-D vectors, and the result is still a 2-D vector. The same holds true for the requirement of scalar multiplication: if we multiply a 2-D vector by any number (stretching or shrinking it), it remains a 2-D vector.
When we combine these two operations, we get a linear combination. A linear combination of elements (in this case, vectors) is simply a sum of vectors where each vector is scaled by some number. This leads us to the "closure" definition:
A vector space is a set where any linear combination of its vectors remains within the set.
(Vector spaces do not necessarily contain column vectors, although this is what we implicitly assume in this article. A vector space is very generic; it can consist of matrices, functions, polynomials, etc. For detailed information, consult a standard textbook, like Gilbert Strang’s [2], p. 123.)
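To make closure concrete, here is a minimal NumPy sketch; the vectors and scalars below are arbitrary illustrative values:

```python
import numpy as np

# Two arbitrary vectors in R^2 and two arbitrary scalars.
u = np.array([1.0, 2.0])
v = np.array([-3.0, 0.5])
a, b = 2.0, -1.5

# Any linear combination of 2-D vectors is again a 2-D vector:
w = a * u + b * v
print(w)        # [6.5  3.25]
print(w.shape)  # (2,) -- still an element of R^2
```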
2.2 What is a Basis?
Previously, we defined a vector space by the rule that specifies it (closure). Now, let’s take a generative perspective. Instead of listing every single vector (which would be impossible), we can describe a vector space by its "building blocks": the minimal set of ingredients required to construct the entire space. For V = ℝ², we essentially only need two specific vectors, often denoted as e₁ = [1, 0]ᵀ and e₂ = [0, 1]ᵀ. Notice that starting from these two vectors, you can create any other 2-D vector by combining them through addition and scalar multiplication. For example, the vector v = [3, 2]ᵀ is simply: v = 3 · e₁ + 2 · e₂
With this we can define a basis as:
A basis for a vector space is a set of linearly independent vectors that can span the entire space. ([2] p. 164.)
Two or more vectors are linearly independent when no vector can be expressed as a linear combination of the other(s). Essentially, we are looking for the minimal number of vectors that generate (= span) the space. Any two bases for the same vector space have the same number of vectors. This unique number is called the dimension of the space. For example, we saw above that a basis for ℝ² is {[1, 0]ᵀ, [0, 1]ᵀ}; since this set contains two vectors, the dimension of the space is 2.
The simplest basis for any vector space ℝⁿ is the standard basis (or canonical basis), often abbreviated by a capital I. It consists of n vectors (e₁, e₂, ..., eₙ), each a unit vector (length 1) that is orthogonal (perpendicular) to the others. The i-th vector, eᵢ, has a value of 1 at the i-th position and 0 everywhere else. Thus, [1, 0]ᵀ, [0, 1]ᵀ, which we used previously, is the standard basis for ℝ².
It is very important to note that a basis for a vector space is not unique!
There are usually infinitely many bases for a given vector space. For example, the two vectors b₁ = [1, 1]ᵀ and b₂ = [−1, 1]ᵀ constitute an equally valid basis for ℝ² and any multiple of these two vectors would also be a valid basis.
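Since a basis must consist of linearly independent vectors, a quick sanity check is to stack the candidate vectors as columns of a matrix and test for full rank; a small sketch for the basis b₁ = [1, 1]ᵀ, b₂ = [−1, 1]ᵀ above:

```python
import numpy as np

# Candidate basis vectors b1 = [1, 1]^T and b2 = [-1, 1]^T as columns.
B = np.array([[1, -1],
              [1,  1]])

# Full rank (rank equals the dimension of the space) means the columns
# are linearly independent and therefore span R^2: a valid basis.
print(np.linalg.matrix_rank(B))                # 2
print(np.linalg.matrix_rank(B) == B.shape[1])  # True
```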
3 The Change of Basis
We have seen that a vector space has various bases. It is crucial to understand that a vector x existing in the space is defined by a unique linear combination of the basis vectors (for a proof see [2], p. 168). For example, if x ∈ ℝ², we can express x in terms of the standard basis by the unique linear combination x₁e₁ + x₂e₂, where x₁ and x₂ denote the first and second component of x (note that they are not in bold). These coefficients are called the coordinates of the vector. Thus, the basis vectors define the coordinate system. We have already given an example of this earlier, where we stated that the vector v = [3, 2]ᵀ is simply:
v = 3 · e₁ + 2 · e₂
Thus, (3,2) are the coordinates of the vector in the standard basis. This idea is illustrated in Figure 1.
![Figure 1: The coordinate system is defined by the green basis vectors e₁ and e₂. The vector x = [3, 2]ᵀ can be represented as a linear combination of the basis vectors: three times e₁ (dotted red line) plus two times e₂ (dashed red line) results in the blue vector.](https://cdn-images-1.medium.com/max/720/1*eLccEETmIbsbrXn4tqLicw.png)
Realize that with this we can write the vector x as a matrix-vector product, where the columns of the matrix are the basis vectors and the vector holds the coordinates for that particular basis:

$$\mathbf{x} = \begin{bmatrix} \mathbf{e}_1 & \mathbf{e}_2 \end{bmatrix} \begin{bmatrix} 3 \\ 2 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 3 \\ 2 \end{bmatrix} = \begin{bmatrix} 3 \\ 2 \end{bmatrix}$$
It is very important to note that the same vector would have different coordinates if expressed in a different basis.
This is the entire point of the change of basis transformation — calculating a vector’s new coordinates with respect to a new basis (=coordinate system).
From this we can already deduce the following:
• A change of basis always takes place within a vector space.
• It must be an invertible process. If we can express a vector in terms of a basis B₁, then we must be able to express it in terms of another basis B₂ for the same space, and we must be able to go back and forth between these two representations.
Now that we have covered all the prerequisites, doing the math will actually be very easy.
Let’s assume we have a vector x expressed in the standard basis (I). We want to find its new coordinates, k, with respect to a new basis B = {b₁, b₂, …}. (Don’t confuse B₁ and b₁. The capital letter denotes a basis, written as a matrix, whereas the lowercase letter denotes a column vector of a basis.) We know that the vector x must be the same regardless of the basis used to describe it. (Think of the couch from the introductory example: the couch’s position stays the same.) This information is crucial. We also know that the representation of x in the basis I is as follows:
$$\mathbf{x} = x_1\mathbf{e}_1 + x_2\mathbf{e}_2 + \dots + x_n\mathbf{e}_n$$

Since x is the same vector no matter which basis describes it, this must equal its representation in the new basis:

$$x_1\mathbf{e}_1 + x_2\mathbf{e}_2 + \dots + x_n\mathbf{e}_n = k_1\mathbf{b}_1 + k_2\mathbf{b}_2 + \dots + k_n\mathbf{b}_n$$
This can be written more compactly as:
$$\mathbf{I}\mathbf{x} = \mathbf{B}\mathbf{k}$$
(If the last step is not clear to you, you may want to brush up your knowledge of matrix multiplication.) Now, all we need to do to compute the new coordinates k is to multiply both sides of the equation from the left by the inverse of B:
$$\mathbf{k} = \mathbf{B}^{-1}\mathbf{I}\mathbf{x}$$
(Whenever we deal with the inverse of a matrix, we must ask ourselves whether the inverse even exists, just as when dividing by a number we must make sure that the number is not zero. Not every matrix has an inverse. In this case, however, an inverse is guaranteed to exist, because the basis of a vector space is defined to consist of linearly independent vectors. Such a matrix is said to have full rank and is always invertible.)
Thus, to compute the new coordinates, we just need to multiply the vector x by its own basis and then, from the left, by the inverse of the new basis.
In the example we assumed that the input basis was I, the standard basis. This does not have to be the case. It can be any basis, and we should express this by giving the input basis another name; let’s call it V.
The general formula to do the Change of Basis is therefore:
$$\mathbf{k} = \mathbf{B}^{-1}\mathbf{V}\mathbf{x}$$
However, in the special case that the input basis equals the standard basis, we have Ix = x. In this case, computing the new coordinates reduces to:
$$\mathbf{k} = \mathbf{B}^{-1}\mathbf{x}$$
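As a quick illustration, here is a minimal NumPy sketch of this computation, reusing the basis b₁ = [1, 1]ᵀ, b₂ = [−1, 1]ᵀ from Section 2 and the vector x = [3, 2]ᵀ. Solving the linear system Bk = x is numerically preferable to forming B⁻¹ explicitly; the result is the same k = B⁻¹x:

```python
import numpy as np

# New basis B, written as a matrix whose columns are the basis vectors.
B = np.array([[1.0, -1.0],
              [1.0,  1.0]])

# Vector x, given in the standard basis.
x = np.array([3.0, 2.0])

# k = B^{-1} x, computed by solving the linear system B k = x.
k = np.linalg.solve(B, x)
print(k)      # [ 2.5 -0.5]

# Sanity check: rebuilding the vector from its new coordinates
# recovers x, since both are descriptions of the same vector.
print(B @ k)  # [3. 2.]
```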
But how does this differ from a linear transformation? Note that B⁻¹ is just a name that we gave that matrix. We could as well have called it A, in which case the expression would look exactly like the linear transformation Ax described in the next section. The important difference is that we are not dealing with an arbitrary matrix, but with the inverse of a basis. Therefore, when we see the expression Ax, and we know that A is an inverted basis and that x was given in terms of the standard basis, we are able to understand the meaning of this operation: it gives us the coordinates of x in the new basis. Let’s summarize this in a picture.
Figure 2: A vector space with various bases, e.g. B₁ and B₂, denoted by different colors. Shown is one vector x₁ in this vector space that is expressed in terms of each of the two bases, indicated by the corresponding colors: x₁ in blue indicates that the vector’s coordinates are given in terms of B₁, and x₁ in red is the same vector given in the coordinates of B₂. The blue boxes show the operation with which we can convert between the two representations.
We will now see how the change of basis differs from a linear transformation.
4 The Active View: Ax as a Linear Transformation
A linear transformation is a function T that maps vectors from an input space V to an output space W . Every linear transformation can be represented by a matrix A with respect to chosen bases ([2] p. 405).
We can see how a matrix maps between spaces by considering the matrix-vector product. When we multiply a vector x ∈ ℝⁿ by a matrix A ∈ ℝᵐˣⁿ, the result is another vector, y ∈ ℝᵐ. Note that the input dimension n may differ from the output dimension m, meaning x and y may exist in different vector spaces (i.e., ℝⁿ ≠ ℝᵐ). (Recall that the definition of a vector space is closure under addition and scalar multiplication. Thus, there is no linear combination of vectors in a vector space of dimension n that could produce a vector of another dimension. E.g., if n = 2, there is no linear combination of the standard basis vectors [1, 0]ᵀ and [0, 1]ᵀ that would yield a vector of dimension 3, like [1, 0, 0]ᵀ.)
The linear transformation can map to an output space that is different from the input space, but it doesn’t have to; the input and output spaces can also be equal, in which case n = m and the transformation stays within a single space. A minimal sketch of a dimension-changing map follows below.
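For instance, the 3×2 matrix below (arbitrary entries, chosen purely for illustration) sends every vector of ℝ² into ℝ³:

```python
import numpy as np

# A is in R^{3x2}: input dimension n = 2, output dimension m = 3.
A = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])

x = np.array([2.0, 1.0])  # x lives in R^2
y = A @ x                 # y lives in R^3
print(y, y.shape)         # [2. 1. 3.] (3,)
```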
For a transformation T to be linear, it must satisfy the following condition for all vectors u, v ∈ V and all scalars c:
$$T(c\,\mathbf{u} + \mathbf{v}) = c\,T(\mathbf{u}) + T(\mathbf{v})$$
This condition implies the core geometric properties of linear transformations:
• Lines remain lines after the transformation.
• The origin remains unchanged.
• Equally spaced points stay equally spaced.
It is easy to see, by applying the rules of matrix-vector multiplication, that for all vectors u, v ∈ ℝⁿ and all scalars c the following holds:
$$\mathbf{A}(c\,\mathbf{u} + \mathbf{v}) = c\,\mathbf{A}\mathbf{u} + \mathbf{A}\mathbf{v}$$
This is precisely the definition of a linear transformation! Thus, Ax can be interpreted as a linear transformation, where the matrix A actively moves the input vector x to a new position within the fixed coordinate system.
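A quick numerical spot-check of this property; the matrix, vectors, and scalar below are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((2, 2))  # arbitrary matrix
u = rng.standard_normal(2)       # arbitrary vectors
v = rng.standard_normal(2)
c = 3.7                          # arbitrary scalar

# Matrix-vector multiplication is linear: A(cu + v) = cAu + Av.
print(np.allclose(A @ (c * u + v), c * (A @ u) + A @ v))  # True
```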
As examples, in Figure 3 you can see a few common linear transformations in 2-D. These were produced by applying the following matrices to the vector [2, 1]ᵀ:
• Rotation (90 degrees counter-clockwise):
$$\mathbf{R} = \begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix}$$
• Scaling:

$$\mathbf{S} = \begin{bmatrix} s_1 & 0 \\ 0 & s_2 \end{bmatrix}$$

(a diagonal matrix whose entries s₁ and s₂ are the scale factors along each axis)
• Shearing:
![Shearing matrix in 2-D applied to [2,1]ᵀ.](https://cdn-images-1.medium.com/max/720/0*-wEya-VHT5wRhw6b.png)
![Figure 3: Linear transformations applied to x = [2, 1]ᵀ. From left to right: scaling, rotation, shearing.](https://cdn-images-1.medium.com/max/1080/1*RA5YG8rdHb-t53cnAurIGg.png)
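These transformations can be reproduced in a few lines of NumPy. The rotation matrix is exact; the scale and shear factors below are illustrative choices, since the exact values behind the figure are not stated:

```python
import numpy as np

x = np.array([2.0, 1.0])

# Rotation by 90 degrees counter-clockwise.
theta = np.pi / 2
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

# Scaling (illustrative factors: 2 along x, 0.5 along y).
S = np.diag([2.0, 0.5])

# Shearing along x (illustrative shear factor: 1).
H = np.array([[1.0, 1.0],
              [0.0, 1.0]])

for name, M in [("rotation", R), ("scaling", S), ("shearing", H)]:
    print(name, M @ x)
# rotation [-1.  2.]  (up to floating-point rounding)
# scaling  [4.   0.5]
# shearing [3.  1.]
```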
Now we come to the most important part: how is the linear transformation actually accomplished? For that, we’ll consider the image with the vector x = [3, 2]ᵀ from before, but we will compare it to a rotated version of it. To do so, we multiply x by a rotation matrix that rotates it counter-clockwise by 45 degrees:
$$\mathbf{R}\mathbf{x} = \begin{bmatrix} \cos 45^\circ & -\sin 45^\circ \\ \sin 45^\circ & \cos 45^\circ \end{bmatrix} \begin{bmatrix} 3 \\ 2 \end{bmatrix} = \frac{1}{\sqrt{2}} \begin{bmatrix} 1 & -1 \\ 1 & 1 \end{bmatrix} \begin{bmatrix} 3 \\ 2 \end{bmatrix} = \frac{1}{\sqrt{2}} \begin{bmatrix} 1 \\ 5 \end{bmatrix} \approx \begin{bmatrix} 0.71 \\ 3.54 \end{bmatrix}$$
Now, note the following: recall from the previous discussion that a vector x is a linear combination of the basis vectors weighted by its coordinates: x = Bk. When using the standard basis I, the coordinates k are simply the components of x, so x = Ix. We can now use the associative property of matrix multiplication to regroup our expression:
$$\mathbf{A}\mathbf{x} = \mathbf{A}(\mathbf{I}\mathbf{x}) = (\mathbf{A}\mathbf{I})\mathbf{x}$$
This seemingly subtle rearrangement reveals a profound insight: we can first apply the transformation matrix A to the basis vectors. Let’s substitute C = AI, where the columns of C are the transformed basis vectors cᵢ . We then get:
$$\mathbf{A}\mathbf{x} = \mathbf{C}\mathbf{x} = x_1\mathbf{c}_1 + x_2\mathbf{c}_2$$
What this clearly shows is that the transformation is accomplished in two steps: first, we apply the transformation matrix A to the basis vectors (resulting in the matrix C), and second, we compute a linear combination of these new basis vectors, cᵢ, using the original coordinates xᵢ from x. The geometric interpretation is simple (see Figure 4, right): to transform a vector, it suffices to transform the basis vectors first; the transformed vector is then simply the same linear combination of the new basis vectors.
Figure 4: Left: the vector x = [3, 2]ᵀ represented as a linear combination of the basis vectors: three times e₁ (dotted red line) plus two times e₂ (dashed red line) results in the blue vector. Right: the rotated version of the left figure plotted on top of the original (shown in pale colors). The new coordinate system is defined by the green basis vectors, which are the rotated versions of the original basis vectors. Note that the vector originally at (3, 2) can now be represented as the sum of the rotated basis vectors: three times the first rotated basis vector (dotted red line) plus two times the second rotated basis vector (dashed red line) gives the rotated blue vector.
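A small sketch verifying this two-step view numerically for the 45-degree rotation: rotating the basis vectors first and then recombining them with the original coordinates gives the same result as rotating x directly.

```python
import numpy as np

# 45-degree counter-clockwise rotation matrix.
A = (1 / np.sqrt(2)) * np.array([[1.0, -1.0],
                                 [1.0,  1.0]])

x = np.array([3.0, 2.0])  # coordinates in the standard basis
I = np.eye(2)             # standard basis, written as a matrix

# Step 1: transform the basis vectors: C = A I.
C = A @ I

# Step 2: take the same linear combination of the new basis vectors.
y_two_step = x[0] * C[:, 0] + x[1] * C[:, 1]

print(np.allclose(y_two_step, A @ x))  # True
print(A @ x)                           # [0.707... 3.535...]
```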
Now, what is the connection to a Change of Basis? Note that if the transformation matrix A is an invertible matrix (like a rotation matrix), then multiplying it by a basis matrix B will result in another valid basis C = AB. In this special case (where A is invertible and V = W), the operation Ax acts geometrically like a change of coordinates from the C-basis to the standard basis. However, a linear transformation is fundamentally different from a change of basis: unlike a change of basis, the transformation matrix A is allowed to be non-invertible (like a projection), and it can change the dimension by mapping a vector from ℝⁿ to a different space ℝᵐ.
We can now complete our conceptual map by adding the role of the linear transformation in Figure 5.
Figure 5: Conceptual map including the linear transformation. There are several vector spaces, indicated by V and W. Each vector space has its own bases, denoted as Bᵢ for V and Cᵢ for W. Each space contains vectors that are expressed in terms of each of the different bases, indicated by the corresponding colors: x₁ in blue indicates that the vector’s coordinates are given in terms of B₁, and x₁ in red is the same vector given in the coordinates of B₂. In the space W the vector k₁ occurs twice, each time expressed in a different basis, and there is an additional vector k₂. The green boxes show a linear transformation. The box Bk₂ with the corresponding arrow indicates that we can compute x₁, a vector in a different vector space, by means of a linear transformation. The second green box Ak₁ indicates that A allows us to compute a new vector k₂ by transforming k₁.
5 Change of Basis vs Linear Transformation
This section summarizes the key distinctions and commonalities between a linear transformation and a change of basis operation. Understanding these differences is crucial for grasping how diagonalization works. Please refer also to Figure 5 for a visual overview.
What They Have in Common: Both operations are represented by matrix multiplication and both rely on the principle of linearity, satisfying the condition:
$$\mathbf{A}(c\,\mathbf{u} + \mathbf{v}) = c\,\mathbf{A}\mathbf{u} + \mathbf{A}\mathbf{v}$$
How They Differ: This is summarized in Table 1.
| | Linear Transformation (Ax) | Change of Basis (B⁻¹x) |
|---|---|---|
| View | Active: the vector is actually moved | Passive: the same vector, re-expressed |
| Vector space | May map between different spaces (ℝⁿ → ℝᵐ) | Always stays within the same space |
| Invertibility | The matrix may be non-invertible (e.g. a projection) | The matrix must be invertible |
| Coordinate system | Fixed; the vector changes | Changes; the vector stays fixed |

Table 1: Linear transformation vs. change of basis.
6 Diagonalization: Where Change of Basis and Linear Transformations Converge
Let’s explore the powerful relationship between linear transformations and change of basis. They work together in the fundamental process of diagonalization. The primary goal here is not a precise mathematical derivation, but to build an intuitive, conceptual understanding of this process. We will introduce a few key terms without detailed explanation, as we will focus on the big idea.
First of all, what is diagonalization? Diagonalization, when possible, allows us to rewrite a square matrix A as a product of three matrices:
$$\mathbf{A} = \mathbf{P}\mathbf{D}\mathbf{P}^{-1}$$
This formula is key to the entire process:
• D is a diagonal matrix. Diagonal matrices are highly desirable because they lead to greatly simplified computations (e.g., raising a matrix to a power).
• P is an invertible matrix whose columns form a special basis for A, known as the eigenbasis (we’ll save the details on this for a later article).
• P⁻¹ is the inverse of the matrix P.
Why would we want to diagonalize a matrix at all? Consider a very large and complex matrix whose application to vectors would be computationally expensive. We need a way to simplify this process.
The core concept of diagonalization is to view the linear transformation T(x) = Ax from a different basis (a "different angle") where the transformation appears much simpler: as a simple scaling operation.
When we apply A to a vector x using the diagonalization formula, the expression becomes:
$$\mathbf{A}\mathbf{x} = \mathbf{P}\mathbf{D}\mathbf{P}^{-1}\mathbf{x}$$
Let’s dissect this expression to reveal the process.
The formula A = PDP⁻¹ represents a transformation that breaks the complicated action of A down into three linked, simpler operations:
1. Change of Basis (Go In): First, we use a change of basis matrix (P⁻¹) to transform the input vector x from the standard basis coordinates into the coordinates of A’s special eigenbasis.
$$\mathbf{k} = \mathbf{P}^{-1}\mathbf{x}$$
2. Simple Transformation: We then apply the now-simple diagonal matrix (D) to the new coordinates. Applying a diagonal matrix only involves a simple scaling of the components, making it computationally cheap.
$$\mathbf{D}\mathbf{k} = \mathbf{D}\mathbf{P}^{-1}\mathbf{x}$$
3. Change of Basis (Go Out): Finally, we use the original basis matrix (P) to transform the result back from the eigenbasis coordinates into our standard, familiar coordinate system.

$$\mathbf{P}\left(\mathbf{D}\mathbf{P}^{-1}\mathbf{x}\right) = \mathbf{P}\mathbf{D}\mathbf{P}^{-1}\mathbf{x} = \mathbf{A}\mathbf{x}$$
Putting it all together, we first express x in the easy-to-work-with eigenbasis, apply the simple transformation D, and then return to the original basis.
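Here is a minimal NumPy sketch of the three steps, using an arbitrary diagonalizable (here symmetric) example matrix:

```python
import numpy as np

# An arbitrary diagonalizable (symmetric) example matrix.
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])
x = np.array([3.0, 2.0])

# Eigendecomposition: the columns of P are the eigenvectors of A,
# and D carries the eigenvalues on its diagonal.
eigvals, P = np.linalg.eig(A)
D = np.diag(eigvals)

k  = np.linalg.solve(P, x)  # 1. go in:  k = P^{-1} x (eigenbasis coords)
k2 = D @ k                  # 2. scale:  cheap, component-wise
y  = P @ k2                 # 3. go out: back to the standard basis

print(np.allclose(y, A @ x))  # True: same result as applying A directly
```

The same structure is what makes matrix powers cheap: Aⁿ = PDⁿP⁻¹, and Dⁿ only requires raising each diagonal entry to the n-th power.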
Congratulations 🎉! Now you have a conceptual understanding of the deeper meaning behind the diagonalization expression A = PDP⁻¹.
7 Summary and Outlook
It’s time to pat yourself on the back! You’ve successfully navigated and wrestled with some of the most important foundational concepts in linear algebra: linear transformations, change of basis, and how they harmoniously work together in the critical application of diagonalization. By building this strong conceptual understanding, you’ve done more than
just complete an article — you’ve given yourself a powerful head start. This foundational knowledge is not just a prerequisite; it’s the bedrock upon which nearly all advanced data science methods are built. Concepts like Singular Value Decomposition (SVD) and its practical application, Principal Component Analysis (PCA), rely heavily on the principles you’ve just mastered.
Let’s quickly reinforce the central dual concepts that this article explored. You’ve established a strong foundation by understanding that
• A vector space, V, is a set of vectors closed under addition and scalar multiplication (linear combinations).
• A basis is the minimal, linearly independent set of vectors that can span the entire space. It defines the coordinate system for expressing any vector x.
You have understood and contrasted the linear transformation and the change of basis. This was the main part of this article; please refer to Table 1 and Figure 5 for a concise summary. Finally, you have gained a conceptual understanding of matrix diagonalization, a powerful process that uses both operations together to simplify the linear transformation A. Its goal is to move a vector into a different, special coordinate system in which the desired transformation is greatly simplified. After applying the transformation, which is defined by a diagonal matrix, the result is transformed back to the original basis by the change of basis operation.
Take a moment to review these conceptual steps. When you’re ready, the next article will delve deeper into the mechanics of eigenvalues, eigenvectors, and the computation of diagonalization.
References
[1] Costas Papachristou. Physics Education: A Collection of Articles. Jan. 2025.
[2] Gilbert Strang. Introduction to Linear Algebra, Fifth Edition. Wellesley-Cambridge Press, 2016.


