OK, hi guys. Welcome to CS445 Computational Photography. I'm Ashwin Ramesh, and today I'll be doing a brief prerequisite review of some of the linear algebra concepts that you might need in this course. Note that this should serve as a guide for what to review before you embark on the projects, rather than a comprehensive review. With that, let's get started.

The goals of this session are as follows: we're going to talk about systems of linear equations and how to represent them as matrices; go over some basic notation, properties, setting up systems, and concepts like SVD; and we're also going to do a demo in a Jupyter Notebook of some of the NumPy library functions that let you do these things.

Here are some references that you should consult. The YouTube channel 3Blue1Brown in particular is one that I like a lot; it will really help you build intuition for linear algebra concepts, and it has really great visualizations, so I encourage you to check it out. The first reference, the Stanford review PDF, is a good reference for sure.

OK, let's move on. This is a system of linear equations. This particular system has two equations and two variables. We know it's linear because the highest power of any variable in the equations is 1. We also know that these equations are linearly independent. Why do we know this? You can pause and take a look if you'd like, but the reason is that neither equation is a scalar multiple of the other.

So let's look at how to represent this in matrix form. The system is in the form Ax = b, where A is the matrix of coefficients of the variables and b is the vector of right-hand-side constants. Something to note is that you can interchange the equations above without affecting the solution values of x1 and x2. Consequently, you can interchange the corresponding rows of A and b, and that won't affect the solution to the system. We use this property quite extensively when solving systems of linear equations by row reduction and so on. Another benefit of converting the system into matrix form is that we can exploit a lot of parallelism (for example, thread-based parallelism), and matrices are easy to represent in computers.
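As a quick sketch of what this looks like in code (the slide's own example isn't shown in the transcript, so this uses the same 2x2 system that appears in the NumPy demo later: 4x1 - 5x2 = -13 and -2x1 + 3x2 = 9):

```python
import numpy as np

# Ax = b for the system 4*x1 - 5*x2 = -13, -2*x1 + 3*x2 = 9
A = np.array([[4.0, -5.0],
              [-2.0, 3.0]])   # coefficient matrix
b = np.array([-13.0, 9.0])    # right-hand-side constants

x = np.linalg.solve(A, b)     # [3., 5.]

# Interchanging the two equations (the same rows of A and b) leaves the solution unchanged.
x_swapped = np.linalg.solve(A[[1, 0], :], b[[1, 0]])

print(x, x_swapped)           # both print [3. 5.]
```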
OK, so let's look at some basic notation. The i-th element of a vector x (and the vectors we deal with in this course are usually column vectors) is denoted by xi. A matrix A with m rows and n columns is said to be in R^(m x n), and each element is denoted by a lowercase a subscripted with the row index and then the column index, a_ij.

Let's look at some properties of vectors. First, the norm of a vector: as you know, a vector is a quantity with direction and magnitude, and a norm is some measure of the magnitude of a vector. A particular norm that's important to remember is the L2 or Euclidean norm, denoted ||x||_2. It's equal to the square root of the sum of the squares of each element in the vector.

Now some vector-vector operations. Given two column vectors x and y of the same length, the quantity x^T y is called the inner product or dot product of the vectors. It's computed by taking corresponding elements in each vector, multiplying them, and summing all the products. The dot product is a very important operation in machine learning and image processing, but more generally as well.

There's also something called the outer product, and this operation doesn't require the input vectors to be of the same length. Given an m-length vector x and an n-length vector y, the result x y^T is a matrix of dimensions m x n. Each row of this matrix is the corresponding element of x multiplied by the whole vector y, and each column is, consequently, the whole vector x multiplied by the corresponding element of y. You can observe that relationship in the entries.

OK. So now let's talk about some operations on matrices. The main one is matrix multiplication. The product of two matrices A (with m rows and n columns) and B (with n rows and p columns) is the matrix C (with m rows and p columns). The most important thing to note is that the number of columns in the left matrix must equal the number of rows in the right matrix, so matrix multiplication really depends on the order in which you multiply A and B (we'll discuss that in a second). Each element of C is calculated as C_ij = sum from k = 1 to n of A_ik B_kj. The way I like to visualize it is that each row of the left matrix A gets dotted with each column of the right matrix B to produce one element of the matrix C. That's the best way to look at it, for me.
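Here's a minimal NumPy sketch of these operations; the vector and matrix values are made up for illustration, and the notebook demo later walks through the same functions:

```python
import numpy as np

x = np.array([1, 2, 3])
y = np.array([4, 5, 6])

np.dot(x, y)          # inner product: 1*4 + 2*5 + 3*6 = 32
np.linalg.norm(x)     # L2 norm: sqrt(1 + 4 + 9)
np.outer(x, y)        # 3x3 matrix with entry (i, j) = x[i] * y[j]

A = np.arange(6).reshape(2, 3)    # 2x3
B = np.arange(12).reshape(3, 4)   # 3x4
C = A @ B                         # 2x4

# Each entry of C is a row of A dotted with a column of B:
print(C[0, 1] == np.dot(A[0, :], B[:, 1]))   # True
```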
So let's look at some properties of matrix multiplication. It is associative, which means that if you have three matrices multiplied together, you can compute either product first: (AB)C = A(BC). Matrix multiplication is also distributive: A(B + C) = AB + AC. But it is in general not commutative. As I mentioned earlier, AB need not be the same as BA; sometimes BA is not even defined, since the dimensions must work out as described earlier.

Another operation we can do on a matrix is the transpose. The transpose is just an interchange of rows and columns: the rows become columns, and the columns become rows. The transpose has the following properties. The transpose of the transpose, (A^T)^T, just gives back the original matrix A. The transpose of a product is interesting: (AB)^T = B^T A^T, so the transpose of the product AB is the product of the transposes in reverse order. That's important to note. And (A + B)^T is just A^T + B^T, so the transpose distributes over addition.

Matrices have a property known as rank, which is the number of linearly independent rows of the matrix, and equivalently the number of linearly independent columns. If the rank of a matrix equals the minimum of its number of rows and its number of columns, it is said to have full rank. So the rank of a matrix can be at most the minimum of its number of rows and columns, and the way to think about that is: what is the largest number of linearly independent rows you can have in an m by n matrix, and what is the largest number of linearly independent columns you can have in an m by n matrix? I encourage you to think about that.

Now the inverse of a matrix. The inverse is only defined for square matrices; that's important to note. It's denoted A^(-1), and it's the unique matrix such that A^(-1) A = A A^(-1) = I_n, the identity matrix of size n. The identity matrix of size n is an n by n square matrix that is all 0s except for 1s on the major diagonal.

OK. Now let's talk about eigenvalues and eigenvectors. Given a square matrix A in R^(n x n), lambda is an eigenvalue of A with corresponding eigenvector x if and only if Ax = lambda x and x is not the zero vector (so it's not a trivial solution). The determinant of A is equal to the product of its eigenvalues, so the product of all the eigenvalues lambda is the determinant of the matrix.
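A quick numerical sanity check of these properties; A is the [[1, 3], [3, 2]] matrix from the upcoming trace example, and B is just an assumed second matrix for illustration:

```python
import numpy as np

A = np.array([[1.0, 3.0],
              [3.0, 2.0]])
B = np.array([[0.0, 1.0],
              [2.0, 5.0]])

print(np.allclose((A @ B).T, B.T @ A.T))   # True: (AB)^T = B^T A^T
print(np.allclose(A @ B, B @ A))           # False: not commutative in general

A_inv = np.linalg.inv(A)                   # A is square with det(A) = -7, so invertible
print(np.allclose(A_inv @ A, np.eye(2)))   # True: A^{-1} A = I

eigvals, eigvecs = np.linalg.eig(A)
print(np.isclose(np.prod(eigvals), np.linalg.det(A)))   # True: det = product of eigenvalues
```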
If you remember, the determinant of a matrix is the signed volume of the n-dimensional parallelepiped spanned by the column vectors of A. It also happens to be the case that the sum of the eigenvalues is equal to the trace of the matrix A, which is the sum of the values on its major diagonal. So in the example shown below, A has columns [1, 3] and [3, 2]; the trace is 1 + 2 = 3, and that is the sum of the eigenvalues.

OK, now we're going to talk about diagonalizable matrices. For a matrix A in R^(n x n), if A has n linearly independent eigenvectors, then A is said to be diagonalizable. In math, that looks like being able to decompose A into the form U D U^(-1). This is a diagonalization of A. We're going to talk more about this, but basically the definition says that a matrix A is diagonalizable if and only if it has n linearly independent eigenvectors, and here we'll look at why that's the case.

On the left here we essentially have the definition of what it means to be an eigenvector: u1, u2, all the way to un are the eigenvectors of A, and lambda_1, lambda_2, all the way to lambda_n are the corresponding eigenvalues of A. Now imagine blocking these equations out and expressing them as their corresponding matrices; this is just a diagrammatic representation of what that looks like. Here you have A, and then you can stack the eigenvectors as column vectors into this n x n matrix, where each column is an eigenvector of A. On the right-hand side you have another n x n matrix where column i is lambda_i u_i, for i from 1 to n. Essentially this is showing the entire system of equations at once: it's telling us that Au = lambda u for each matching pair of lambda and u, from 1 through n.

OK. So you can take this right-hand-side matrix and decompose it into the matrix of eigenvectors (that's the U matrix again) times a diagonal matrix, one where only the elements on the major diagonal are nonzero, and where in each column the position on the major diagonal is occupied by the corresponding eigenvalue. So this is what that looks like: everything is zero except the values on the major diagonal, and those values are just lambda_1 through lambda_n. You can actually try this.
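For instance, here's a quick check of that block picture in NumPy, using the same [[1, 3], [3, 2]] matrix from the trace example (it's symmetric, so it certainly has two linearly independent eigenvectors):

```python
import numpy as np

A = np.array([[1.0, 3.0],
              [3.0, 2.0]])

eigvals, U = np.linalg.eig(A)   # columns of U are the eigenvectors u_i
D = np.diag(eigvals)            # diagonal matrix of the eigenvalues

# Column i of A @ U is A u_i; column i of U @ D is lambda_i u_i.
print(np.allclose(A @ U, U @ D))                # True

# Sum of the eigenvalues equals the trace (1 + 2 = 3 here).
print(np.isclose(eigvals.sum(), np.trace(A)))   # True
```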
If you multiply this out, you'll get another n by n matrix where the first column is lambda_1 u_1, the second column is lambda_2 u_2, and so on, all the way to the last column being lambda_n u_n; that's essentially the matrix we had here. So what you have is: A times this matrix of eigenvectors equals that same matrix of eigenvectors times this diagonal matrix. We're going to call this diagonal matrix D. And what we can do here is just right-multiply by the inverse of U to get this expression for A: A = U D U^(-1). Notice that this is only possible if U is invertible. That means that U^(-1) exists and U is a non-singular matrix, which requires that each eigenvector be linearly independent of all the others; in other words, it requires that A has n linearly independent eigenvectors. So that's how we get that A can be decomposed into a matrix of eigenvectors, times a diagonal matrix of eigenvalues, times the inverse of the matrix of eigenvectors, if and only if it has n linearly independent eigenvectors.

OK, lastly we're going to talk about singular value decomposition (SVD). SVD is a way to factorize a matrix into three matrices: given a matrix A, we can write A as the product of three matrices U, Σ, and V^T, where U and V are orthogonal matrices and Σ is a diagonal matrix. Remember that an orthogonal matrix is one whose columns are orthonormal, which means that the dot product of any two distinct columns is zero and each column has unit length. It's also true for an orthogonal matrix that its transpose is its inverse; you can work through a quick example to see why that's the case.

So let's look at how to calculate a singular value decomposition. What exactly are these U and V matrices? Let's assume that you can decompose A as U Σ V^T. (Note that Σ = Σ^T here, because Σ is a diagonal matrix.) Then A^T would be V Σ U^T, and the matrix A^T A would be V Σ U^T U Σ V^T. We know that because U is orthogonal, its transpose is also its inverse, so U^T U becomes the identity matrix. So you end up with A^T A = V Σ^T Σ V^T, and since Σ is a diagonal matrix, Σ^T Σ is just a diagonal matrix of the diagonal values squared. So we let D = Σ^T Σ, and there we have A^T A = V D V^T, which looks exactly like a diagonalization of A^T A. So what are V and V^T? The columns of V are just the eigenvectors of A^T A.
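Written out in one line, the chain of steps we just walked through is: A^T A = (U Σ V^T)^T (U Σ V^T) = V Σ^T (U^T U) Σ V^T = V (Σ^T Σ) V^T = V D V^T.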
Similarly, if you compute A A^T, that's going to give you U Σ V^T V Σ U^T, and V is also an orthogonal matrix, so with a similar argument you can show that U is actually the matrix of eigenvectors of A A^T.

So, here that is. We're doing SVD on a matrix M, which is an m x n matrix. The columns of U are the eigenvectors of M M^T. The diagonal entries of S (which was Σ in our previous notation) are the square roots of the eigenvalues of M^T M. This is because we saw that Σ^T Σ is the D matrix in the previous derivation, and the diagonal entries of D are the eigenvalues of M^T M, so the diagonal entries of S are just the square roots of those values. To compute these diagonal values, you would compute the eigenvalues of M^T M and take the square roots. These values σ1, σ2, and so on are called the singular values of M. The columns of V are the eigenvectors of M^T M; note that the decomposition actually contains V^T, so the rows of V^T are the eigenvectors of M^T M.

Good, so that was singular value decomposition.
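As a sanity check, here's a small sketch with a made-up 3x2 matrix M confirming those relationships numerically (eigenvectors are only determined up to sign and scale, so the checks compare M^T M V against V times the squared singular values rather than comparing vectors directly):

```python
import numpy as np

M = np.array([[3.0, 1.0],
              [1.0, 3.0],
              [1.0, 1.0]])           # made-up 3x2 example

U, s, Vt = np.linalg.svd(M, full_matrices=False)

# Singular values are the square roots of the eigenvalues of M^T M.
eigvals, _ = np.linalg.eigh(M.T @ M)
print(np.allclose(np.sort(s**2), np.sort(eigvals)))       # True

# Columns of V (rows of Vt) are eigenvectors of M^T M,
# and columns of U are eigenvectors of M M^T.
print(np.allclose(M.T @ M @ Vt.T, Vt.T @ np.diag(s**2)))  # True
print(np.allclose(M @ M.T @ U, U @ np.diag(s**2)))        # True
```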
Now what we're going to do is look at a quick demo of all the linear algebra concepts we've learned and how to use them in NumPy.

OK. So here we are in a Jupyter Notebook, and we're going to go over what we can use to implement these linear algebra concepts in Python. The way we generally want to define vectors is as NumPy arrays. So here we have three vectors x, y, and z; they're just NumPy arrays. We can use np.zeros to define zero vectors: give it a 4 here and it gives us a zero vector of size four.

Some operations on vectors, like the dot product or inner product we discussed, are available as functions: the dot product is numpy.dot (here called as np.dot). I would consult the documentation for np.dot, because it's a really versatile function and it takes care of a wide range of use cases: if both a and b are 1-D arrays, it's an inner product; it can also do matrix multiplication when both inputs are 2-D arrays; and there are some other use cases as well. So that's how we use np.dot for two vectors.

Next is the outer product. As we know, the outer product takes two vectors x and y, which can be of different lengths, and returns a matrix: if x is of length m and y is of length n, it returns an m x n matrix. I'd encourage you to investigate np.outer as well. Let's look at an example here. Let x equal np.ones of size five, x = np.ones(5), then y = np.arange(6), and then np.outer(x, y). You can pause the video here and think about what this might give us. As we expected, it's a 5x6 matrix, and each row is the corresponding element of x times all of y, and each column is the corresponding element of y times x.

OK. Let's look at the Hadamard product, or element-wise multiplication. Between two similarly shaped arrays (more precisely, arrays that can be broadcast together), the star operator (*) is element-wise multiplication, and you can also achieve this with the function np.multiply. Let's actually recalculate this, because our x has changed and the arrays couldn't be broadcast together: we'll redefine x here (to be of size 4) and run this again. Our x * z is [1, 2, 3, 0]; now let's see what np.multiply gives us. Yep, these two are the same. I'd also look at the documentation for numpy.multiply; it tells us exactly what it does: it multiplies arguments element-wise.

All right, moving on. NumPy has a module known as linalg that has multiple very useful abstractions for linear algebra. One specifically is the norm: let's take a look at numpy.linalg.norm. You pass in the input as the first argument, and the second argument is the type of norm you'd like to calculate. So this is the L1 norm, and this would be the L2 norm; the L2 or Euclidean norm is the one we discussed earlier.

All right, now we'll move on to matrix multiplication. Let's define two matrices here, a and b. a is of shape 2x3 and b is of shape 3x4, so our result should be of shape... well, pause the video and figure it out, but we do have the answer here: the result is of shape 2x4. Notice that we've used a.dot(b) here; otherwise, we could write this as np.dot(a, b). (Oops, I forgot to run these.) When the inputs are 2-D arrays, np.dot does matrix multiplication. You can also use the at symbol (@) for matrix multiplication, and that will give you the same thing. (Note that np.dot and @ behave differently for arrays with more than two dimensions; see https://stackoverflow.com/q/54160155.) This is a simple way to represent matrix multiplication as long as the shapes are matched, and with @ there are no assumptions on the part of np.dot, so you're guaranteed to be doing exactly what you're trying to do.
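The notebook cells themselves aren't shown in the transcript, so here is a rough reconstruction of what was just narrated; the specific values of x, z, a, and b are assumptions for illustration:

```python
import numpy as np

# Vectors as NumPy arrays (example values; the notebook's actual values aren't shown)
x = np.array([1, 2, 3, 4])
z = np.array([1, 1, 1, 0])
np.zeros(4)                           # zero vector of size four

np.dot(np.array([1, 2, 3]), np.array([4, 5, 6]))   # inner product of two 1-D arrays

np.outer(np.ones(5), np.arange(6))    # 5x6 outer product

x * z                                 # element-wise (Hadamard) product -> [1, 2, 3, 0]
np.multiply(x, z)                     # same thing

np.linalg.norm(x, 1)                  # L1 norm
np.linalg.norm(x, 2)                  # L2 (Euclidean) norm

a = np.arange(6).reshape(2, 3)        # 2x3 (example values)
b = np.arange(12).reshape(3, 4)       # 3x4
a.dot(b)                              # 2x4 matrix product
np.dot(a, b)                          # same
a @ b                                 # same
```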
The transpose is available as a property of the array: you can just call .T. Let's take b.T. Right. And we can do (a @ b).T. Yep; the transpose just interchanges the rows and columns. linalg allows you to calculate the matrix rank as well, so let's calculate the rank of a and the rank of a @ b, with np.linalg.matrix_rank(a @ b). This is an important set of things to note.

Now we're going to get to the higher-level functions in linalg, namely computing the inverse. np.linalg.inv computes the inverse, and what we're computing the inverse of here is b b^T (written as b.dot(b.T)). The SVD can be computed using np.linalg.svd. If you remember, the SVD of a matrix b decomposes it into a matrix of the eigenvectors of b b^T, times a diagonal matrix of its singular values, times the transpose of the matrix of eigenvectors of b^T b. So just make sure that you know what u, s, and vh are here; that's important to know. And I would look up the documentation for np.linalg.svd to see its various arguments.

Another thing we're going to need to do is solve linear systems using NumPy, so take this linear system, for example; we looked at it at the beginning of the linear algebra slides. We'll represent it as an array, [[4, -5], [-2, 3]], for the matrix of coefficients, and then [-13, 9] for the constant vector. Then we just use x = np.linalg.solve, and solve takes the matrix of coefficients first and then the constant vector.
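Likewise, a rough reconstruction of the inverse, SVD, and solve cells; the 3x4 matrix b is an assumed full-row-rank example, since the notebook's actual values aren't shown:

```python
import numpy as np

# Assumed full-row-rank 3x4 example, so that b b^T (3x3) is invertible
b = np.array([[1., 2., 0., 1.],
              [0., 1., 3., 2.],
              [2., 0., 1., 4.]])

np.linalg.inv(b.dot(b.T))        # inverse of the 3x3 matrix b b^T

u, s, vh = np.linalg.svd(b)      # u: 3x3, s: singular values, vh: 4x4 (rows of V^T)

# The system from the start of the slides: 4*x1 - 5*x2 = -13, -2*x1 + 3*x2 = 9
A = np.array([[4., -5.],
              [-2., 3.]])
rhs = np.array([-13., 9.])
x = np.linalg.solve(A, rhs)      # [3., 5.]
```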
And the last thing to note is broadcasting. Broadcasting is just a set of rules that you need to follow when you're doing operations on two arrays: to do an operation between two NumPy arrays, they need to be compatible, that is, broadcastable. When operating on two arrays, NumPy compares their shapes element-wise, starting from the trailing (rightmost) dimension and working toward the leading one. Two dimensions are compatible when they're equal or when one of them is 1.

Let's check out an example here. A is a 4-dimensional array of ones with shape (3, 1, 2, 4), and B is a 4-dimensional array of ones with shape (1, 2, 2, 1). So let's compare these shapes the way NumPy would, starting from the right and going to the left, and try to figure out whether or not they're broadcastable. Are 4 and 1 equal? No, but one of them is 1, so they're compatible. Are 2 and 2 equal? Yes. And in each of the remaining two dimension pairs, one of the sizes is 1. So A and B are broadcastable; let's multiply A and B and check the shape. OK, there we go. What happened was NumPy broadcast these arrays together, and the result has shape (3, 2, 2, 4): in each dimension the result takes the non-unit size, so from the right it took 4, then 2, then 2, then 3.

OK, moving on. Here we created two arrays of different shapes: the first one is of shape 2x2, the second one is of shape 2x1. What will the resulting shape be? We know from our procedure above that it'll be 2x2. The way NumPy does the broadcasting is that it duplicates this column vector (of B) along the dimension that needs to be matched for A, so you effectively end up with a 2x2 array of 3s, and that gets added to A. So that's very important to note: how exactly NumPy does the broadcasting and what it fills into the newly formed arrays when it broadcasts.

OK! Well, thank you for watching this linear algebra tutorial. If you have any questions, as always, please feel free to ask on Piazza.