# Singular Value Decomposition in Python

## Example: Compression

Near the bottom of the writeup in the AMA, they demonstrate how useful SVD is a way of compressing/uncompressing data.

This will make a good working example as any.

They assemble a `25 X 15`

matrix of zeros and ones that form an image of a `0`

. Per the example, there are really only different types of columns that go into constructing the bigger picture– and thus some obvious information redundancy that we can leverage.

`%pylab inline`

```
Populating the interactive namespace from numpy and matplotlib
```

`a = np.ones((25, 1))`

```
b = np.vstack([np.ones((5, 1)),
np.zeros((15, 1)),
np.ones((5, 1))])
```

```
c = np.vstack([np.ones((5, 1)),
np.zeros((3, 1)),
np.ones((9, 1)),
np.zeros((3, 1)),
np.ones((5, 1))])
```

```
zero = np.c_[a, a, b, b, b, c, c, c, c, c, b, b, b, a, a]
zero.shape
```

```
(25, 15)
```

Altogether, our data looks like

`plt.imshow(zero);`

### In Python

To perform SVD in Python, we can turn to the `numpy`

implementation

`import numpy.linalg as LA`

The `svd()`

function returns all three components of the righthand side of our SVD form `A = U SIGMA V.T`

.

**Note**: The `vh`

term is intentional as it is the conjugate transpose of our matrix `V`

`u, s, vh = LA.svd(zero)`

Inspecting, the shapes of `u`

and `vh`

are square and correspond to the column and row spaces, respectively.

`u.shape, s.shape, vh.shape`

```
((25, 25), (15,), (15, 15))
```

Looking at `s`

, however, it’s obvious that we’re carrying more information than we need.

It only needed 3 singular values to reassemble the information stored in our eigen basis matricies– the rest was simply zero and will be ignored.

```
np.set_printoptions(precision=3, suppress=True)
s
```

```
array([14.724, 5.217, 3.314, 0. , 0. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. , 0. , 0. , 0. ])
```

And so we can omit the rows/columns of our matricies accordingly.

```
U = u[:, :3]
S = s[:3]
V = vh[:3, :]
```

`U.shape, S.shape, V.shape`

```
((25, 3), (3,), (3, 15))
```

Finally, reassembling our original matrix using simply vectors and scalars, we have.

`plt.imshow(np.dot(U * S, V));`

It’s worth calling out that our original representation had a number of values equal to:

`zero.shape[0] * zero.shape[1]`

```
375
```

But after paring down to just the columns, singular values, and rows that matter, we have

`(U.shape[0] * U.shape[1]) + S.shape[0] + (V.shape[0] * V.shape[1])`

```
123
```

Less than a third.