# Generalized Procrustes analysis

## Resources

🤷♂️

## User guide

Generalized Procrustes analysis (GPA) is a shape analysis tool that aligns and scales a set of shapes to a common reference. Here, the term “shape” means an *ordered* sequence of points. GPA iterates three steps: 1) align each shape with a reference shape (usually the mean shape), 2) update the reference shape from the aligned shapes, and 3) repeat until the reference stops changing.
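The loop described above can be sketched in plain NumPy. This is only an illustration under a few assumptions, not the implementation used by `prince`: the function names `align_to_reference` and `gpa_sketch` are hypothetical, the first shape is used as the initial reference, each shape is centered and scaled to unit norm up front, and only proper rotations (no reflections) are allowed.

```python
import numpy as np


def align_to_reference(shape, reference):
    """Rotate a centered shape onto a reference (orthogonal Procrustes via SVD)."""
    u, _, vt = np.linalg.svd(shape.T @ reference)
    rotation = u @ vt
    # Assumption of this sketch: keep proper rotations only, so undo a reflection.
    if np.linalg.det(rotation) < 0:
        u[:, -1] *= -1
        rotation = u @ vt
    return shape @ rotation


def gpa_sketch(shapes, max_iter=100, tol=1e-10):
    """Illustrative GPA loop for an array of shape (shapes, points, dims)."""
    shapes = np.asarray(shapes, dtype=float)
    # Remove translation and scale (assumes no shape collapses to a single point).
    centered = shapes - shapes.mean(axis=1, keepdims=True)
    aligned = centered / np.linalg.norm(centered, axis=(1, 2), keepdims=True)
    reference = aligned[0]  # initial reference: the first shape
    for _ in range(max_iter):
        # 1) align each shape with the current reference
        aligned = np.stack([align_to_reference(s, reference) for s in aligned])
        # 2) update the reference to the normalized mean shape
        new_reference = aligned.mean(axis=0)
        new_reference /= np.linalg.norm(new_reference)
        # 3) stop once the reference no longer moves
        if np.linalg.norm(new_reference - reference) < tol:
            break
        reference = new_reference
    return aligned


# Three right triangles that differ only in position, rotation, and scale.
triangles = np.array([
    [[0, 0], [0, 2], [1, 0]],
    [[3, 2], [1, 2], [3, 3]],
    [[0, 0], [0, 4], [2, 0]],
], dtype=float)
aligned = gpa_sketch(triangles)
print(np.allclose(aligned, aligned[0]))  # True
```

Because the initial reference here is fixed to the first shape, this sketch is deterministic; a random initialization would change the overall rotation of the result but not the relative alignment of the shapes.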

Note that the final rotation of the aligned shapes may vary between runs, based on the initialization.

Here is an example aligning a few right triangles:

```
import pandas as pd

# Each row is one point: its coordinates, the shape it belongs to,
# and its position within that shape.
points = pd.DataFrame(
    data=[
        [0, 0, 0, 0],
        [0, 2, 0, 1],
        [1, 0, 0, 2],
        [3, 2, 1, 0],
        [1, 2, 1, 1],
        [3, 3, 1, 2],
        [0, 0, 2, 0],
        [0, 4, 2, 1],
        [2, 0, 2, 2],
    ],
    columns=['x', 'y', 'shape', 'point']
).astype({'x': float, 'y': float})
points.head(3)
```

| | x | y | shape | point |
|---|---|---|---|---|
| 0 | 0.0 | 0.0 | 0 | 0 |
| 1 | 0.0 | 2.0 | 0 | 1 |
| 2 | 1.0 | 0.0 | 0 | 2 |

```
import altair as alt

# Draw each shape as its own line, colored by shape id.
alt.Chart(points).mark_line(opacity=0.5).encode(
    x='x',
    y='y',
    detail='shape',
    color='shape:N'
)
```

The dataframe of points has to be converted to a 3D NumPy array of shape `(shapes, points, dims)`. There are many ways to do this. Here, we use xarray as a helper package.

```
ds = points.set_index(['shape', 'point']).to_xarray()
da = ds.to_stacked_array('xy', ['shape', 'point'])
shapes = da.values
shapes.shape
```

```
(3, 3, 2)
```

This can also be done with a pandas `groupby` and NumPy:

```
import numpy as np
gb = points.groupby('shape')
np.stack([gb.get_group(g)[['x', 'y']] for g in gb.groups]).shape
```

```
(3, 3, 2)
```

```
shapes
```

```
array([[[0., 0.],
        [0., 2.],
        [1., 0.]],

       [[3., 2.],
        [1., 2.],
        [3., 3.]],

       [[0., 0.],
        [0., 4.],
        [2., 0.]]])
```

The shapes can now be aligned.

```
import prince
gpa = prince.GPA()
aligned_shapes = gpa.fit_transform(shapes)
```

We then convert the 3D NumPy array back to a dataframe (using `xarray`) for plotting.

```
da.values = aligned_shapes
aligned_points = da.to_unstacked_dataset('xy').to_dataframe().reset_index()
alt.Chart(aligned_points).mark_line(opacity=0.5).encode(
    x='x',
    y='y',
    detail='shape',
    color='shape:N'
)
```

The triangles all have the same shape (they differ only in position, rotation, and scale), so they are now perfectly aligned.
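The claim that the input triangles share a shape can be checked numerically without running GPA. The sketch below compares the side lengths of each triangle, taken in point order and normalized by the perimeter; the helper `normalized_side_lengths` is a hypothetical name introduced for this illustration. Matching ratios show the triangles are similar, although side ratios alone cannot tell a triangle from its mirror image.

```python
import numpy as np

# The same three triangles as in the example, as a (shapes, points, dims) array.
triangles = np.array([
    [[0, 0], [0, 2], [1, 0]],
    [[3, 2], [1, 2], [3, 3]],
    [[0, 0], [0, 4], [2, 0]],
], dtype=float)


def normalized_side_lengths(shape):
    """Side lengths in point order, divided by the perimeter."""
    # Distance from each point to the next, wrapping around to close the polygon.
    sides = np.linalg.norm(shape - np.roll(shape, -1, axis=0), axis=1)
    return sides / sides.sum()


ratios = np.array([normalized_side_lengths(t) for t in triangles])
print(np.allclose(ratios, ratios[0]))  # True
```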