Prince foo

Generalized Procrustes analysis

Resources

🤷‍♂️

User guide

Generalized Procrustes analysis (GPA) is a shape analysis tool that aligns and scales a set of shapes to a common reference. Here, the term “shape” means an ordered sequence of points. GPA iterates three steps: 1) align each shape with a reference shape (usually the mean shape), 2) update the reference shape, and 3) repeat until convergence.

Note that the final rotation of the aligned shapes may vary between runs, based on the initialization.
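The loop described above can be sketched in plain NumPy. This is only an illustration, not Prince's implementation: the function name `gpa_sketch`, its `tol` and `max_iter` parameters, the choice of the first shape as the initial reference, and the up-front normalization of position and scale are all assumptions made for the example.

```python
import numpy as np


def gpa_sketch(shapes, tol=1e-10, max_iter=100):
    """Illustrative GPA on an array of shape (n_shapes, n_points, n_dims)."""
    X = np.asarray(shapes, dtype=float)
    # Remove translation and scale: center each shape, then give it unit norm.
    X = X - X.mean(axis=1, keepdims=True)
    X = X / np.linalg.norm(X, axis=(1, 2), keepdims=True)

    reference = X[0]  # assumption: start from the first shape
    for _ in range(max_iter):
        aligned = np.empty_like(X)
        for i, shape in enumerate(X):
            # Orthogonal Procrustes: the orthogonal matrix R minimizing
            # ||shape @ R - reference|| is U @ Vt, from the SVD of shape.T @ reference.
            # Note that this may include a reflection.
            U, _sigma, Vt = np.linalg.svd(shape.T @ reference)
            aligned[i] = shape @ (U @ Vt)

        # Update the reference to the (renormalized) mean of the aligned shapes.
        new_reference = aligned.mean(axis=0)
        new_reference /= np.linalg.norm(new_reference)
        converged = np.linalg.norm(new_reference - reference) < tol
        reference = new_reference
        if converged:
            break
    return aligned


triangles = np.array([
    [[0, 0], [0, 2], [1, 0]],
    [[3, 2], [1, 2], [3, 3]],
    [[0, 0], [0, 4], [2, 0]],
], dtype=float)
aligned = gpa_sketch(triangles)
print(np.allclose(aligned, aligned[0]))
```

Because the reference starts from whichever shape comes first, the orientation of the result depends on that choice, which is one way the run-to-run variation mentioned above can arise.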

Here is an example aligning a few right triangles:

import pandas as pd

points = pd.DataFrame(
    data=[
        [0, 0, 0, 0],
        [0, 2, 0, 1],
        [1, 0, 0, 2],
        [3, 2, 1, 0],
        [1, 2, 1, 1],
        [3, 3, 1, 2],
        [0, 0, 2, 0],
        [0, 4, 2, 1],
        [2, 0, 2, 2],
    ],
    columns=['x', 'y', 'shape', 'point']
).astype({'x': float, 'y': float})
points.head(3)

     x    y  shape  point
0  0.0  0.0      0      0
1  0.0  2.0      0      1
2  1.0  0.0      0      2

import altair as alt

alt.Chart(points).mark_line(opacity=0.5).encode(
    x='x',
    y='y',
    detail='shape',
    color='shape:N'
)

The dataframe of points has to be converted to a 3D NumPy array of shape (shapes, points, dims). There are many ways to do this. Here, we use xarray as a helper package.

ds = points.set_index(['shape', 'point']).to_xarray()
da = ds.to_stacked_array('xy', ['shape', 'point'])
shapes = da.values
shapes.shape
(3, 3, 2)

This can also be done in NumPy:

import numpy as np

gb = points.groupby('shape')
np.stack([gb.get_group(g)[['x', 'y']] for g in gb.groups]).shape
(3, 3, 2)
shapes
array([[[0., 0.],
        [0., 2.],
        [1., 0.]],

       [[3., 2.],
        [1., 2.],
        [3., 3.]],

       [[0., 0.],
        [0., 4.],
        [2., 0.]]])
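Both conversions rely on every shape having the same number of points, listed in the same order. As a hedged sketch (one possible approach, not something the library requires), the assumption can be checked explicitly before reshaping; the variable names `counts` and `ordered` are illustrative.

```python
import numpy as np
import pandas as pd

points = pd.DataFrame(
    data=[
        [0, 0, 0, 0], [0, 2, 0, 1], [1, 0, 0, 2],
        [3, 2, 1, 0], [1, 2, 1, 1], [3, 3, 1, 2],
        [0, 0, 2, 0], [0, 4, 2, 1], [2, 0, 2, 2],
    ],
    columns=["x", "y", "shape", "point"],
).astype({"x": float, "y": float})

# A rectangular array needs the same number of points in every shape.
counts = points.groupby("shape")["point"].nunique()
assert counts.nunique() == 1, "shapes have differing numbers of points"

# Sort so that rows are grouped by shape and ordered by point, then reshape.
ordered = points.sort_values(["shape", "point"])
shapes = ordered[["x", "y"]].to_numpy().reshape(len(counts), counts.iloc[0], 2)
print(shapes.shape)  # (3, 3, 2)
```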

The shapes can now be aligned.

import prince

gpa = prince.GPA()
aligned_shapes = gpa.fit_transform(shapes)

We then convert the 3D numpy array to a dataframe (using xarray) for plotting.

da.values = aligned_shapes
aligned_points = da.to_unstacked_dataset('xy').to_dataframe().reset_index()

alt.Chart(aligned_points).mark_line(opacity=0.5).encode(
    x='x',
    y='y',
    detail='shape',
    color='shape:N'
)

The triangles all have the same shape up to translation, rotation, and scale, so they are now perfectly aligned.
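That the input triangles share one shape can be checked independently of the alignment. The sketch below compares each triangle's sorted side lengths divided by its perimeter; the helper `normalized_side_lengths` is a hypothetical name introduced for this example. Matching signatures show the triangles are similar, although this check alone does not distinguish a triangle from its mirror image.

```python
import numpy as np

shapes = np.array([
    [[0, 0], [0, 2], [1, 0]],
    [[3, 2], [1, 2], [3, 3]],
    [[0, 0], [0, 4], [2, 0]],
], dtype=float)


def normalized_side_lengths(shape):
    """Sorted side lengths of a closed polygon, divided by its perimeter."""
    # Distance from each vertex to the next one, wrapping around at the end.
    sides = np.linalg.norm(shape - np.roll(shape, -1, axis=0), axis=1)
    return np.sort(sides / sides.sum())


signatures = np.array([normalized_side_lengths(s) for s in shapes])
print(np.allclose(signatures, signatures[0]))  # True
```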