Prince

Factor analysis of mixed data

Resources

🤷‍♂️

Data

Factor analysis of mixed data (FAMD) is a general-purpose method for datasets that contain both numeric and categorical variables.

import prince

dataset = prince.datasets.load_beers().head(1000)
dataset.head()

                        is_organic  style                 alcohol_by_volume  international_bitterness_units  standard_reference_method  final_gravity
name
Lightshine Radler       False       Blonde                             4.50                            20.0                        5.0          1.012
LightSwitch Lager       False       American Light Lager               3.95                             7.5                        3.0          1.005
Lightwave Belgian Pale  False       Belgian Pale                       5.00                            25.0                        9.0          1.011
Like Weisse             False       Berlinerweisse                     3.10                             4.5                        3.0          1.005
Lil Heaven Session IPA  False       Session                            4.55                            20.0                        2.0          1.007
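
To make the "numeric and categorical" claim concrete, here is a minimal pure-Python sketch of the textbook FAMD preprocessing: numeric columns are standardized, and each categorical column is one-hot encoded with every indicator divided by the square root of its category proportion and then centered. A PCA on the combined matrix gives the components. The helper names and the toy values are assumptions made for illustration; this is not Prince's internal code, which may differ in details.

```python
import math
from statistics import mean, pstdev, pvariance


def standardize(values):
    """Center a numeric column and scale it to unit (population) variance."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]


def encode_categorical(values):
    """One-hot encode a column, scale each indicator by 1/sqrt(p_k), then center it."""
    n = len(values)
    columns = {}
    for category in sorted(set(values)):
        p = values.count(category) / n  # proportion of rows in this category
        scaled = [(1.0 if v == category else 0.0) / math.sqrt(p) for v in values]
        center = mean(scaled)  # equals sqrt(p)
        columns[category] = [x - center for x in scaled]
    return columns


# Toy columns loosely modelled on the beer data (illustrative values only).
abv = [4.50, 3.95, 5.00, 3.10]
style = ["Blonde", "Lager", "Blonde", "Lager"]

abv_scaled = standardize(abv)
style_scaled = encode_categorical(style)

# Each scaled indicator has variance 1 - p_k, so a variable with K categories
# carries K - 1 units of inertia in total.
style_inertia = sum(pvariance(col) for col in style_scaled.values())
```

Under this weighting a categorical variable with many levels, such as style, contributes far more total inertia than a single standardized numeric column.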

Fitting

famd = prince.FAMD(
    n_components=2,
    n_iter=3,
    copy=True,
    check_input=True,
    random_state=42,
    engine="sklearn",
    handle_unknown="error"  # same parameter as sklearn.preprocessing.OneHotEncoder
)
famd = famd.fit(dataset)

Eigenvalues

famd.eigenvalues_summary

           eigenvalue  % of variance  % of variance (cumulative)
component
0               3.735          3.70%                       3.70%
1               1.662          1.65%                       5.34%
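
The percentages in the summary can be read as each eigenvalue divided by the total inertia. The sketch below reproduces that arithmetic. The `summarize` helper is hypothetical, and the total inertia of 101 is an assumption back-calculated from the table (3.735 / 0.0370), not a value reported by Prince.

```python
def summarize(eigenvalues, total_inertia):
    """Return (eigenvalue, share, cumulative share) for each component."""
    rows, cumulative = [], 0.0
    for value in eigenvalues:
        share = value / total_inertia
        cumulative += share
        rows.append((value, share, cumulative))
    return rows


# Eigenvalues taken from the table; total_inertia is an assumed, back-calculated figure.
summary = summarize([3.735, 1.662], total_inertia=101.0)
for component, (value, share, cumulative) in enumerate(summary):
    print(f"{component}  {value:.3f}  {share:.2%}  {cumulative:.2%}")
```

The low shares are plausible here: the style column has many categories, which inflates the total inertia that two components have to account for.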

Coordinates

famd.row_coordinates(dataset).head()

component                       0          1
name
Lightshine Radler       -1.795872  -0.316854
LightSwitch Lager       -3.351119  -0.193896
Lightwave Belgian Pale  -1.429076  -0.083288
Like Weisse             -3.774585  -0.255144
Lil Heaven Session IPA  -2.570021  -0.069867

famd.column_coordinates_

component                                  0         1
variable
alcohol_by_volume               8.474727e-01  0.024329
international_bitterness_units  6.648504e-01  0.224303
standard_reference_method       3.828369e-01  0.386307
final_gravity                   8.401140e-01  0.025106
is_organic                      6.361371e-07  0.002685
style                           9.994811e-01  0.999218

Visualization

famd.plot(
    dataset,
    x_component=0,
    y_component=1
)

Contributions

(
    famd.row_contributions_
    .sort_values(0, ascending=False)
    .head(5)
    .style.format('{:.3%}')
)

component                                0       1
name
Agamemnon                           0.536%  0.255%
Hide the Despot                     0.536%  0.202%
Game of Thrones: King In the North  0.536%  0.202%
Bakery: Banana Bread                0.536%  0.202%
Entire Wood Aged Stout              0.536%  0.202%

famd.column_contributions_.style.format('{:.0%}')

component                        0    1
variable
alcohol_by_volume              23%   1%
international_bitterness_units 18%  13%
standard_reference_method      10%  23%
final_gravity                  22%   2%
is_organic                      0%   0%
style                          27%  60%
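
A contribution is the share of a component's inertia carried by one row or one variable, so the contributions for a component sum to 1. The sketch below shows the usual PCA-style formula for rows, assuming uniform row weights of 1/n. The `row_contributions` helper and the toy coordinates are assumptions for illustration, not Prince's implementation.

```python
def row_contributions(coordinates, eigenvalue):
    """Share of one component's inertia carried by each row (uniform weights 1/n)."""
    n = len(coordinates)
    return [(c ** 2) / (n * eigenvalue) for c in coordinates]


# Toy, centered coordinates on a single component (illustrative values only).
coords = [-1.5, 0.5, 2.0, -1.0]

# With uniform weights, the eigenvalue is the mean squared coordinate.
eigenvalue = sum(c ** 2 for c in coords) / len(coords)

contribs = row_contributions(coords, eigenvalue)
for c, share in zip(coords, contribs):
    print(f"{c:+.1f}  {share:.1%}")
```

The column table above is consistent with this reading: its percentages add up to 100% for component 0 and to 99% for component 1, the gap being rounding.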