Factor analysis of mixed data
Data
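Factor analysis of mixed data (FAMD) is a general-purpose method for datasets that mix numeric and categorical columns. The example below uses the first 1,000 rows of the beers dataset bundled with prince.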
import prince
dataset = prince.datasets.load_beers().head(1000)
dataset.head()
| name | is_organic | style | alcohol_by_volume | international_bitterness_units | standard_reference_method | final_gravity |
|---|---|---|---|---|---|---|
| Lightshine Radler | False | Blonde | 4.50 | 20.0 | 5.0 | 1.012 |
| LightSwitch Lager | False | American Light Lager | 3.95 | 7.5 | 3.0 | 1.005 |
| Lightwave Belgian Pale | False | Belgian Pale | 5.00 | 25.0 | 9.0 | 1.011 |
| Like Weisse | False | Berlinerweisse | 3.10 | 4.5 | 3.0 | 1.005 |
| Lil Heaven Session IPA | False | Session | 4.55 | 20.0 | 2.0 | 1.007 |
Fitting
famd = prince.FAMD(
    n_components=2,
    n_iter=3,
    copy=True,
    check_input=True,
    random_state=42,
    engine="sklearn",
    handle_unknown="error",  # same parameter as sklearn.preprocessing.OneHotEncoder
)
famd = famd.fit(dataset)
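To make the fitting step less of a black box, here is a minimal sketch of the textbook FAMD recipe: standardize the numeric columns, one-hot encode the categorical ones and rescale each indicator by the square root of its category frequency, then run a PCA on the combined matrix. The function `famd_sketch` and the toy values are hypothetical and written for illustration only. prince's internals may differ in details such as scaling conventions, so the numbers need not match `prince.FAMD`.

```python
import numpy as np
import pandas as pd


def famd_sketch(df: pd.DataFrame, n_components: int = 2):
    """Toy FAMD (not prince's implementation): returns (row coordinates, eigenvalues).

    Assumes df has at least one numeric and one non-numeric column,
    and that no numeric column is constant.
    """
    num = df.select_dtypes(include="number")
    cat = df.select_dtypes(exclude="number")

    # Numeric columns: center and scale to unit variance (population std).
    num_z = (num - num.mean()) / num.std(ddof=0)

    # Categorical columns: one-hot encode, divide each indicator by the
    # square root of its category frequency, then center.
    dummies = pd.get_dummies(cat.astype(str)).astype(float)
    freq = dummies.mean()
    cat_z = dummies / np.sqrt(freq)
    cat_z = cat_z - cat_z.mean()

    Z = pd.concat([num_z, cat_z], axis=1).to_numpy()
    n = len(Z)

    # PCA through an SVD, with every row weighted 1/n.
    U, s, _ = np.linalg.svd(Z / np.sqrt(n), full_matrices=False)
    eigenvalues = s**2
    coords = U[:, :n_components] * s[:n_components] * np.sqrt(n)
    return pd.DataFrame(coords, index=df.index), eigenvalues[:n_components]


# Made-up toy values, not rows from the beers dataset.
toy = pd.DataFrame(
    {
        "abv": [4.5, 3.9, 5.0, 3.1, 4.6, 6.2],
        "ibu": [20.0, 7.5, 25.0, 4.5, 20.0, 60.0],
        "style": ["Blonde", "Lager", "Pale", "Lager", "Blonde", "Pale"],
    }
)
coords, eigenvalues = famd_sketch(toy, n_components=2)
print(coords.round(3))
print(eigenvalues.round(3))
```

With this scaling, each numeric column contributes one unit of inertia and a categorical column with K categories contributes K - 1, which is why a column with many distinct values (such as `style`) can dominate the total.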
Eigenvalues
famd.eigenvalues_summary
| component | eigenvalue | % of variance | % of variance (cumulative) |
|---|---|---|---|
| 0 | 3.735 | 3.70% | 3.70% |
| 1 | 1.662 | 1.65% | 5.34% |
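The percentages in this summary are each eigenvalue divided by the total inertia of the encoded data, not by the sum of the retained eigenvalues. That is why two components account for only about 5% here. A minimal sketch of that arithmetic follows; the helper `explained_variance` and the toy numbers are assumptions chosen for easy checking, not values from the beers dataset.

```python
def explained_variance(eigenvalues, total_inertia):
    """Return (share, cumulative share) per component as fractions of total inertia."""
    shares = [ev / total_inertia for ev in eigenvalues]
    cumulative = []
    running = 0.0
    for share in shares:
        running += share
        cumulative.append(running)
    return shares, cumulative


# Toy numbers: two retained eigenvalues out of a total inertia of 8.
shares, cumulative = explained_variance([3.0, 1.0], total_inertia=8.0)
print([f"{s:.2%}" for s in shares])      # ['37.50%', '12.50%']
print([f"{c:.2%}" for c in cumulative])  # ['37.50%', '50.00%']
```

Because the cumulative column is accumulated from unrounded shares, it can differ in the last digit from the sum of the rounded per-component percentages.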
Coordinates
famd.row_coordinates(dataset).head()
| name | component 0 | component 1 |
|---|---|---|
| Lightshine Radler | -1.795872 | -0.316854 |
| LightSwitch Lager | -3.351119 | -0.193896 |
| Lightwave Belgian Pale | -1.429076 | -0.083288 |
| Like Weisse | -3.774585 | -0.255144 |
| Lil Heaven Session IPA | -2.570021 | -0.069867 |
famd.column_coordinates_
| variable | component 0 | component 1 |
|---|---|---|
| alcohol_by_volume | 8.474727e-01 | 0.024329 |
| international_bitterness_units | 6.648504e-01 | 0.224303 |
| standard_reference_method | 3.828369e-01 | 0.386307 |
| final_gravity | 8.401140e-01 | 0.025106 |
| is_organic | 6.361371e-07 | 0.002685 |
| style | 9.994811e-01 | 0.999218 |
Visualization
famd.plot(
    dataset,
    x_component=0,
    y_component=1,
)
Contributions
(
    famd.row_contributions_
    .sort_values(0, ascending=False)
    .head(5)
    .style.format('{:.3%}')
)
| name | component 0 | component 1 |
|---|---|---|
| Agamemnon | 0.536% | 0.255% |
| Hide the Despot | 0.536% | 0.202% |
| Game of Thrones: King In the North | 0.536% | 0.202% |
| Bakery: Banana Bread | 0.536% | 0.202% |
| Entire Wood Aged Stout | 0.536% | 0.202% |
famd.column_contributions_.style.format('{:.0%}')
| variable | component 0 | component 1 |
|---|---|---|
| alcohol_by_volume | 23% | 1% |
| international_bitterness_units | 18% | 13% |
| standard_reference_method | 10% | 23% |
| final_gravity | 22% | 2% |
| is_organic | 0% | 0% |
| style | 27% | 60% |
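One way to read the column contributions is as shares of each component: within a component they should add up to roughly 100%, and the largest share marks the variable that drives that axis. The short check below uses the rounded percentages from the table above; the dictionary is typed in by hand for illustration and is not produced by a prince call.

```python
# Rounded values copied from the column contributions table (component 0, component 1).
column_contributions = {
    "alcohol_by_volume": (0.23, 0.01),
    "international_bitterness_units": (0.18, 0.13),
    "standard_reference_method": (0.10, 0.23),
    "final_gravity": (0.22, 0.02),
    "is_organic": (0.00, 0.00),
    "style": (0.27, 0.60),
}

for component in range(2):
    total = sum(values[component] for values in column_contributions.values())
    top = max(column_contributions, key=lambda name: column_contributions[name][component])
    print(f"component {component}: total {total:.0%}, largest contributor: {top}")
# component 0: total 100%, largest contributor: style
# component 1: total 99%, largest contributor: style
```

The 99% total on component 1 comes from rounding each entry to a whole percent before summing.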