
# Training parameters

## Overview

The following tables give an overview of all the parameters that can be used for training XGP. The defaults are the same regardless of where you're using XGP from (please open an issue if you notice any discrepancies). The values indicated for Go are the ones that can be passed to a `GPConfig` struct. For Python, some parameters have to be passed to the `fit` method.

The most important parameter is called `flavor`. It determines what kind of model to use and can take one of the following values (a short usage sketch follows the list):

- `vanilla`: trains a single genetic programming instance.
- `boosting`: trains a gradient boosting machine that uses genetic programming instances as weak learners.
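
As a rough illustration, here is how the two flavors might be selected from Python. This is only a sketch: the `xgp` import path and the `XGPRegressor` class name are assumptions (the tables below only mention `XGPClassifier`).

```python
# Hypothetical usage sketch; the `xgp` import path and the XGPRegressor
# class name are assumptions, not confirmed by this page.
from xgp import XGPRegressor

single_tree = XGPRegressor(flavor="vanilla")   # a single genetic programming instance
boosted = XGPRegressor(flavor="boosting")      # gradient boosting with GP weak learners
```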

## Genetic programming parameters

| Name | CLI | Go | Python | Default value |
|------|-----|----|--------|---------------|
| Loss metric; used to determine if the task is classification or regression | `loss` | `LossMetricName` | `loss_metric` | `mae` (the Python `XGPClassifier` defaults to `logloss`) |
| Evaluation metric | `eval` | `EvalMetricName` | `eval_metric` (in `fit`) | Same as the loss metric |
| Parsimony coefficient | `parsimony` | `ParsimonyCoefficient` | `parsimony_coeff` | 0.00001 |
| Polish the best program | `polish` | `PolishBest` | `polish_best` | `true` |
| Authorized functions | `funcs` | `Funcs` | `funcs` | `sum,sub,mul,div` |
| Constant minimum | `const_min` | `ConstMin` | `const_min` | -5 |
| Constant maximum | `const_max` | `ConstMax` | `const_max` | 5 |
| Constant probability | `p_const` | `PConst` | `p_const` | 0.5 |
| Full initialization probability | `p_full` | `PFull` | `p_full` | 0.5 |
| Terminal probability | `p_leaf` | `PLeaf` | `p_leaf` | 0.3 |
| Minimum height | `min_height` | `MinHeight` | `min_height` | 3 |
| Maximum height | `max_height` | `MaxHeight` | `max_height` | 5 |

Because XGP doesn't require the loss metric to be differentiable, you can use any of the available loss metrics. If you don't specify an evaluation metric, it defaults to the loss metric. XGP uses ramped half-and-half initialization: the full initialization probability determines the probability of using full initialization, and the complement gives the probability of using grow initialization.
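
For instance, using the Python parameter names from the table above, one might minimize the MAE while only monitoring the RMSE, and keep the default 50/50 split between full and grow initialization. This is a sketch assuming a scikit-learn-style estimator named `XGPRegressor` importable from an `xgp` package (both names are assumptions):

```python
import numpy as np

from xgp import XGPRegressor  # assumed import path and class name

X = np.random.rand(100, 3)
y = X[:, 0] - 2 * X[:, 1]

model = XGPRegressor(
    loss_metric="mae",  # drives the optimization; a regression metric implies a regression task
    p_full=0.5,         # ramped half-and-half: 50% full initialization, 50% grow initialization
    min_height=3,
    max_height=5,
)
# Per the table, eval_metric is passed in fit; it is only monitored, not optimized.
model.fit(X, y, eval_metric="rmse")
```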

## Genetic algorithm parameters

| Name | CLI | Go | Python | Default value |
|------|-----|----|--------|---------------|
| Number of populations | `pops` | `NPopulations` | `n_populations` | 1 |
| Number of individuals per population | `indis` | `NIndividuals` | `n_individuals` | 50 |
| Number of generations | `gens` | `NGenerations` | `n_generations` | 30 |
| Hoist mutation probability | `p_hoist_mut` | `PHoistMutation` | `p_hoist_mutation` | 0.1 |
| Subtree mutation probability | `p_sub_mut` | `PSubtreeMutation` | `p_sub_tree_mutation` | 0.1 |
| Point mutation probability | `p_point_mut` | `PPointMutation` | `p_point_mutation` | 0.1 |
| Point mutation rate | `point_mut_rate` | `PointMutationRate` | `point_mutation_rate` | 0.3 |
| Subtree crossover probability | `p_sub_cross` | `PSubtreeCrossover` | `p_sub_tree_crossover` | 0.5 |

## Ensemble learning parameters

Ensemble learning is done via the meta package. For Python and the CLI you can use the `flavor` parameter to switch regimes. For Go you have to initialize the desired struct yourself with the appropriate method (for example, initialize the `GradientBoosting` struct with the `NewGradientBoosting` method). A Python sketch of the boosting regime follows the table below.

| Name | CLI | Go | Python | Default value |
|------|-----|----|--------|---------------|
| Number of rounds | `rounds` | `nRounds` | `n_rounds` | 100 |
| Number of early stopping rounds | `early_stopping` | `nEarlyStoppingRounds` | `n_early_stopping_rounds` | 5 |
| Learning rate | `learning_rate` | `learningRate` | `learning_rate` | 0.08 |
| Use line search | `line_search` | `lineSearcher` | `line_search` | |
| Row sampling | `row_sampling` | `rowSampling` | `row_sampling` | 1 |
| Column sampling | `col_sampling` | `colSampling` | `col_sampling` | 1 |
| Use best rounds | `use_best` | `useBest` | `use_best_rounds` | |
| Monitoring frequency | `monitor_every` | `monitorEvery` | `monitor_every` | 1 |
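
As referenced above, here is a sketch of the boosting regime using the Python parameter names from this table (the `xgp` import path and `XGPRegressor` class name remain assumptions):

```python
from xgp import XGPRegressor  # assumed import path and class name

booster = XGPRegressor(
    flavor="boosting",            # switch from the vanilla regime to gradient boosting
    n_rounds=100,
    learning_rate=0.08,
    n_early_stopping_rounds=5,
    row_sampling=0.8,             # illustration: sample 80% of the rows each round (default is 1)
    col_sampling=0.8,             # illustration: sample 80% of the columns each round (default is 1)
)
```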

## Other parameters

| Name | CLI | Go | Python | Default value |
|------|-----|----|--------|---------------|
| Random number seed | `seed` | `Seed` | `seed` | Random |
| Verbose | `verbose` | `verbose` | `verbose` | |

## Loss metrics

Genetic programming directly minimises a loss metric. Because the optimization is done with a genetic algorithm the loss metric doesn't have to be differentiable. Whether the task is classification or regression is thus determined from the loss metric. This is similar to how XGBoost and LightGBM handle things.

Each loss metric has a short name that you can use whether you are working from the CLI, Go, or Python. You can also use these short names to evaluate the performance of the model. For example, you might want to optimise the ROC AUC while also keeping track of the accuracy.
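
The following sketch shows that example, optimizing the ROC AUC while tracking accuracy (assuming the `XGPClassifier` follows the signatures suggested by the tables above, and that the package is importable as `xgp`):

```python
from sklearn.datasets import make_classification

from xgp import XGPClassifier  # assumed import path

X, y = make_classification(n_samples=200, n_features=5, random_state=42)

clf = XGPClassifier(loss_metric="roc_auc")     # a classification metric implies a classification task
clf.fit(X, y, eval_metric="accuracy")          # accuracy is only reported, not optimized
```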

| Name | Short name | Task |
|------|------------|------|
| Logloss | `logloss` | Classification |
| Accuracy | `accuracy` | Classification |
| Precision | `precision` | Classification |
| Recall | `recall` | Classification |
| F1-score | `f1` | Classification |
| ROC AUC | `roc_auc` | Classification |
| Mean absolute error | `mae` | Regression |
| Mean squared error | `mse` | Regression |
| Root mean squared error | `rmse` | Regression |
| R² | `r2` | Regression |
| Absolute Pearson correlation | `pearson` | Regression |

## Operators

The following table lists all the available operators. Regardless of where XGP is being used from, functions are passed by concatenating their short names with commas. For example, to use the natural logarithm and multiplication, use `log,mul`.
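
For instance, restricting the search to addition, multiplication, and cosine might look like this from Python (same assumptions about the bindings as above):

```python
from xgp import XGPRegressor  # assumed import path and class name

# funcs takes comma-separated short names, with no spaces in between
model = XGPRegressor(funcs="add,mul,cos")
```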

Code-wise, the operators are all located in the `op` subpackage, whose goal is to provide fast implementations of each operator. For the time being, the only accelerated operators are the sum and the division, which use assembly implementations made available by gonum/floats.

| Name | Arity | Short name | Go struct |
|------|-------|------------|-----------|
| Absolute value | 1 | `abs` | `Abs` |
| Addition | 2 | `add` | `Add` |
| Cosine | 1 | `cos` | `Cos` |
| Division | 2 | `div` | `Div` |
| Inverse | 1 | `inv` | `Inv` |
| Maximum | 2 | `max` | `Max` |
| Minimum | 2 | `min` | `Min` |
| Multiplication | 2 | `mul` | `Mul` |
| Negative value | 1 | `neg` | `Neg` |
| Sine | 1 | `sin` | `Sin` |
| Square | 1 | `square` | `Square` |
| Subtraction | 2 | `sub` | `Sub` |

Safe division is used, meaning that if the denominator is 0 then the result defaults to 1.
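
In pseudocode terms, the rule described above is equivalent to the following sketch (the actual implementation lives in the Go `op` package; this is only an illustration of the behaviour):

```python
def safe_div(numerator: float, denominator: float) -> float:
    """Safe division as described above: return 1 when the denominator is 0."""
    return 1.0 if denominator == 0 else numerator / denominator
```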