
# Training parameters

## Overview

The following tables give an overview of all the parameters that can be used for training XGP. The defaults are the same regardless of where you're using XGP from (please open an issue if you notice any discrepancies). The values indicated for Go are the ones that can be passed to a `GPConfig` struct. For Python, some parameters have to be passed to the `fit` method.

The most important parameter is called `flavor`. It determines what kind of model to use and can take one of the following values (a short usage sketch follows the list):

- `vanilla`: trains a single genetic programming instance.
- `boosting`: trains a gradient boosting machine that uses genetic programming instances as weak learners.
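
As a rough illustration, here is how the two flavors might be selected from Python. This is only a sketch: the `xgp` import path and the `XGPRegressor` class name are assumptions (the tables below only mention `XGPClassifier`).

```python
# Hypothetical usage sketch; the `xgp` import path and the XGPRegressor
# class name are assumptions, not confirmed by this page.
from xgp import XGPRegressor

single_tree = XGPRegressor(flavor="vanilla")   # a single genetic programming instance
boosted = XGPRegressor(flavor="boosting")      # gradient boosting with GP weak learners
```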

## Genetic programming parameters

| Name | CLI | Go | Python | Default value |
|------|-----|----|--------|---------------|
| Loss metric; used to determine if the task is classification or regression | `loss` | `LossMetricName` | `loss_metric` | `mae` (the Python `XGPClassifier` defaults to `logloss`) |
| Evaluation metric | `eval` | `EvalMetricName` | `eval_metric` (in `fit`) | Same as the loss metric |
| Parsimony coefficient | `parsimony` | `ParsimonyCoefficient` | `parsimony_coeff` | 0.00001 |
| Polish the best program | `polish` | `PolishBest` | `polish_best` | `true` |
| Authorized functions | `funcs` | `Funcs` | `funcs` | `sum,sub,mul,div` |
| Constant minimum | `const_min` | `ConstMin` | `const_min` | -5 |
| Constant maximum | `const_max` | `ConstMax` | `const_max` | 5 |
| Constant probability | `p_const` | `PConst` | `p_const` | 0.5 |
| Full initialization probability | `p_full` | `PFull` | `p_full` | 0.5 |
| Terminal probability | `p_leaf` | `PLeaf` | `p_leaf` | 0.3 |
| Minimum height | `min_height` | `MinHeight` | `min_height` | 3 |
| Maximum height | `max_height` | `MaxHeight` | `max_height` | 5 |

Because XGP doesn't require the loss metric to be differentiable, you can use any of the available loss metrics. If you don't specify an evaluation metric, it defaults to the loss metric. XGP uses ramped half-and-half initialization: the full initialization probability determines the probability of using full initialization, and the complement gives the probability of using grow initialization.
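
For instance, using the Python parameter names from the table above, one might minimize the MAE while only monitoring the RMSE, and keep the default 50/50 split between full and grow initialization. This is a sketch assuming a scikit-learn-style estimator named `XGPRegressor` importable from an `xgp` package (both names are assumptions):

```python
import numpy as np

from xgp import XGPRegressor  # assumed import path and class name

X = np.random.rand(100, 3)
y = X[:, 0] - 2 * X[:, 1]

model = XGPRegressor(
    loss_metric="mae",  # drives the optimization; a regression metric implies a regression task
    p_full=0.5,         # ramped half-and-half: 50% full initialization, 50% grow initialization
    min_height=3,
    max_height=5,
)
# Per the table, eval_metric is passed in fit; it is only monitored, not optimized.
model.fit(X, y, eval_metric="rmse")
```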

## Genetic algorithm parameters

| Name | CLI | Go | Python | Default value |
|------|-----|----|--------|---------------|
| Number of populations | `pops` | `NPopulations` | `n_populations` | 1 |
| Number of individuals per population | `indis` | `NIndividuals` | `n_individuals` | 50 |
| Number of generations | `gens` | `NGenerations` | `n_generations` | 30 |
| Hoist mutation probability | `p_hoist_mut` | `PHoistMutation` | `p_hoist_mutation` | 0.1 |
| Subtree mutation probability | `p_sub_mut` | `PSubtreeMutation` | `p_sub_tree_mutation` | 0.1 |
| Point mutation probability | `p_point_mut` | `PPointMutation` | `p_point_mutation` | 0.1 |
| Point mutation rate | `point_mut_rate` | `PointMutationRate` | `point_mutation_rate` | 0.3 |
| Subtree crossover probability | `p_sub_cross` | `PSubtreeCrossover` | `p_sub_tree_crossover` | 0.5 |

## Ensemble learning parameters

Ensemble learning is done via the meta package. For Python and the CLI you can use the `flavor` parameter to switch regimes. For Go you have to initialize the desired struct yourself with the appropriate method (for example, initialize the `GradientBoosting` struct with the `NewGradientBoosting` method). A Python sketch of the boosting regime follows the table below.

| Name | CLI | Go | Python | Default value |
|------|-----|----|--------|---------------|
| Number of rounds | `rounds` | `nRounds` | `n_rounds` | 100 |
| Number of early stopping rounds | `early_stopping` | `nEarlyStoppingRounds` | `n_early_stopping_rounds` | 5 |
| Learning rate | `learning_rate` | `learningRate` | `learning_rate` | 0.08 |
| Use line search | `line_search` | `lineSearcher` | `line_search` | |
| Row sampling | `row_sampling` | `rowSampling` | `row_sampling` | 1 |
| Column sampling | `col_sampling` | `colSampling` | `col_sampling` | 1 |
| Use best rounds | `use_best` | `useBest` | `use_best_rounds` | |
| Monitoring frequency | `monitor_every` | `monitorEvery` | `monitor_every` | 1 |
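
As referenced above, here is a sketch of the boosting regime using the Python parameter names from this table (the `xgp` import path and `XGPRegressor` class name remain assumptions):

```python
from xgp import XGPRegressor  # assumed import path and class name

booster = XGPRegressor(
    flavor="boosting",            # switch from the vanilla regime to gradient boosting
    n_rounds=100,
    learning_rate=0.08,
    n_early_stopping_rounds=5,
    row_sampling=0.8,             # illustration: sample 80% of the rows each round (default is 1)
    col_sampling=0.8,             # illustration: sample 80% of the columns each round (default is 1)
)
```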

## Other parameters

| Name | CLI | Go | Python | Default value |
|------|-----|----|--------|---------------|
| Random number seed | `seed` | `Seed` | `seed` | Random |
| Verbose | `verbose` | `verbose` | `verbose` | |

## Loss metrics

Genetic programming directly minimises a loss metric. Because the optimization is done with a genetic algorithm the loss metric doesn't have to be differentiable. Whether the task is classification or regression is thus determined from the loss metric. This is similar to how XGBoost and LightGBM handle things.

Each loss metric has a short name that you can use whether you are working from the CLI, Go, or Python. You can also use these short names to evaluate the performance of the model. For example, you might want to optimise the ROC AUC while also keeping track of the accuracy.
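
The following sketch shows that example, optimizing the ROC AUC while tracking accuracy (assuming the `XGPClassifier` follows the signatures suggested by the tables above, and that the package is importable as `xgp`):

```python
from sklearn.datasets import make_classification

from xgp import XGPClassifier  # assumed import path

X, y = make_classification(n_samples=200, n_features=5, random_state=42)

clf = XGPClassifier(loss_metric="roc_auc")     # a classification metric implies a classification task
clf.fit(X, y, eval_metric="accuracy")          # accuracy is only reported, not optimized
```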

| Name | Short name | Task |
|------|------------|------|
| Logloss | `logloss` | Classification |
| Accuracy | `accuracy` | Classification |
| Precision | `precision` | Classification |
| Recall | `recall` | Classification |
| F1-score | `f1` | Classification |
| ROC AUC | `roc_auc` | Classification |
| Mean absolute error | `mae` | Regression |
| Mean squared error | `mse` | Regression |
| Root mean squared error | `rmse` | Regression |
| R² | `r2` | Regression |
| Absolute Pearson correlation | `pearson` | Regression |

## Operators

The following table lists all the available operators. Regardless of where XGP is being used from, functions are passed by concatenating their short names with commas. For example, to use the natural logarithm and multiplication, use `log,mul`.
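
For instance, restricting the search to addition, multiplication, and cosine might look like this from Python (same assumptions about the bindings as above):

```python
from xgp import XGPRegressor  # assumed import path and class name

# funcs takes comma-separated short names, with no spaces in between
model = XGPRegressor(funcs="add,mul,cos")
```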

Code-wise, the operators are all located in the `op` subpackage, whose goal is to provide fast implementations of each operator. For the time being, the only accelerated operators are the sum and the division, which use assembly implementations made available by gonum/floats.

| Name | Arity | Short name | Go struct |
|------|-------|------------|-----------|
| Absolute value | 1 | `abs` | `Abs` |
| Addition | 2 | `add` | `Add` |
| Cosine | 1 | `cos` | `Cos` |
| Division | 2 | `div` | `Div` |
| Inverse | 1 | `inv` | `Inv` |
| Maximum | 2 | `max` | `Max` |
| Minimum | 2 | `min` | `Min` |
| Multiplication | 2 | `mul` | `Mul` |
| Negative value | 1 | `neg` | `Neg` |
| Sine | 1 | `sin` | `Sin` |
| Square | 1 | `square` | `Square` |
| Subtraction | 2 | `sub` | `Sub` |

Safe division is used, meaning that if the denominator is 0 then the result defaults to 1.
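
In pseudocode terms, the rule described above is equivalent to the following sketch (the actual implementation lives in the Go `op` package; this is only an illustration of the behaviour):

```python
def safe_div(numerator: float, denominator: float) -> float:
    """Safe division as described above: return 1 when the denominator is 0."""
    return 1.0 if denominator == 0 else numerator / denominator
```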