Go usage
Installation
Once you have installed Go, you can install XGO like any other Go package.
go get github.com/MaxHalford/xgp
Usage
Although the full API is available on godoc, only a subset of it is relevant if all you want to do is train a program on a dataset.
Instantiation
The core struct for learning in XGP is the GP
. A GP
encapsulates all the logic for generating, evaluating, and evolving programs. Although you can instantiate an GP
directly, you can (and should) do it by instantiating a GPConfig
and calling it's NewGP
method. You can also use the NewDefaultGPConfig
method to instantiate a GPConfig
with the default values outlines in the training parameters section. Even if you don't want to use the default values, it's a good idea to use NewDefaultGPConfig
and then to set the fields you want to modify afterwards.
var config = NewDefaultGPConfig() config.LossMetric = metrics.Accuracy{} config.Individuals = 42 config.Funcs = "cos,sin,exp" var estimator = config.NewGP()
The GPConfig
struct fields exactly match the ones indicated in the training parameters section.
Training
Once you have an GP
, you're ready to call to it's Fit
method to train it on a dataset. Here is the signature of the Fit
method:
func (est *GP) Fit( // Required arguments X [][]float64, Y []float64, // Optional arguments (can safely be nil) W []float64, XVal [][]float64, YVal []float64, WVal []float64, verbose bool, ) error
Just like for the CLI, the only required arguments to the GP
's Fit
method are a matrix of features X
and a list of targets Y
. W
can be used to weight the samples in X
during program evaluation, which is particularly useful for higher-level learning algorithms such as boosting. One important thing to notice is that X
and XVal
should be ordered column-wise; that is X[0]
should access the first column in the dataset, not the first row.
Warning
For the while XGP does not handle categorical data. You should preemptively encode the categorical features in your dataset before feeding it to XGP. The recommended way is to use label encoding for ordinal data and one-hot encoding for non-ordinal data.
Warning
For the while XGP does not handle missing values.
Like the val_set
argument in the CLI, XVal
, YVal
, and WVal
can be used to track the performance of the best program on out-of-bag data. notifyEvery
can be used to indicate at what frequency (in terms of genetic algorithm generations) progress should be displayed.
You can extract the best obtained Program
with the BestProgram
method.
var best = gp.BestProgram()
Finally the Fit
method returns an error which you should handle.
Prediction
Once the Fit
method has been called, the Predict
method can be used to make predictions given a set of features.
func (est GP) Predict(X [][]float64, predictProba bool) ([]float64, error)
The columns in X
should be ordered in the same way as in the training set. The proba
argument can be used to indicate if probabilities should be returned in the case of classification.