Catboost model could be saved as standalone C++ code. This can ease an integration of a generated model into an application built from C++ sources, simplify porting the model to an architecture not direcly supported by CatBoost (eq. ARM), or allow manual exploration and editing of the model parameters by advanced users.
The exported model code contains complete data for the current trained model and apply_catboost_model() function which applies the model to a given dataset. The only current dependency for the code is CityHash library (NOTE: The exact revision under the link is required).
catboost fit --model-format CPP <other_fit_parameters>
By default model is saved into model.cpp file. One could alter the output name using -m key. If there is more that one model-format specified, then the .cpp extention will be added to the name provided after -m key.
model = CatBoost(<train_params>)
model.fit(train_pool)
model.save_model(OUTPUT_CPP_MODEL_PATH, format="CPP")
If the model was trained using only numerical features (no cat features), then the application function in generated code will have the following interface:
double ApplyCatboostModel(const std::vector<float>& features);
parameter | description |
---|---|
features | features of a single document to make prediction |
Prediction of the model for the document with given features.
The result is equivalent to the code below except it won't require linking of libcatboostmodel.<so|dll|dylib>.
#include <catboost/libs/model_interface/wrapped_calcer.h>
double ApplyCatboostModel(const std::vector<float>& features) {
ModelCalcerWrapper calcer("model.cbm");
return calcer.Calc(features, {});
}
C++11 support of non-static data member initializers and extended initializer lists
If the model was trained with categorical features present, then the application function in output code will be generated with the following interface:
double ApplyCatboostModel(const std::vector<float>& floatFeatures, const std::vector<std::string>& catFeatures);
parameter | description |
---|---|
floatFeatures | numerical features of a single document |
catFeatures | categorical features of a single document |
NOTE: You need to pass float and categorical features separately in the same order they appeared in the train dataset. For example if you had features f1,f2,f3,f4, where f2 and f4 were considered categorical, you need to pass here floatFeatures = {f1, f3}, catFeatures = {f2, f4}.
Prediction of the model for the document with given features.
The result is equivalent to the code below except it won't require linking of libcatboostmodel.<so|dll|dylib>.
#include <catboost/libs/model_interface/wrapped_calcer.h>
double ApplyCatboostModel(const std::vector<float>& floatFeatures, const std::vector<std::string>& catFeatures) {
ModelCalcerWrapper calcer("model.cbm");
return calcer.Calc(floatFeatures, catFeatures);
}
C++14 compiler with aggregate member initialization support. Tested compilers: g++ 5(5.4.1 20160904), clang++ 3.8.
- MultiClassification models are not supported.
- applyCatboostModel() function has reference implementation and may lack of performance comparing to native applicator of CatBoost, especially on large models and multiple of documents.
Q: Generated model results differ from native model when categorical features present A: Please check that CityHash version 1 is used. Exact required revision of C++ Google CityHash library. There is also proper CityHash implementation in Catboost repository. This is due other versions of CityHash may produce different hash code for the same string.