How to build Catboost C Evaluation Library API?
Question
I had to use a CatBoost model from several programming languages, including Golang and Python. The best option for performance and compatibility is an evaluation library with a C or C++ API. I followed the official documentation to compile the C API, but had to solve several problems to get it working.
These are the problems we encountered while trying to create the evaluation library in C:
error: variable has incomplete type 'ModelCalcerHandle' (aka 'void')
ModelCalcerHandle modelHandle;
c_wrapper.c:16:13: warning: incompatible pointer types passing 'float (*)[3]' to parameter of type 'const float **' [-Wincompatible-pointer-types]
&floatFeatures, 3,
^~~~~~~~~~~~~~
/Users/eli/workspace/test_c_api/catboost/catboost/libs/model_interface/c_api.h:151:19: note: passing argument to parameter 'floatFeatures' here
const float** floatFeatures, size_t floatFeaturesSize,
^
c_wrapper.c:17:13: warning: incompatible pointer types passing 'char *(*)[4]' to parameter of type 'const char ***' [-Wincompatible-pointer-types]
&catFeatures, 4,
^~~~~~~~~~~~
/Users/eli/workspace/test_c_api/catboost/catboost/libs/model_interface/c_api.h:152:19: note: passing argument to parameter 'catFeatures' here
const char*** catFeatures, size_t catFeaturesSize,
^
c_wrapper.c:18:13: warning: incompatible pointer types passing 'double (*)[1]' to parameter of type 'double *' [-Wincompatible-pointer-types]
&result, 1
^~~~~~~
/Users/eli/workspace/test_c_api/catboost/catboost/libs/model_interface/c_api.h:153:13: note: passing argument to parameter 'result' here
double* result, size_t resultSize);
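The first error can be reproduced without CatBoost. The compiler message says ModelCalcerHandle is an alias for void, so a variable of that type cannot be declared, while a pointer to it can. The sketch below imitates that opaque-handle pattern; CounterHandle, struct Counter and the Counter* functions are hypothetical names invented for illustration and are not part of the CatBoost API.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical opaque handle: an alias for void, as the error message
   reports for ModelCalcerHandle. Only pointers to it can be declared. */
typedef void CounterHandle;

/* Private state hidden behind the handle (illustrative only). */
struct Counter {
    int value;
};

/* Allocate the hidden state and return it as an opaque pointer.
   Returns NULL if the allocation fails. */
CounterHandle *CounterCreate(void) {
    struct Counter *c = malloc(sizeof *c);
    if (c != NULL) {
        c->value = 0;
    }
    return c;
}

/* Increment the hidden counter and return its new value. */
int CounterIncrement(CounterHandle *handle) {
    assert(handle != NULL);
    struct Counter *c = handle; /* void * converts implicitly in C */
    return ++c->value;
}

/* Release the hidden state. */
void CounterDelete(CounterHandle *handle) {
    free(handle);
}

/* CounterHandle h;                    would not compile: incomplete type */
/* CounterHandle *h = CounterCreate(); compiles: a pointer is complete    */
```

A caller would hold only the pointer, for example `CounterHandle *h = CounterCreate();`, pass it to every call, and finish with `CounterDelete(h);`.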
Solution:
- We solved problem #1 (the incomplete-type error) by redefining the modelHandle variable as a pointer:
ModelCalcerHandle *modelHandle = ModelCalcerCreate();
After this change the C program compiled, but running it produced a new error:
[1] 6489 segmentation fault ./program
- The segmentation fault is related to the warnings listed under problem #2 (the incompatible pointer types). We had to redefine the variables to solve it:
float floatFeaturesRaw[100];
const float *floatFeatures = floatFeaturesRaw;
const char *catFeaturesRaw[2] = {"1", "2"};
const char **catFeatures = catFeaturesRaw;
double resultRaw[1];
double *result = resultRaw;
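and
if (!CalcModelPredictionSingle(
        modelHandle,
        floatFeatures, 3,   // the single-record call takes the pointers directly, without `&`
        catFeatures, 4,     // each count must not exceed the length of the array it describes
        result, 1))         // we removed `&`
{
    printf("CalcModelPrediction error message: %s\n", GetErrorString());
}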
I'll add the complete solution, from code fixes to how to compile C code, in a comment.
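The pointer-type warnings can also be illustrated in isolation. Taking the address of an array yields a pointer to the whole array (`float (*)[3]`), which is not the same as a pointer to a row pointer (`const float **`). A callee that expects row pointers would then read the float data as if it were an address, which is a plausible cause of the segmentation fault. The sketch below uses two hypothetical functions, SumRow and SumBatch, that only mimic the single-record and batch parameter shapes; they are not CatBoost functions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical single-record function: one flat array of features. */
float SumRow(const float *features, size_t count) {
    assert(features != NULL || count == 0);
    float total = 0.0f;
    for (size_t i = 0; i < count; ++i) {
        total += features[i];
    }
    return total;
}

/* Hypothetical batch function: an array of row pointers, the
   `const float **` shape named in the compiler note. */
float SumBatch(const float **rows, size_t rowCount, size_t count) {
    assert(rows != NULL || rowCount == 0);
    float total = 0.0f;
    for (size_t r = 0; r < rowCount; ++r) {
        total += SumRow(rows[r], count);
    }
    return total;
}

/* Usage, given: float raw[3] = {0.0f, 89.0f, 1.0f};
 *
 *   const float *row = raw;    the array decays to a pointer to its first element
 *   SumRow(row, 3);            single record: pass the pointer itself
 *   SumBatch(&row, 1, 3);      batch of one: pass the address of the row pointer
 *
 *   SumBatch(&raw, 1, 3);      wrong: &raw is float (*)[3], so the callee
 *                              would treat float data as a pointer
 */
```

The intermediate variable `row` is what makes `&row` a valid `const float **`; the same reasoning applies one level up to the `const char **` and `const char ***` parameters for categorical features.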
Answer 1
Score: 2
Here is the complete solution:
- Clone the catboost repo:
git clone https://github.com/catboost/catboost.git
- Open the catboost directory from the local copy of the CatBoost repository.
- Build the evaluation library (I chose the shared library, but you can select what you need). In my case I had to change the --target-platform argument because I was using an M1 Mac with macOS Ventura 13.1 and clang 14.0.0:
./ya make -r catboost/libs/model_interface --target-platform CLANG14-DARWIN-ARM64
- Create the C file. Fixed C sample code:
#include <stdio.h>
#include <c_api.h>

int main(void)
{
    /* Three numeric and four categorical feature values for one record. */
    float floatFeaturesRaw[3] = {0, 89, 1};
    const float *floatFeatures = floatFeaturesRaw;
    const char *catFeaturesRaw[4] = {"Others", "443_HTTPS", "6", "24"};
    const char **catFeatures = catFeaturesRaw;
    /* One probability per class; this model predicts four classes. */
    double resultRaw[4];
    double *result = resultRaw;

    ModelCalcerHandle *modelHandle = ModelCalcerCreate();
    if (!LoadFullModelFromFile(modelHandle, "catboost_model"))
    {
        printf("LoadFullModelFromFile error message: %s\n", GetErrorString());
        ModelCalcerDelete(modelHandle);
        return 1; /* nothing to predict without a model */
    }
    SetPredictionType(modelHandle, APT_PROBABILITY);
    if (!CalcModelPredictionSingle(
            modelHandle,
            floatFeatures, 3,
            catFeatures, 4,
            result, 4))
    {
        printf("CalcModelPrediction error message: %s\n", GetErrorString());
        ModelCalcerDelete(modelHandle);
        return 1; /* result was not filled in */
    }
    for (int i = 0; i < 4; ++i)
    {
        printf("%f\n", result[i]);
    }
    ModelCalcerDelete(modelHandle);
    return 0;
}
To consider:
- I set the prediction type to APT_PROBABILITY with SetPredictionType.
- Our model predicts multiple classes, so the result buffer holds four values (result[4]).
- We only need to predict one record at a time, so we use the CalcModelPredictionSingle function.
- Compile the C code:
gcc -v -o program.out c_code.c -l catboostmodel -I /path/to/catboost/repo/catboost/catboost/libs/model_interface/ -L /path/to/catboost/repo/catboost/catboost/libs/model_interface/
IMPORTANT: Make sure that no warning or error messages have been displayed.
- Now you can run it:
IMPORTANT: Make sure the CatBoost model file is in the same directory as program.out.
./program.out
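The program prints one probability per class. If the caller only needs the most likely class, the result buffer can be reduced to an index. The helper below is a minimal sketch; ArgMax is a hypothetical name and not part of the CatBoost API, and mapping the returned index back to a class label depends on how the model was trained.

```c
#include <assert.h>
#include <stddef.h>

/* Return the index of the largest value in probs[0..count-1].
   Ties resolve to the lowest index. count must be at least 1. */
size_t ArgMax(const double *probs, size_t count) {
    assert(probs != NULL && count > 0);
    size_t best = 0;
    for (size_t i = 1; i < count; ++i) {
        if (probs[i] > probs[best]) {
            best = i;
        }
    }
    return best;
}
```

In the sample program this could be called as `ArgMax(result, 4)` once CalcModelPredictionSingle has succeeded.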