How to build Catboost C Evaluation Library API?

Question

I need to use a CatBoost model from several programming languages, including Golang and Python. The best option for performance and compatibility is an evaluation library, which is available as a C or C++ API. I followed the official documentation to compile the C API, but there were a lot of problems to solve before it worked.

These are the problems we encountered while trying to create the evaluation library in C:

error: variable has incomplete type 'ModelCalcerHandle' (aka 'void')
    ModelCalcerHandle modelHandle;
c_wrapper.c:16:13: warning: incompatible pointer types passing 'float (*)[3]' to parameter of type 'const float **' [-Wincompatible-pointer-types]
            &floatFeatures, 3,
            ^~~~~~~~~~~~~~
/Users/eli/workspace/test_c_api/catboost/catboost/libs/model_interface/c_api.h:151:19: note: passing argument to parameter 'floatFeatures' here
    const float** floatFeatures, size_t floatFeaturesSize,
                  ^
c_wrapper.c:17:13: warning: incompatible pointer types passing 'char *(*)[4]' to parameter of type 'const char ***' [-Wincompatible-pointer-types]
            &catFeatures, 4,
            ^~~~~~~~~~~~
/Users/eli/workspace/test_c_api/catboost/catboost/libs/model_interface/c_api.h:152:19: note: passing argument to parameter 'catFeatures' here
    const char*** catFeatures, size_t catFeaturesSize,
                  ^
c_wrapper.c:18:13: warning: incompatible pointer types passing 'double (*)[1]' to parameter of type 'double *' [-Wincompatible-pointer-types]
            &result, 1
            ^~~~~~~
/Users/eli/workspace/test_c_api/catboost/catboost/libs/model_interface/c_api.h:153:13: note: passing argument to parameter 'result' here
    double* result, size_t resultSize);

Solution:

  1. We have solved problem #1 by redefining the modelHandle variable as:
ModelCalcerHandle *modelHandle = ModelCalcerCreate();
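
The first error follows from the compiler note "(aka 'void')": ModelCalcerHandle is an alias for void, so a variable of that type cannot exist, only a pointer to it. A minimal sketch of the same opaque-handle pattern is shown here. All names (DemoHandle, DemoCreate, DemoSetType, DemoGetType, DemoDelete) are hypothetical stand-ins, not part of the CatBoost API:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for an opaque handle type: callers only ever
 * hold a pointer to it, never the object itself. */
typedef void DemoHandle;

/* Hidden implementation detail; a real library keeps this private. */
struct DemoState {
    int prediction_type;
};

/* Allocates the hidden state and returns it as an opaque pointer
 * (NULL if the allocation fails). */
DemoHandle *DemoCreate(void) {
    struct DemoState *state = calloc(1, sizeof *state);
    return state;
}

/* Stores a setting inside the hidden state. */
void DemoSetType(DemoHandle *handle, int type) {
    assert(handle != NULL);
    ((struct DemoState *)handle)->prediction_type = type;
}

/* Reads the setting back from the hidden state. */
int DemoGetType(DemoHandle *handle) {
    assert(handle != NULL);
    return ((struct DemoState *)handle)->prediction_type;
}

/* Releases the hidden state. */
void DemoDelete(DemoHandle *handle) {
    free(handle);
}

/* DemoHandle h;                 -> error: variable has incomplete type 'void' */
/* DemoHandle *h = DemoCreate(); -> OK: a pointer to the opaque type           */
```

Declaring the variable as a pointer and initializing it from the create function is the only form that compiles with this pattern, which matches the fix above.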

After this change it was possible to compile the C program, but we got a new error:

[1]    6489 segmentation fault  ./program
  2. The segmentation fault is related to the pointer-type warnings listed above (problem #2). We had to redefine the variables to solve it:
float floatFeaturesRaw[100];
const float *floatFeatures = floatFeaturesRaw;
const char *catFeaturesRaw[2] = {"1", "2"};
const char **catFeatures = catFeaturesRaw;
double resultRaw[1];
double *result = resultRaw;

and

if (!CalcModelPredictionSingle(
        modelHandle,
        &floatFeatures, 3,
        &catFeatures, 4,
        result, 1)) //We remove `&`
{
   printf("CalcModelPrediction error message: %s\n", GetErrorString());
}
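
The warnings arise because taking the address of an array yields a pointer to the whole array (for example 'float (*)[3]'), not the pointer-to-pointer that a 'const float **' parameter expects. The sketch below shows which argument shapes compile cleanly. The functions sum_single, sum_batch and demo_total are hypothetical stand-ins for a single-row and a multi-row calling convention, not CatBoost functions:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical single-row function: takes the feature values directly. */
float sum_single(const float *features, size_t n) {
    assert(features != NULL);
    float total = 0.0f;
    for (size_t i = 0; i < n; ++i) total += features[i];
    return total;
}

/* Hypothetical batch function: takes one row pointer per document. */
float sum_batch(const float **rows, size_t row_count, size_t n) {
    assert(rows != NULL);
    float total = 0.0f;
    for (size_t r = 0; r < row_count; ++r) total += sum_single(rows[r], n);
    return total;
}

/* Exercises the argument shapes that compile without warnings. */
float demo_total(void) {
    float raw[3] = {0.0f, 89.0f, 1.0f};
    const float *row = raw;       /* array decays to a pointer to its first element */
    const float *rows[1] = {raw}; /* table of row pointers, one per document        */

    /* sum_batch(&raw, 1, 3) would pass 'float (*)[3]', the type named in the warning. */
    return sum_single(row, 3) + sum_batch(&row, 1, 3) + sum_batch(rows, 1, 3);
}
```

Introducing the intermediate pointer variable (row here, floatFeatures in the question) is what makes '&row' a genuine pointer to a pointer.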

I'll add the complete solution, from code fixes to how to compile C code, in a comment.

Answer 1

Score: 2

Here is the complete solution:

  1. Clone catboost repo:

git clone https://github.com/catboost/catboost.git

  2. Open the catboost directory in your local copy of the CatBoost repository.

  3. Build the evaluation library (I chose the shared library, but you can pick whichever you need). In my case I had to change the --target-platform argument because I was using an M1 Mac with macOS Ventura 13.1 and clang 14.0.0:

./ya make -r catboost/libs/model_interface --target-platform CLANG14-DARWIN-ARM64
  4. Create the C file. Fixed C sample code:
#include <stdio.h>
#include <c_api.h>

int main()
{
    /* One record: 3 numeric features and 4 categorical features. */
    float floatFeaturesRaw[3] = {0, 89, 1};
    const float *floatFeatures = floatFeaturesRaw;
    const char *catFeaturesRaw[4] = {"Others", "443_HTTPS", "6", "24"};
    const char **catFeatures = catFeaturesRaw;
    /* Output buffer: one value per class. */
    double resultRaw[4];
    double *result = resultRaw;

    ModelCalcerHandle *modelHandle = ModelCalcerCreate();
    if (!LoadFullModelFromFile(modelHandle, "catboost_model"))
    {
        printf("LoadFullModelFromFile error message: %s\n", GetErrorString());
        ModelCalcerDelete(modelHandle);
        return 1; /* No model loaded, so there is nothing to predict with. */
    }
    /* Ask for probabilities instead of raw formula values. */
    SetPredictionType(modelHandle, APT_PROBABILITY);
    if (!CalcModelPredictionSingle(
            modelHandle,
            floatFeatures, 3,
            catFeatures, 4,
            result, 4))
    {
        printf("CalcModelPrediction error message: %s\n", GetErrorString());
        ModelCalcerDelete(modelHandle);
        return 1; /* The result buffer was not filled; do not print it. */
    }
    printf("%f\n", result[0]);
    printf("%f\n", result[1]);
    printf("%f\n", result[2]);
    printf("%f\n", result[3]);
    ModelCalcerDelete(modelHandle);
    return 0;
}

Things to consider:

  • I set the prediction type to APT_PROBABILITY with SetPredictionType.
  • Our model predicts multiple classes, so the result array needs one slot per class (result[4]).
  • We only need to predict one record at a time, so we use the CalcModelPredictionSingle function.
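
With a probability prediction type and a multi-class model, the result buffer is expected to hold one probability per class. A minimal sketch of turning that buffer into a class index follows. The helper most_likely_class is hypothetical and not part of the CatBoost API:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: returns the index of the largest probability.
 * Ties resolve to the lowest index. */
size_t most_likely_class(const double *probabilities, size_t class_count) {
    assert(probabilities != NULL && class_count > 0);
    size_t best = 0;
    for (size_t i = 1; i < class_count; ++i) {
        if (probabilities[i] > probabilities[best]) best = i;
    }
    return best;
}
```

After a successful prediction call, most_likely_class(result, 4) would give the position of the predicted class. Mapping that position back to a class label depends on how the model was trained.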
  5. Compile the C code:
gcc -v -o program.out c_code.c -l catboostmodel -I /path/to/catboost/repo/catboost/catboost/libs/model_interface/ -L /path/to/catboost/repo/catboost/catboost/libs/model_interface/

IMPORTANT: Make sure that no warning or error messages have been displayed.

  6. Now you can run it:

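IMPORTANT: Make sure the catboost model file is in the same directory as program.out and run the program from that directory, because the relative path "catboost_model" is resolved against the current working directory.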

./program.out

huangapple
  • Posted on 2022-12-30 22:11:36
  • When reposting, please keep the link to this article: https://go.coder-hub.com/74962479.html