PyCaret: export train and test data before and after transformation

Question

I am trying to build an ML model using PyCaret. I used the setup call below:

clf = setup(data=df.loc[:, df.columns != 'ID'], target='final_label', session_id=123,
            categorical_features=['Gender', 'Country'],
            fold_strategy='stratifiedkfold',
            fold=5, fold_shuffle=True, n_jobs=-1,
            create_clusters=False, polynomial_features=False,
            polynomial_degree=2, trigonometry_features=False, polynomial_threshold=0.1,
            remove_multicollinearity=True, multicollinearity_threshold=0.90)

This initializes the process with a list of variables, from which I want to extract transformed_train_set and transformed_test_set.

I would like to export the train and test data before and after transformation, but PyCaret does not seem to provide a way to export this data.

When I try the code below:

train_data = predict_model(rft, data=X_train, raw_score=True)
train_data['phase'] = 'train'
test_data = predict_model(rft, data=X_test, raw_score=True)
test_data['phase'] = 'test'

it throws an error:

NameError: name 'X_train' is not defined

Answer 1

Score: 2

You can retrieve the train and test data, both before and after transformation, using get_config(variable).

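from pycaret.datasets import get_data
from pycaret.classification import setup, create_model, get_config, predict_model

# Load a sample dataset and initialize the experiment with normalization enabled
data = get_data('diabetes', verbose=False)
s = setup(data, target='Class variable', session_id=123, normalize=True, verbose=False)
rf = create_model('rf')

# List all available config variables
get_config()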

X_train = get_config('X_train')
X_train_transformed = get_config('X_train_transformed')

X_test = get_config('X_test')
X_test_transformed = get_config('X_test_transformed')

train_data = predict_model(rf, data=X_train, raw_score=True)
train_data['phase'] = 'train'
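# Caution (unverified assumption): predict_model may re-apply the preprocessing
# pipeline to `data`, so already-transformed input could be transformed twice.
train_transformed_data = predict_model(rf, data=X_train_transformed, raw_score=True)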
train_transformed_data['phase'] = 'train_transformed'

test_data = predict_model(rf, data=X_test, raw_score=True)
test_data['phase'] = 'test'
test_transformed_data = predict_model(rf, data=X_test_transformed, raw_score=True)
test_transformed_data['phase'] = 'test_transformed'
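
To actually write these splits to disk, something like the following could work. This is a minimal sketch, assuming each object returned by get_config is a pandas DataFrame (and so has a to_csv method). The names export_splits and SPLIT_KEYS and the "exports" directory are hypothetical, introduced here for illustration; they are not part of PyCaret.

```python
from pathlib import Path

# Config keys used in the answer above.
SPLIT_KEYS = ("X_train", "X_train_transformed", "X_test", "X_test_transformed")


def export_splits(get_config, out_dir="exports", keys=SPLIT_KEYS):
    """Write the object returned by get_config(key) to <out_dir>/<key>.csv.

    `get_config` is any callable that maps a key to an object with a
    pandas-style `to_csv(path, index=...)` method (assumption).
    Returns a dict mapping each key to the path that was written.
    """
    out_path = Path(out_dir)
    out_path.mkdir(parents=True, exist_ok=True)

    written = {}
    for key in keys:
        frame = get_config(key)
        target = out_path / f"{key}.csv"
        # Keep the index so exported rows can be matched back to the source data.
        frame.to_csv(target, index=True)
        written[key] = target
    return written
```

After setup(...) has run, calling export_splits(get_config) with get_config imported from pycaret.classification would be expected to produce one CSV file per split. Passing the function in as an argument keeps the helper independent of which PyCaret module (classification, regression, and so on) is in use.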

huangapple
  • Posted on May 11, 2023 at 16:42:49
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76225715.html