Trying to predict probability score using test data

Question

import numpy as np

# Building test data where member has PULL
x_test = {
    "cc_list": ["PULL"],
    "age": 38,
    "sex": "M"
}

# Defining parameters from training data from previous model
PARAM_COLLECTION = {
    "PULL": {
        "auc": 0.8202432743081695,
        "coef": [-0.01853237366699478, 0.14359336438414397, 3.0070029131017155, 1.4999028794882714, 0.2499927123452168, 0.00869006612608888, -0.17741710091314503],
        "features_sltd": ["CARM", "GIL", "PULL", "PULM", "SKCVL", "age", "sex"],
        "intercept": -3.066213895858403,
        "model_name": "l1-reg",
        "regularization_param": 100000.0,
        "threshold": 0.5277152026373001
    }
}

# Trying to predict the probability score here
y = {}
coll_name = "PULL"
param_coll = PARAM_COLLECTION[coll_name]

for cc in x_test["cc_list"]:
    if cc not in param_coll:
        continue
    param = param_coll[cc]
    if param["model_name"] == "none":
        continue
    features_sltd = param["features_sltd"]
    features_efft = []
    x_vec = np.zeros(len(features_sltd))
    for i, f in enumerate(features_sltd):
        if f in x_test["cc_list"]:
            x_vec[i] = 1.0
            features_efft.append((f, param["coef"][i]))
    features_efft = sorted(features_efft, key=lambda x: -x[1])
    features_efft = [f[0] for f in features_efft if f[1] > 0.1]   
    if len(features_efft)==0:
        continue
    x_vec[features_sltd.index("age")] = x_test["age"] 
    x_vec[features_sltd.index("sex")] = int(x_test["sex"]=="M")
    beta = np.dot(np.array(param["coef"]), x_vec) + param["intercept"]
    proba = 1.0/(1.0 + np.exp(-beta))
    if proba > param["threshold"]:
        y[cc] = {"score": np.clip(proba, 0.0, 1.0), "features": features_efft}
    else:
        y[cc] = {"score": 0.0, "features": []}

# Print the output
print(y)


I'm trying to test how individual features affect the probability score of a regression model we've built. Specifically, I want to see how age affects the score, to decide whether we need to retrain the model. I'm using the parameters from our model as `PARAM_COLLECTION`, and test data for `age`, `sex`, and `cc_list`. I thought the code would work, but for the life of me I can't figure out why `y` comes out empty: given the final `if`/`else`, it should still give me a score even when the probability is not above the threshold.

# Answer 1
**Score**: 0
The only `cc` value you have is `PULL`, which is never a key in your `param_coll`, so the loop never gets past the first `if` statement. `param_coll` is already `PARAM_COLLECTION["PULL"]`, whose keys are `auc`, `coef`, `features_sltd`, and so on, so `cc not in param_coll` is true, `continue` runs, and `y` stays empty.
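
A minimal sketch of one way to address this, assuming the intent was to look up each condition's parameters by name in the top-level `PARAM_COLLECTION` rather than inside the `PULL` entry. The `predict` function and the extra `proba` field are names made up for this illustration and are not part of the original code; the unused `auc` and `regularization_param` entries are left out for brevity.

```python
import numpy as np

# Model parameters copied from the question (unused keys omitted).
PARAM_COLLECTION = {
    "PULL": {
        "coef": [-0.01853237366699478, 0.14359336438414397, 3.0070029131017155,
                 1.4999028794882714, 0.2499927123452168, 0.00869006612608888,
                 -0.17741710091314503],
        "features_sltd": ["CARM", "GIL", "PULL", "PULM", "SKCVL", "age", "sex"],
        "intercept": -3.066213895858403,
        "model_name": "l1-reg",
        "threshold": 0.5277152026373001,
    }
}


def predict(x, param_collection):
    """Score every condition in x["cc_list"] that has a model (hypothetical helper)."""
    y = {}
    for cc in x["cc_list"]:
        # Look the condition up in the top-level collection,
        # not inside a single model's parameter dict.
        if cc not in param_collection:
            continue
        param = param_collection[cc]
        if param["model_name"] == "none":
            continue
        features = param["features_sltd"]
        coef = param["coef"]

        # Conditions present in cc_list, strongest coefficient first, kept if > 0.1.
        present = [(f, c) for f, c in zip(features, coef) if f in x["cc_list"]]
        effective = [f for f, c in sorted(present, key=lambda fc: -fc[1]) if c > 0.1]
        if not effective:
            continue

        # One-hot encode the conditions, then fill in age and sex.
        x_vec = np.array([1.0 if f in x["cc_list"] else 0.0 for f in features])
        x_vec[features.index("age")] = x["age"]
        x_vec[features.index("sex")] = float(x["sex"] == "M")

        # Logistic regression: sigmoid of the linear predictor.
        beta = float(np.dot(np.array(coef), x_vec)) + param["intercept"]
        proba = float(1.0 / (1.0 + np.exp(-beta)))

        above = proba > param["threshold"]
        y[cc] = {
            "proba": proba,  # raw probability, added here only for inspection
            "score": proba if above else 0.0,
            "features": effective if above else [],
        }
    return y


x_test = {"cc_list": ["PULL"], "age": 38, "sex": "M"}
result = predict(x_test, PARAM_COLLECTION)
print(result)

# Sweep age to see how the raw probability moves relative to the threshold.
for age in (30, 38, 50, 60):
    print(age, predict({**x_test, "age": age}, PARAM_COLLECTION)["PULL"])
```

With the test member from the question, the raw probability comes out at roughly 0.523, just under the 0.528 threshold, so the thresholded score is still 0.0, but `y` is no longer empty. Returning the raw `proba` next to the thresholded `score` makes the effect of age visible even below the threshold.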

huangapple
  • Posted on February 16, 2023 at 04:16:18
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75465047.html