How to record each fold's validation loss during cross-validation in Optuna?

Question

I am using [Toshihiko Yanase](https://stackoverflow.com/a/63242079/21220510)'s code to run cross-validation in my hyperparameter optimization with Optuna. Here is the code that I am using:

def objective(trial, train_loader, valid_loader):

    # Remove the following line.
    # train_loader, valid_loader = get_mnist()

    ...

    return accuracy


def objective_cv(trial):

    # Get the MNIST dataset.
    dataset = datasets.MNIST(DIR, train=True, download=True, transform=transforms.ToTensor())

    fold = KFold(n_splits=3, shuffle=True, random_state=0)
    scores = []
    for fold_idx, (train_idx, valid_idx) in enumerate(fold.split(range(len(dataset)))):
        train_data = torch.utils.data.Subset(dataset, train_idx)
        valid_data = torch.utils.data.Subset(dataset, valid_idx)

        train_loader = torch.utils.data.DataLoader(
            train_data,
            batch_size=BATCHSIZE,
            shuffle=True,
        )
        valid_loader = torch.utils.data.DataLoader(
            valid_data,
            batch_size=BATCHSIZE,
            shuffle=True,
        )

        accuracy = objective(trial, train_loader, valid_loader)
        scores.append(accuracy)
    return np.mean(scores)


study = optuna.create_study(direction="maximize")
study.optimize(objective_cv, n_trials=20, timeout=600)

Unfortunately, used this way, the code does not record each fold's validation loss to the Optuna dashboard. Is there a way to record each fold's validation loss there?

Answer 1

Score: 1

<details>
<summary>English:</summary>

Each split's validation loss can be recorded in the [`system_attrs`][1] of the current trial's `Trial` object. The `system_attrs` are shown in the dashboard under the respective trial, as you wanted.

The modified code with the desired functionality is:

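    import numpy as np
    import optuna
    import torch
    from sklearn.model_selection import KFold
    from torchvision import datasets, transforms

    # DIR and BATCHSIZE are assumed to be defined elsewhere, as in the original example.


    def objective(trial, train_loader, valid_loader):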
    
        # Remove the following line.
        # train_loader, valid_loader = get_mnist()
    
        ...
    
        return accuracy
    
    
    def objective_cv(trial):
    
        # Get the MNIST dataset.
        dataset = datasets.MNIST(DIR, train=True, download=True, transform=transforms.ToTensor())
    
        fold = KFold(n_splits=3, shuffle=True, random_state=0)
        scores = []
        trial.set_system_attr("Val loss of fold", [])  # Holds the final value of each fold for this trial
        for fold_idx, (train_idx, valid_idx) in enumerate(fold.split(range(len(dataset)))):
            train_data = torch.utils.data.Subset(dataset, train_idx)
            valid_data = torch.utils.data.Subset(dataset, valid_idx)
    
            train_loader = torch.utils.data.DataLoader(
                train_data,
                batch_size=BATCHSIZE,
                shuffle=True,
            )
            valid_loader = torch.utils.data.DataLoader(
                valid_data,
                batch_size=BATCHSIZE,
                shuffle=True,
            )
    
            accuracy = objective(trial, train_loader, valid_loader)
            scores.append(accuracy)
            trial.set_system_attr("Val loss of fold", trial.system_attrs["Val loss of fold"] + [accuracy])  # Append this fold's objective value to the record
        return np.mean(scores)
    
    
    study = optuna.create_study(direction="maximize")
    study.optimize(objective_cv, n_trials=20, timeout=600)
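
As a possible variation on the code above, the same per-fold values could be stored with `Trial.set_user_attr`, the user-facing counterpart of `set_system_attr`. The following is only a minimal sketch under some assumptions: the helper name `record_fold_scores`, the attribute key format, and the `StandInTrial` class are hypothetical and not part of Optuna. The stand-in exists only so the snippet runs without Optuna installed; in real use, the `trial` passed in would be the one Optuna hands to `objective_cv`.

```python
from statistics import mean


def record_fold_scores(trial, fold_scores, prefix="fold"):
    """Store each fold's score on the trial and return the mean score.

    `trial` is expected to behave like optuna.trial.Trial; only its
    set_user_attr(key, value) method is used here.
    """
    scores = [float(score) for score in fold_scores]
    for fold_idx, score in enumerate(scores):
        # One attribute per fold, e.g. "fold_0_score" (hypothetical key format).
        trial.set_user_attr(f"{prefix}_{fold_idx}_score", score)
    mean_score = mean(scores)
    trial.set_user_attr(f"{prefix}_mean_score", mean_score)
    return mean_score


class StandInTrial:
    """Hypothetical stand-in for an Optuna trial, for demonstration only."""

    def __init__(self):
        self.user_attrs = {}

    def set_user_attr(self, key, value):
        self.user_attrs[key] = value


# Demonstration with made-up scores for three folds.
trial = StandInTrial()
result = record_fold_scores(trial, [0.91, 0.93, 0.92])
print(trial.user_attrs)
```

Inside `objective_cv`, the final line could then become `return record_fold_scores(trial, scores)` instead of `return np.mean(scores)`. After optimization, the stored values should be readable from each finished trial's `user_attrs`, for example `study.trials[0].user_attrs`.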

PS: Unfortunately, the Optuna developers have indicated that `system_attrs` will be removed in the future, which I think will be a loss. The deprecation notice reads: "Deprecated in v3.1.0. This feature will be removed in the future. The removal of this feature is currently scheduled for v6.0.0, but this schedule is subject to change." See the [v3.1.0 release notes][2].


  [1]: https://optuna.readthedocs.io/en/stable/reference/generated/optuna.trial.Trial.html#optuna.trial.Trial:~:text=to%20be%20optimized.-,system_attrs,-Return%20system%20attributes
  [2]: https://github.com/optuna/optuna/releases/tag/v3.1.0

</details>



Posted by huangapple on 2023-02-27 06:14:56
Source: https://go.coder-hub.com/75575325.html