How to record each fold's validation loss during cross-validation in Optuna?

Question

I am using [Toshihiko Yanase's code](https://stackoverflow.com/a/63242079/21220510) to run cross-validation inside my hyperparameter optimization with Optuna. Here is the code that I am using:

    import numpy as np
    import optuna
    import torch
    from sklearn.model_selection import KFold
    from torchvision import datasets, transforms

    # DIR and BATCHSIZE are assumed to be defined elsewhere in the script.

    def objective(trial, train_loader, valid_loader):
        # Remove the following line.
        # train_loader, valid_loader = get_mnist()
        ...
        return accuracy

    def objective_cv(trial):
        # Get the MNIST dataset.
        dataset = datasets.MNIST(DIR, train=True, download=True, transform=transforms.ToTensor())
        fold = KFold(n_splits=3, shuffle=True, random_state=0)
        scores = []
        for fold_idx, (train_idx, valid_idx) in enumerate(fold.split(range(len(dataset)))):
            # Build the loaders for this fold from index subsets of the dataset.
            train_data = torch.utils.data.Subset(dataset, train_idx)
            valid_data = torch.utils.data.Subset(dataset, valid_idx)
            train_loader = torch.utils.data.DataLoader(
                train_data,
                batch_size=BATCHSIZE,
                shuffle=True,
            )
            valid_loader = torch.utils.data.DataLoader(
                valid_data,
                batch_size=BATCHSIZE,
                shuffle=True,
            )
            accuracy = objective(trial, train_loader, valid_loader)
            scores.append(accuracy)
        # The trial's value is the mean score over all folds.
        return np.mean(scores)

    study = optuna.create_study(direction="maximize")
    study.optimize(objective_cv, n_trials=20, timeout=600)

Unfortunately, when the code is used this way, only the mean score is reported and each fold's validation loss is not recorded in the Optuna dashboard. Is there a way to record each fold's validation loss in the Optuna dashboard?

Answer 1

Score: 1

Each split's validation loss can be recorded in the [`system_attrs`][1] of the current trial's `Trial` object. The `system_attrs` are shown in the dashboard under the respective trial, as you wanted.

The modified code with the desired functionality is below. Note that the value stored under the key `"Val loss of fold"` is whatever `objective` returns, which is the accuracy in this example.

    import numpy as np
    import optuna
    import torch
    from sklearn.model_selection import KFold
    from torchvision import datasets, transforms

    # DIR and BATCHSIZE are assumed to be defined elsewhere in the script.

    def objective(trial, train_loader, valid_loader):
        # Remove the following line.
        # train_loader, valid_loader = get_mnist()
        ...
        return accuracy

    def objective_cv(trial):
        # Get the MNIST dataset.
        dataset = datasets.MNIST(DIR, train=True, download=True, transform=transforms.ToTensor())
        fold = KFold(n_splits=3, shuffle=True, random_state=0)
        scores = []
        # Start an empty per-trial record of each fold's final validation score.
        trial.set_system_attr("Val loss of fold", [])
        for fold_idx, (train_idx, valid_idx) in enumerate(fold.split(range(len(dataset)))):
            train_data = torch.utils.data.Subset(dataset, train_idx)
            valid_data = torch.utils.data.Subset(dataset, valid_idx)
            train_loader = torch.utils.data.DataLoader(
                train_data,
                batch_size=BATCHSIZE,
                shuffle=True,
            )
            valid_loader = torch.utils.data.DataLoader(
                valid_data,
                batch_size=BATCHSIZE,
                shuffle=True,
            )
            accuracy = objective(trial, train_loader, valid_loader)
            scores.append(accuracy)
            # Append this fold's objective value to the record.
            trial.set_system_attr(
                "Val loss of fold",
                trial.system_attrs["Val loss of fold"] + [accuracy],
            )
        return np.mean(scores)

    study = optuna.create_study(direction="maximize")
    study.optimize(objective_cv, n_trials=20, timeout=600)

PS: Unfortunately, the Optuna developers have indicated that they will remove `system_attrs` in the future, which I think will be a loss. The documentation states: "Deprecated in v3.1.0. This feature will be removed in the future. The removal of this feature is currently scheduled for v6.0.0, but this schedule is subject to change." See the [v3.1.0 release notes][2].

[1]: https://optuna.readthedocs.io/en/stable/reference/generated/optuna.trial.Trial.html#optuna.trial.Trial:~:text=to%20be%20optimized.-,system_attrs,-Return%20system%20attributes
[2]: https://github.com/optuna/optuna/releases/tag/v3.1.0
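
Since `system_attrs` is marked as deprecated, one alternative that may be worth considering is `Trial.set_user_attr`, which stores arbitrary per-trial values in much the same way. The following is only a minimal sketch under some assumptions, not the answer author's method: `evaluate_fold` is a hypothetical stand-in for building the fold's data loaders and calling `objective(trial, train_loader, valid_loader)`, the attribute names `fold_<i>_score` are made up for illustration, and the score formula in `run_demo` is a toy, not real training.

```python
from statistics import mean


def objective_cv(trial, evaluate_fold, n_splits=3):
    """Score every fold, record each score on the trial, and return the mean.

    `evaluate_fold(trial, fold_idx)` is a hypothetical callable standing in for
    the loader setup plus `objective(trial, train_loader, valid_loader)`.
    """
    scores = []
    for fold_idx in range(n_splits):
        score = evaluate_fold(trial, fold_idx)
        scores.append(score)
        # One user attribute per fold, stored together with the trial.
        trial.set_user_attr(f"fold_{fold_idx}_score", score)
    return mean(scores)


def run_demo(n_trials=5):
    """Run a toy study; the per-fold score is a made-up formula."""
    # Imported here so objective_cv can be exercised without Optuna installed.
    import optuna

    def toy_evaluate_fold(trial, fold_idx):
        # Repeated suggestions with the same name return the same value
        # within a single trial, so every fold sees the same hyperparameter.
        x = trial.suggest_float("x", 0.0, 1.0)
        return 1.0 - (x - 0.5) ** 2 - 0.01 * fold_idx

    study = optuna.create_study(direction="maximize")
    study.optimize(lambda t: objective_cv(t, toy_evaluate_fold), n_trials=n_trials)
    for t in study.trials:
        # user_attrs holds the per-fold scores recorded above.
        print(t.number, t.value, t.user_attrs)
    return study
```

Calling `run_demo()` prints the per-fold scores of each trial from `trial.user_attrs`. For the values to show up in the dashboard, the study presumably needs to be created with a persistent `storage` that the dashboard can read, which this sketch leaves out.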

huangapple
  • Posted on 2023-02-27 06:14:56
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75575325.html