I can't resolve ValueError: Unable to create dataset (name already exists)

Question

I'm training a model with cross-validation and always get this error in the second fold. Here is part of the source code:

    gc.collect()
    sss = StratifiedShuffleSplit(n_splits=10, test_size=0.3, random_state=0)
    fold_no = 1
    annealer = LearningRateScheduler(lambda x: 1e-3 * 0.9 ** x)
    callback2 = CustomEarlyStopping(patience=7)  # 100)
    optimizer = keras.optimizers.Adam(learning_rate=1e-4)
    acc_per_fold, loss_per_fold = [], []
    needTrain = True
    for train_index, test_index in sss.split(X, y):
        # if fold_no > 1:
        clear_session()
        gc.collect()
        model = build_model(X.shape, numClass)
        model.compile(loss='categorical_crossentropy',
                      optimizer=optimizer,
                      metrics=['accuracy'])
        nmModel = 'model_overlap_%d_%d_fold%d.h5' % (n_time_steps, step, fold_no)
        print('------------------------------------------------------------------------')
        print(f'Training for fold {fold_no} ...')
        training_generator = BalancedDataGenerator(X[train_index],
                                                   y[train_index],
                                                   batch_size=256)

        if needTrain:
            history = model.fit(
                training_generator,
                epochs=1000,
                callbacks=[callback2, annealer],
                verbose=1,
                validation_data=(X[test_index], y[test_index]),
            )
            # model.save(nmModel)
            if os.path.exists(nmModel):
                os.remove(nmModel)
            model.save(nmModel)

        model.load_weights(nmModel)
        scores = model.evaluate(X[test_index], y[test_index], verbose=0)
        print(f'Score for fold {fold_no}: {model.metrics_names[0]} of {scores[0]}; {model.metrics_names[1]} of {scores[1]*100}%')
        acc_per_fold.append(scores[1] * 100)
        loss_per_fold.append(scores[0])
        # Increase fold number
        fold_no = fold_no + 1
        del model
        gc.collect()

The error occurs when the program saves the model. Here is the error message:


  File ~\AppData\Local\Programs\Python\Python310\lib\site-packages\spyder_kernels\py3compat.py:356 in compat_exec
    exec(code, globals, locals)

  File d:\tuh3salman\trainmodeloverlapseqbuku_all.py:298
    model.save(nmModel)

  File ~\AppData\Local\Programs\Python\Python310\lib\site-packages\keras\utils\traceback_utils.py:67 in error_handler
    raise e.with_traceback(filtered_tb) from None

  File ~\AppData\Local\Programs\Python\Python310\lib\site-packages\h5py\_hl\group.py:183 in create_dataset
    dsid = dataset.make_new_dset(group, shape, dtype, data, name, **kwds)

  File ~\AppData\Local\Programs\Python\Python310\lib\site-packages\h5py\_hl\dataset.py:163 in make_new_dset
    dset_id = h5d.create(parent.id, name, tid, sid, dcpl=dcpl, dapl=dapl)

  File h5py\_objects.pyx:54 in h5py._objects.with_phil.wrapper

  File h5py\_objects.pyx:55 in h5py._objects.with_phil.wrapper

  File h5py\h5d.pyx:138 in h5py.h5d.create

ValueError: Unable to create dataset (name already exists)

I've tried downgrading and upgrading some of the libraries, hoping it would help, but I still get the error. I also tried deleting the previous model file or moving it to another folder. Here are my library versions, in case they help: h5py 3.9.0, keras 2.8.0, tensorflow 2.8.0.

I want to solve this error because I have been searching for a solution for a few days without success. One fold takes 12 hours to finish, so it wastes a lot of time just to find out whether the run will succeed.
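
Editorial note on the traceback: its last frames are in h5py's create_dataset, which fails when the same dataset name is written twice into one HDF5 group. As a hedged diagnostic sketch (not part of the original post), the helper below reports names that occur more than once in a list. `find_duplicate_names` is a hypothetical helper and the sample names are made up for illustration. With Keras 2.x one could presumably feed it `[w.name for w in model.weights]` or `[w.name for w in model.optimizer.weights]` just before `model.save(...)`. Whether duplicate names are actually the cause here is an assumption, although an optimizer shared across folds would be one plausible source of them.

```python
from collections import Counter
from typing import Iterable, List


def find_duplicate_names(names: Iterable[str]) -> List[str]:
    """Return the names that occur more than once, sorted alphabetically."""
    counts = Counter(names)
    return sorted(name for name, n in counts.items() if n > 1)


# Made-up weight names for illustration only. With a real model they would
# come from something like [w.name for w in model.optimizer.weights]
# (an assumption, not taken from the original post).
sample_names = [
    "Adam/iter:0",
    "Adam/dense/kernel/m:0",
    "Adam/dense/bias/m:0",
    "Adam/dense/kernel/m:0",  # same name again, e.g. left over from an earlier fold
]

duplicates = find_duplicate_names(sample_names)
print(duplicates)  # ['Adam/dense/kernel/m:0']
```

An empty result would suggest looking elsewhere; a non-empty one points at the object whose names collide.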

Answer 1

Score: 0

The problem is the way you save the model. Instead of calling model.save(nmModel), try model.save_weights(nmModel). It works for me.
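
The suggestion above can be sketched as a fold loop that saves only the weights. As an extra precaution that the answer itself does not mention, the sketch also creates a fresh optimizer for every fold instead of sharing one across models. This is a minimal sketch under stated assumptions: `run_folds`, `make_optimizer`, `path_for_fold`, and `RecordingModel` are hypothetical names, and the model objects are only assumed to offer the Keras-style methods `compile`, `fit`, `save_weights`, `load_weights`, and `evaluate`. The stand-in model lets the loop be dry-run without TensorFlow.

```python
from typing import Any, Callable, Iterable, List, Tuple

Fold = Tuple[Tuple[Any, Any], Tuple[Any, Any]]


def run_folds(
    splits: Iterable[Fold],
    build_model: Callable[[], Any],
    make_optimizer: Callable[[], Any],
    path_for_fold: Callable[[int], str],
) -> List[Any]:
    """Train and evaluate one freshly built model per fold.

    `splits` yields ((x_train, y_train), (x_test, y_test)) pairs.
    """
    all_scores = []
    for fold_no, ((x_train, y_train), (x_test, y_test)) in enumerate(splits, start=1):
        model = build_model()
        # Assumption: a new optimizer per fold, so that no optimizer state
        # created for an earlier fold's model is carried into this one.
        model.compile(
            loss="categorical_crossentropy",
            optimizer=make_optimizer(),
            metrics=["accuracy"],
        )
        model.fit(x_train, y_train, validation_data=(x_test, y_test))
        weights_path = path_for_fold(fold_no)
        # Save the weights only, as the answer suggests, instead of model.save().
        model.save_weights(weights_path)
        model.load_weights(weights_path)
        all_scores.append(model.evaluate(x_test, y_test, verbose=0))
    return all_scores


class RecordingModel:
    """Stand-in for a Keras model; it only records which methods were called."""

    def __init__(self) -> None:
        self.calls: List[str] = []
        self.optimizer: Any = None

    def compile(self, loss: str, optimizer: Any, metrics: List[str]) -> None:
        self.optimizer = optimizer
        self.calls.append("compile")

    def fit(self, x: Any, y: Any, validation_data: Any = None) -> None:
        self.calls.append("fit")

    def save_weights(self, path: str) -> None:
        self.calls.append(f"save_weights:{path}")

    def load_weights(self, path: str) -> None:
        self.calls.append(f"load_weights:{path}")

    def evaluate(self, x: Any, y: Any, verbose: int = 0) -> List[float]:
        self.calls.append("evaluate")
        return [0.0, 1.0]  # placeholder [loss, accuracy], not a real result


built_models: List[RecordingModel] = []


def build_recording_model() -> RecordingModel:
    model = RecordingModel()
    built_models.append(model)
    return model


# Dummy data: two folds of ((x_train, y_train), (x_test, y_test)).
dummy_splits = [(([0], [0]), ([1], [1])), (([1], [1]), ([0], [0]))]
scores = run_folds(
    dummy_splits,
    build_model=build_recording_model,
    make_optimizer=object,  # each call returns a distinct placeholder object
    path_for_fold=lambda n: f"model_fold{n}.h5",
)
```

With Keras, `build_model` would wrap the question's `build_model(X.shape, numClass)` and `make_optimizer` could be `lambda: keras.optimizers.Adam(learning_rate=1e-4)`. Note that `save_weights` stores neither the architecture nor the optimizer state, so the model has to be rebuilt in code before `load_weights` is called.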

huangapple
  • Posted on June 26, 2023, 11:05:03
  • When reposting, please keep this link: https://go.coder-hub.com/76553293.html