Changing the hyperparameters of my classifier SVM trained on a subset of the MNIST digit dataset doesn't change the accuracy at all?

Question

I'm a beginner when it comes to machine learning, and I am trying to find hyperparameters for an SVM that reach 95% accuracy or better on the MNIST digit dataset.

The csv file the code reads from contains 784 attributes and one class label, where the label is the first column of the csv file. As far as I can see, the dataset is not sorted by label.

Below is the code I've written so far. The variables Ctemp and gammatemp are the hyperparameters in question; the accuracy is the testscore variable.

import numpy as np
import os
from sklearn.svm import SVC

# Build the paths with os.path.join: backslashes inside a plain string
# literal are escape characters and break the path
path = os.getcwd()
traindata = np.loadtxt(os.path.join(path, "username", "data", "mnist_train.csv"),
                       delimiter=",", max_rows=4000)
testdata = np.loadtxt(os.path.join(path, "username", "data", "mnist_test.csv"),
                      delimiter=",", max_rows=1000)

# Shuffle both sets with a fixed seed so runs are reproducible
np.random.seed(123)
shuffletrain = np.random.permutation(4000)
np.random.seed(123)
shuffletest = np.random.permutation(1000)
traindata, testdata = traindata[shuffletrain, :], testdata[shuffletest, :]

# First column is the label, the remaining 784 columns are pixel values
y_train, y_test = traindata[:2000, 0], testdata[:500, 0]
X_train, X_test = traindata[:2000, 1:], testdata[:500, 1:]

Ctemp = 10000
gammatemp = 'auto'
machine = SVC(C=Ctemp, kernel='rbf', gamma=gammatemp)
machine.fit(X_train, y_train)
testscore = machine.score(X_test, y_test)
print(testscore)

As I train the support vector machine with different hyperparameters I get exactly the same test scores. I first noticed this when I tried a grid search. The test score does change if I add a `skiprows` argument to the loadtxt call, so different data points do produce a different test score.

I tried shuffling the dataset, as you can see in my code, but there is still no difference when I change the hyperparameters, which leads me to think this isn't an error in the dataset itself. Whether C is 1 or 10000, and whether gamma is 0.1 or 1000, it makes no difference.

When I use this exact code on a different dataset it works as expected, but for this dataset the test score is always 0.194, no matter which hyperparameters I use.
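For what it's worth, a quick way to check whether a stuck score means the model is collapsing to one prediction is to look at the predicted labels. The sketch below is self-contained and uses sklearn's small built-in digits set as a stand-in, since I can't attach my csv files here:

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.svm import SVC

# Stand-in for the MNIST csv files: sklearn's built-in 8x8 digit set
X, y = load_digits(return_X_y=True)
X_train, X_test = X[:1000], X[1000:]
y_train, y_test = y[:1000], y[1000:]

machine = SVC(C=10000, kernel='rbf', gamma='auto')
machine.fit(X_train, y_train)
print("test score:", machine.score(X_test, y_test))

# If the score never moves, see which classes the model actually predicts
labels, counts = np.unique(machine.predict(X_test), return_counts=True)
print("predicted classes:", dict(zip(labels.astype(int), counts.astype(int))))
```

If only one or two classes show up in the counts, the score is just reflecting the class frequencies in the test set rather than anything the hyperparameters do.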

What am I missing? Why are the hyperparameters not affecting the test score?


Answer 1

Score: 0

tl;dr

You can try a wider range of hyper-parameters to get better (different) test scores.


I don't have your MNIST csv files, so I used the Kaggle MNIST dataset and arbitrarily split it into train and test sets of the same sizes as yours.

A good theoretical fact about SVMs is that you can always reach 100% training accuracy when gamma is small and C is large enough (see this paper). So when you are not sure whether there is some strange bug in the code, you can try extreme parameters and see what you get.
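A minimal sanity check along these lines (my sketch, using sklearn's built-in digits set in place of MNIST; here I pair the huge C with a large gamma, which makes the RBF kernel so local that memorising the training set is easy to observe):

```python
from sklearn.datasets import load_digits
from sklearn.svm import SVC

# Stand-in data; the real MNIST csv files are not available here
X, y = load_digits(return_X_y=True)

# Extreme parameters: huge C (essentially a hard margin) and a large gamma
# (the kernel matrix becomes close to the identity, so every training
# point dominates its own neighbourhood)
machine = SVC(C=1e6, kernel='rbf', gamma=1.0)
machine.fit(X, y)
print("training accuracy:", machine.score(X, y))
```

If even an extreme setting like this cannot move the training accuracy, the problem is almost certainly in the data pipeline, not the hyperparameters.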

I tried C from 10^-8 to 10^14 and gamma from 10^-14 to 10^8, and got a maximum test score of 0.92.

You can see my kaggle notebook for the full code.

The code

from tqdm import tqdm          # progress bar
from sklearn.svm import SVC

scores = {}                    # (C, gamma) -> test accuracy

log10C_range = list(range(-8, 14, 3))
log10g_range = list(range(-14, 8, 3))
with tqdm(total=len(log10C_range)*len(log10g_range)) as pbar:
    for log10C in log10C_range:
        for log10g in log10g_range:
            C = 10**log10C
            g = 10**log10g
            if (C, g) in scores:   # skip pairs already evaluated
                pbar.set_description(f"C={C:g} g={g:g} score={scores[C,g]:g}")
                pbar.update(1)
                continue
            machine = SVC(C=C, kernel='rbf', gamma=g)
            machine.fit(X_train, y_train)
            testscore = machine.score(X_test, y_test)
            scores[C, g] = testscore
            pbar.set_description(f"C={C:g} g={g:g} score={testscore:g}")
            pbar.update(1)
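After the sweep, the best pair can be read straight off the `scores` dict. A small illustration with made-up numbers (the real dict is filled by the loop above):

```python
# Hypothetical results of the same shape the grid-search loop produces:
# (C, gamma) -> test accuracy
scores = {(1e-2, 1e-6): 0.85, (1e2, 1e-6): 0.92, (1e8, 1e2): 0.11}

# max over the keys, ranked by their stored accuracy
best_C, best_g = max(scores, key=scores.get)
print(f"best C={best_C:g}, gamma={best_g:g}, score={scores[best_C, best_g]:g}")
```

A follow-up search on a finer log-spaced grid around the best pair is the usual next step.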

Posted by huangapple on 2023-06-08. Original link: https://go.coder-hub.com/76430790.html