How should I learn ML libraries like scikit-learn?
Question
I started learning ML a few days ago by following a Udemy course, but I don't feel comfortable following the tutorial blindly. For example, in the data preprocessing part, when splitting the dataset into training and test sets, they used a function called train_test_split from the scikit-learn library.
But when I went to the documentation, I couldn't find that function. Later in the video they said it is in the model_selection module, but I couldn't find it there either. Maybe I would find it if I went through every line of that module, but going through all of it isn't really feasible. So my question is: is it normal, or okay, to learn like this? Or am I doing something wrong?
Answer 1
Score: 1
My suggestion is to use the scikit-learn documentation directly. There you can also find examples in which the relevant functions are used.
For example, the reference page for the train_test_split function explains its signature, arguments, and return values very well.
Function signature:
sklearn.model_selection.train_test_split(*arrays, test_size=None, train_size=None, random_state=None, shuffle=True, stratify=None)
The same page has a basic example of train_test_split usage:
import numpy as np
from sklearn.model_selection import train_test_split
# toy dataset
X, y = np.arange(10).reshape((5, 2)), range(5)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=42)
# without shuffling; a single input array yields only two outputs
y_train, y_test = train_test_split(y, shuffle=False)
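The remaining parameters in the signature can be explored the same way. The following is only a minimal sketch on made-up data (the array sizes and class labels are assumptions chosen for illustration, not part of the documentation example). It shows how test_size, random_state, and stratify interact:

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical toy data: 20 samples, 2 features, two balanced classes.
X = np.arange(40).reshape((20, 2))
y = np.array([0] * 10 + [1] * 10)

# test_size=0.25 holds out 5 of the 20 samples, random_state makes the
# shuffle reproducible, and stratify=y keeps the class ratio similar
# in the training and test parts.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

print(X_train.shape, X_test.shape)  # (15, 2) (5, 2)
# Per-class counts in each part (both classes appear in both parts).
print(np.bincount(y_train), np.bincount(y_test))
```

Changing one argument at a time and printing the shapes is a quick way to connect each entry in the signature to its effect.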
At the bottom of the same page you can also find many examples in which train_test_split is used, usually on toy problems.
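If a function is hard to locate on the website, the installed package itself can tell you where it lives. This is a small sketch using Python's standard inspect module. It assumes scikit-learn is installed, and the exact text printed may differ between versions:

```python
import inspect
from sklearn.model_selection import train_test_split

# The module path confirms where the function lives inside the package.
print(train_test_split.__module__)

# The signature lists the same parameters shown in the online reference.
signature = inspect.signature(train_test_split)
print(signature)

# The first line of the docstring is a short summary of the function.
summary = inspect.getdoc(train_test_split).splitlines()[0]
print(summary)
```

Calling help(train_test_split) in an interactive session prints the full docstring, which mirrors the reference page.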