How should I learn ML libraries like scikit-learn?

Question

I started learning ML a few days ago and have been following a Udemy course, but I'm not comfortable following the tutorial blindly. For example, in the data preprocessing part, when splitting the dataset into training and test sets, they used a function called train_test_split from the scikit-learn library.

But when I looked in the documentation, I couldn't find that function. Later in the video they said it lives in the model_selection module, but I couldn't find it there either. Maybe I would find it if I went through every line of that module, but reading all of it isn't really feasible. So my question is: is it normal or okay to learn like this, or am I doing something wrong?

Answer 1

Score: 1

My suggestion is to use the scikit-learn documentation directly. There you can also find examples in which the relevant functions are used.

For example, the reference page for the train_test_split function explains its signature, arguments, and return value very well.

Function signature:

sklearn.model_selection.train_test_split(*arrays, test_size=None, train_size=None, random_state=None, shuffle=True, stratify=None)
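If the documentation page is hard to locate, the same information can usually be read from the installed library itself. The snippet below is only a minimal sketch using the standard-library inspect module; the variable names (signature, module_name, summary) are my own, and the exact text printed depends on the scikit-learn version you have installed.

```python
import inspect

from sklearn.model_selection import train_test_split

# Call signature as defined by the installed scikit-learn version.
signature = inspect.signature(train_test_split)
print(signature)

# Module where the function is defined; a hint for where to look in the docs.
module_name = train_test_split.__module__
print(module_name)

# First line of the docstring, i.e. the one-sentence summary of the function.
summary = inspect.getdoc(train_test_split).splitlines()[0]
print(summary)
```

In an interactive session, help(train_test_split) shows the full docstring, including the description of every parameter.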

On the same page there is a basic example of train_test_split usage:

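import numpy as np
from sklearn.model_selection import train_test_split
# toy dataset: 5 samples with 2 features each, and 5 labels
X, y = np.arange(10).reshape((5, 2)), range(5)
# shuffled split; random_state makes the result reproducible
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=42)
# without shuffling: a single input array yields two outputs, in the original order
y_train, y_test = train_test_split(y, shuffle=False)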

At the bottom of the same page you can also find many examples that use train_test_split, usually on toy problems.
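To get a feel for the other parameters in the signature, it can help to run a small experiment of your own. The following is a rough sketch on made-up data (the arrays and sizes are assumptions chosen for illustration, not taken from the documentation): it passes an integer test_size, which is interpreted as an absolute number of samples, and uses stratify so that both classes keep the same proportions in each split.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical toy data: 20 samples, 2 features, two balanced classes.
X = np.arange(40).reshape((20, 2))
y = np.array([0, 1] * 10)

# An integer test_size is an absolute sample count;
# stratify=y keeps the class proportions equal in both splits.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=4, random_state=0, stratify=y
)

print(X_train.shape, X_test.shape)  # 16 training rows, 4 test rows
print(np.bincount(y_train), np.bincount(y_test))  # per-class counts in each split
```

Changing one argument at a time and looking at the shapes and class counts is a quick way to check your reading of the documentation.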


  • Posted by huangapple on April 17, 2023 at 16:35:40
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76033183.html