Select all rows from a dataset except one using the select function

Question

The Dataset

autompg = read.table( "http://archive.ics.uci.edu/ml/machine-learning-databases/auto-mpg/auto-mpg.data", quote = "\"", comment.char = "", stringsAsFactors = FALSE)

head(autompg,20)
colnames(autompg) = c("mpg", "cyl", "disp", "hp", "wt", "acc", "year", "origin", "name")

autompg = autompg %>% select()

I don't understand which arguments can be used inside the select function.

Would autompg$nrow != name work?

Answer 1

Score: 1

The basic dplyr verbs (select(), filter(), mutate(), arrange()) each do one specific job, and they generally don't work when you try to nest one inside another.

select() picks columns and filter() picks rows. Since you want to drop a row, filter() is the verb you need; a row condition placed inside select() will most likely throw an error.

The arguments you pass to select() are the columns you want to keep. The main other thing you can do inside it is rename the columns as you select them.
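As a minimal sketch of what select() accepts, here is a small made-up data frame (the name `cars` and its values are purely illustrative and are not taken from the auto-mpg file). It shows selecting by bare column name, renaming while selecting, and dropping a column with a minus sign:

```r
library(dplyr)

# Hypothetical toy data standing in for autompg (values are made up)
cars <- data.frame(
  mpg  = c(18, 15, 26),
  cyl  = c(8, 8, 4),
  name = c("car a", "car b", "car c"),
  stringsAsFactors = FALSE
)

# Pick columns by bare name; the result keeps the order you list them in
by_name <- cars %>% select(name, mpg)

# Rename while selecting, using new_name = old_name
renamed <- cars %>% select(name, miles_per_gallon = mpg)

# Drop one column with a minus sign and keep all the others
without_cyl <- cars %>% select(-cyl)

names(by_name)
names(renamed)
names(without_cyl)
```

In every case the number of rows is unchanged, which is why select() cannot be used to remove a row.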

Here's some code to illustrate what to do:

library(tidyverse) # attaches dplyr, which provides select(), filter() and the %>% pipe

# import data
autompg = read.table( "http://archive.ics.uci.edu/ml/machine-learning-databases/auto-mpg/auto-mpg.data", quote = "\"", comment.char = "", stringsAsFactors = FALSE)

# change column names
colnames(autompg) = c("mpg", "cyl", "disp", "hp", "wt", "acc", "year", "origin", "name")

autompg %>%
  select(name, miles_per_gallon = mpg) %>% # keep only the name and mpg columns, renaming mpg to miles_per_gallon
  filter(name != "ford pinto") # drop every row whose name is "ford pinto"

#                                     name miles_per_gallon
# 1              chevrolet chevelle malibu 18.0
# 2                      buick skylark 320 15.0
# 3                     plymouth satellite 18.0
# 4                          amc rebel sst 16.0
# etc ...
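On the question's idea of `autompg$nrow != name`: the data frame has no column called nrow, so `autompg$nrow` returns NULL, and `name` is not visible outside a dplyr verb, so that expression is not expected to work. If the goal is to drop a row by its position rather than by a value, one possible approach is slice() with a negative index, or filter() with row_number(). The sketch below uses a made-up three-row data frame (the mpg values are illustrative only):

```r
library(dplyr)

# Hypothetical toy data; the mpg values are made up for illustration
cars <- data.frame(
  name = c("chevrolet chevelle malibu", "buick skylark 320", "ford pinto"),
  mpg  = c(18, 15, 25),
  stringsAsFactors = FALSE
)

# Drop the row at position 2 with a negative index
no_second <- cars %>% slice(-2)

# Equivalent: keep every row whose position is not 2
no_second_filter <- cars %>% filter(row_number() != 2)

# Drop by value instead of position, as in the answer above
no_pinto <- cars %>% filter(name != "ford pinto")

no_second
no_pinto
```

Dropping by value removes every matching row, so if several cars share the same name, all of them go; dropping by position removes exactly one row.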

Hope this helps!

huangapple
  • Posted on June 13, 2023 at 12:51:15
  • Link: https://go.coder-hub.com/76461777.html