How to rearrange 2 columns so that each combination of unique pairs is ordered by the first column in R

Question


I have a dataset that looks something like this:

data <- data.frame(GameNum = c(1,2,3,4,5,6,7,8),
                   Team1 = c("TeamA", "TeamA", "TeamA", "TeamA", 
                             "TeamB", "TeamB", "TeamB", "TeamC"),
                   Team2 = c("TeamB", "TeamC", "TeamD", "TeamD", 
                             "TeamA", "TeamA", "TeamC", "TeamA"),
                   Team1_Win = c(1, 0, 1, 1, 1, 0, 1, 0))  # whether Team1 won or not

Each row is a different game. Some teams have multiple matches against each other. Sometimes TeamA is in Team1 and sometimes TeamA is in Team2. I want to manipulate this data so that ALL of TeamA's matches have TeamA in column Team1. Then I want to move on to TeamB with ALL of its matches (EXCEPT those against TeamA), and so on for TeamC, TeamD, etc.

Essentially I would like for it to look like this:

data <- data.frame(GameNum = c(1,2,3,4,5,6,7,8),
                   Team1 = c("TeamA", "TeamA", "TeamA", "TeamA", 
                             "TeamA", "TeamA", "TeamB", "TeamA"),
                   Team2 = c("TeamB", "TeamC", "TeamD", "TeamD", 
                             "TeamB", "TeamB", "TeamC", "TeamC"),
                   Team1_Win = c(1, 0, 1, 1, 0, 1, 1, 1))

Where Team1_Win changes when I rearrange the column.

How would I do this in R or Python? SQL?? Even pseudocode is fine. I just don't know where to start. I appreciate any advice. Thank you.

I am not sure how to proceed.

Answer 1

Score: 0


Here is a quick solution using dplyr and a few ifelse() conditions:

### Your input data frame
library(dplyr)  # provides %>%, mutate(), select() and arrange()

data <- data.frame(GameNum = c(1,2,3,4,5,6,7,8),
                   Team1 = c("TeamA", "TeamA", "TeamA", "TeamA", 
                             "TeamB", "TeamB", "TeamB", "TeamC"),
                   Team2 = c("TeamB", "TeamC", "TeamD", "TeamD", 
                             "TeamA", "TeamA", "TeamC", "TeamA"),
                   Team1_Win = c(1, 0, 1, 1, 1, 0, 1, 0))

### Generate new columns for Team1 and Team2, swapping the pair when it is not in alphabetical order,
### and flip Team1_Win for every row that was swapped

data %>% 
  dplyr::mutate(Team1_new = ifelse(Team1 > Team2, Team2, Team1), 
                Team2_new = ifelse(Team1 > Team2, Team1, Team2)) %>%
  # 1 - Team1_Win keeps the result numeric; !Team1_Win would produce a logical value
  dplyr::mutate(Team1_Win_new = ifelse(Team1 == Team1_new, Team1_Win, 1 - Team1_Win)) %>%
  dplyr::select(GameNum, Team1_new, Team2_new, Team1_Win_new) %>%
  dplyr::arrange(Team1_new, Team2_new)

Output:

  GameNum Team1_new Team2_new Team1_Win_new
1       1     TeamA     TeamB             1
2       5     TeamA     TeamB             0
3       6     TeamA     TeamB             1
4       2     TeamA     TeamC             0
5       8     TeamA     TeamC             1
6       3     TeamA     TeamD             1
7       4     TeamA     TeamD             1
8       7     TeamB     TeamC             1
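
Since the question also asks about Python, the same idea (put the alphabetically smaller team first, flip the win flag whenever a pair is swapped, then sort by the pair) can be sketched with the standard library alone. This is only an illustrative sketch, not part of the original answer: the tuple layout and the helper name `canonicalize` are assumptions, and it assumes every game has a winner, so that `1 - Team1_Win` is a valid flip.

```python
# Same sample data as in the question: (GameNum, Team1, Team2, Team1_Win).
games = [
    (1, "TeamA", "TeamB", 1),
    (2, "TeamA", "TeamC", 0),
    (3, "TeamA", "TeamD", 1),
    (4, "TeamA", "TeamD", 1),
    (5, "TeamB", "TeamA", 1),
    (6, "TeamB", "TeamA", 0),
    (7, "TeamB", "TeamC", 1),
    (8, "TeamC", "TeamA", 0),
]


def canonicalize(row):
    """Put the alphabetically smaller team first; flip the win flag on a swap."""
    game_num, team1, team2, team1_win = row
    if team1 > team2:
        # The teams trade places, so a Team1 win becomes a Team1 loss and vice versa.
        return (game_num, team2, team1, 1 - team1_win)
    return row


# sorted() is stable, so games within the same pairing keep their original order.
result = sorted((canonicalize(r) for r in games), key=lambda r: (r[1], r[2]))

for row in result:
    print(row)
```

Sorting `result` by its first element (GameNum) instead gives the row order shown in the question's desired output.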

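NOTE: this relies on the team names sorting alphabetically, as they do here. If the team order you want in your real data is not alphabetical, you need to compare the teams by a custom ordering instead, for example by converting the team names to an ordered factor with the levels you desire (the > comparison is not meaningful for unordered factors).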
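
To illustrate the point about non-alphabetical orderings, here is a minimal Python sketch that compares teams by their position in an explicit priority list rather than by name. The list `team_order` is a made-up example ordering, and `rank` and `canonicalize` are hypothetical names introduced only for this illustration; as before, it assumes there are no draws.

```python
# Hypothetical priority order, deliberately different from alphabetical order.
team_order = ["TeamC", "TeamA", "TeamD", "TeamB"]
rank = {team: position for position, team in enumerate(team_order)}

# Same sample data as in the question: (GameNum, Team1, Team2, Team1_Win).
games = [
    (1, "TeamA", "TeamB", 1),
    (2, "TeamA", "TeamC", 0),
    (3, "TeamA", "TeamD", 1),
    (4, "TeamA", "TeamD", 1),
    (5, "TeamB", "TeamA", 1),
    (6, "TeamB", "TeamA", 0),
    (7, "TeamB", "TeamC", 1),
    (8, "TeamC", "TeamA", 0),
]


def canonicalize(row, rank):
    """Put the higher-priority team (lower rank) first; flip the win flag on a swap."""
    game_num, team1, team2, team1_win = row
    if rank[team1] > rank[team2]:
        return (game_num, team2, team1, 1 - team1_win)
    return row


# Sort by the custom rank of each team instead of by its name.
result = sorted(
    (canonicalize(r, rank) for r in games),
    key=lambda r: (rank[r[1]], rank[r[2]]),
)

for row in result:
    print(row)
```

With this ordering, every TeamC game comes first with TeamC in the first position, followed by the remaining TeamA games.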

huangapple
  • Published on May 26, 2023, 11:47:57
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76337539.html