Set variable column values to NaN based on row condition

Question

I want to set a variable number of columns to NaN in each row, based on the value in the first column.

Say I have a dataframe as follows:

    col_ind col_1 col_2 col_3
          3     a     b     c
          2     d     e     f
          1     g     h     i

I effectively want to do the following (pseudocode):

    df.loc[:, df.columns[-df['col_ind']:]] = np.nan

Which would result in:

    col_ind col_1 col_2 col_3
          3   NaN   NaN   NaN
          2     d   NaN   NaN
          1     g     h   NaN
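
For anyone who wants to reproduce this, a minimal sketch of the sample frame and the target output follows. The construction and the name `expected` are assumptions; the question only shows the printed tables.

```python
import numpy as np
import pandas as pd

# Sample frame from the question (the construction itself is assumed).
df = pd.DataFrame({
    "col_ind": [3, 2, 1],
    "col_1": ["a", "d", "g"],
    "col_2": ["b", "e", "h"],
    "col_3": ["c", "f", "i"],
})

# Target: in each row, the last `col_ind` value columns become NaN.
expected = pd.DataFrame({
    "col_ind": [3, 2, 1],
    "col_1": [np.nan, "d", "g"],
    "col_2": [np.nan, np.nan, "h"],
    "col_3": [np.nan, np.nan, np.nan],
})
```

Comparing a candidate result against `expected` with `isna()` is a quick way to check any of the answers.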

Answer 1

Score: 5

Let's use broadcasting to build a boolean mask of the cells to blank out:

    c = df.columns[1:]  # value columns, i.e. everything after col_ind
    # [n, n-1, ..., 1] compared with a column vector of counts gives a (rows, n) mask
    m = range(len(c), 0, -1) <= df['col_ind'].values[:, None]
    df[c] = df[c].mask(m)  # cells where the mask is True become NaN

Result:

       col_ind col_1 col_2 col_3
    0        3   NaN   NaN   NaN
    1        2     d   NaN   NaN
    2        1     g     h   NaN
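
To see why the comparison works, here is a sketch of the same idea with the intermediate arrays spelled out. The function name `mask_trailing` and its `ind_col` parameter are hypothetical, and it assumes every column other than `ind_col` is a value column.

```python
import numpy as np
import pandas as pd

def mask_trailing(df: pd.DataFrame, ind_col: str = "col_ind") -> pd.DataFrame:
    """Return a copy with the last `ind_col` value columns of each row set to NaN."""
    out = df.copy()
    cols = [c for c in out.columns if c != ind_col]
    # Distance of each value column from the right edge: [n, n-1, ..., 1].
    from_right = np.arange(len(cols), 0, -1)
    # Shape (rows, 1) compared with shape (n,) broadcasts to (rows, n).
    counts = out[ind_col].to_numpy()[:, None]
    mask = from_right <= counts
    out[cols] = out[cols].mask(mask)
    return out

df = pd.DataFrame({
    "col_ind": [3, 2, 1],
    "col_1": ["a", "d", "g"],
    "col_2": ["b", "e", "h"],
    "col_3": ["c", "f", "i"],
})
result = mask_trailing(df)
```

A column is masked in a given row exactly when its distance from the right edge is no larger than that row's count, which is what `from_right <= counts` expresses.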

Answer 2

Score: 1

You can get the values of df["col_ind"], iterate over them, and set the trailing slice of each row to np.nan:

    vals = df["col_ind"].values
    for i, v in enumerate(vals):
        # the last v columns of row i
        df.iloc[i, -v:] = np.nan
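
One edge case worth guarding against in this loop: when a count is 0, the slice `-0:` is the same as `0:` and selects the entire row. A hedged variant that skips zero counts and never touches col_ind is sketched below; the sample data with a 0 count is made up for illustration.

```python
import numpy as np
import pandas as pd

# Illustrative data (not from the question): the middle row has a count of 0.
df = pd.DataFrame({
    "col_ind": [2, 0, 1],
    "col_1": ["a", "d", "g"],
    "col_2": ["b", "e", "h"],
    "col_3": ["c", "f", "i"],
})

n_value_cols = df.shape[1] - 1  # every column except col_ind
for i, v in enumerate(df["col_ind"].to_numpy()):
    # Clamp so the slice can never reach back into col_ind itself.
    v = min(int(v), n_value_cols)
    if v > 0:  # -0: would select the whole row, so zero counts are skipped
        df.iloc[i, -v:] = np.nan
```

With this guard, the row whose count is 0 is left unchanged.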

Answer 3

Score: 1

You can use apply with result_type='broadcast'. (Edit: borrowing @marcelo-paco's code)

    import numpy as np
    import pandas as pd

    def make_nan(row):
        # the first entry of the row is col_ind; blank that many trailing entries
        row.iloc[-row.iloc[0]:] = np.nan
        return row

    df = pd.DataFrame({'col_ind': [3, 2, 1], 'col_1': ['a', 'd', 'g'], 'col_2': ['b', 'e', 'h'], 'col_3': ['c', 'f', 'i']})
    df[:] = df.apply(make_nan, axis=1, result_type='broadcast')
    df

This will give:

       col_ind col_1 col_2 col_3
    0        3   NaN   NaN   NaN
    1        2     d   NaN   NaN
    2        1     g     h   NaN

Answer 4

Score: 1

You could build each new column from leading NaNs plus a slice of the current column, and then replace the original column with it:

    # skip col_ind; the i-th value column gets i leading NaNs
    for i, cn in enumerate(df.columns[1:], 1):
        df[cn] = [*[np.nan]*i, *df[cn].loc[i:]]

Note that this does not read col_ind at all: it reproduces the expected output only because the sample has a default integer index and col_ind counts down 3, 2, 1.

huangapple
  • Posted on March 9, 2023 at 12:38:29
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75680488.html