2023-03-01 14:09:05

Return the location of non-null values

Question

I have a dataframe that looks like this:

	0	1	2	3	4	5
0	NaN	NaN	7.0	NaN	NaN	NaN
1	NaN	NaN	9.0	NaN	NaN	NaN
2	5.0	NaN	3.0	NaN	9.0	NaN
3	NaN	NaN	NaN	NaN	NaN	NaN
4	NaN	NaN	NaN	NaN	NaN	1.0

I am trying to return the locations of the non-null values as "row-column" strings.
For example, 7.0 is in the first row and third column (row 0, column 2), so its location is "0-2".

expected = ["0-2", "1-2", "2-0", "2-2", "2-4", "4-5"]

Code to build the dataframe:

import numpy as np
import pandas as pd

mylist = [
    [np.nan, np.nan, 7, np.nan, np.nan, np.nan],
    [np.nan, np.nan, 9, np.nan, np.nan, np.nan],
    [5, np.nan, 3, np.nan, 9, np.nan],
    [np.nan, np.nan, np.nan, np.nan, np.nan, np.nan],
    [np.nan, np.nan, np.nan, np.nan, np.nan, 1],
]
df = pd.DataFrame(mylist)

Update:

On my real data I am getting mirrored duplicates in the list. For example, 34-35 is the same pair as 35-34:

out = ['34-35', '35-34',
 '41-42', '42-41',
 '46-47', '47-46',
 '59-63', '63-59',
 '75-76', '76-75',
 '87-88', '88-87']

I need to remove the duplicates and keep only the unique pairs, like this:

expected = ['34-35', '41-42', '46-47', '59-63', '75-76', '87-88']

Answer 1

Score: 4

Use a list comprehension over the index of DataFrame.stack, which holds one (row, column) pair per non-null cell:

out = [f'{i}-{c}' for i, c in df.stack().index]
print(out)
['0-2', '1-2', '2-0', '2-2', '2-4', '4-5']
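
One caveat: the line above relies on `stack` dropping the NaN cells. As far as I know, newer pandas releases (the `future_stack=True` behavior) keep missing values in the stacked result, so it is worth checking your version. A sketch that should not depend on that behavior adds an explicit `dropna()`. The sample frame is rebuilt here so the snippet runs on its own:

```python
import numpy as np
import pandas as pd

nan = np.nan
df = pd.DataFrame([
    [nan, nan, 7, nan, nan, nan],
    [nan, nan, 9, nan, nan, nan],
    [5, nan, 3, nan, 9, nan],
    [nan, nan, nan, nan, nan, nan],
    [nan, nan, nan, nan, nan, 1],
])

# stack() turns the frame into a Series indexed by (row, column) pairs;
# the explicit dropna() removes NaN cells whether or not stack() already did.
out = [f"{i}-{c}" for i, c in df.stack().dropna().index]
print(out)
```

If `stack` has already dropped the NaN cells, the extra `dropna()` simply does nothing.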

Or use numpy.where to get the row and column positions.

If the index and columns are both the default RangeIndex, the positions are already the labels:

ro, co = np.where(df.notna())
out = [f'{i}-{c}' for i, c in zip(ro, co)]
print(out)
['0-2', '1-2', '2-0', '2-2', '2-4', '4-5']

If not, use the positions to index into df.index and df.columns:

df = df.rename(index=lambda x: f'i{x}', columns=lambda x: f'c{x}')
print(df)
     c0  c1   c2  c3   c4   c5
i0  NaN NaN  7.0 NaN  NaN  NaN
i1  NaN NaN  9.0 NaN  NaN  NaN
i2  5.0 NaN  3.0 NaN  9.0  NaN
i3  NaN NaN  NaN NaN  NaN  NaN
i4  NaN NaN  NaN NaN  NaN  1.0
ro, co = np.where(df.notna())
out = [f'{i}-{c}' for i, c in zip(df.index[ro], df.columns[co])]
print(out)
['i0-c2', 'i1-c2', 'i2-c0', 'i2-c2', 'i2-c4', 'i4-c5']

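EDIT: To remove mirrored duplicates (so that 2-0 and 0-2 count as the same pair), sort each pair before joining and pass the result to pd.unique. Shown on the original df with its default RangeIndex:

out = pd.unique(['-'.join(map(str, sorted(x))) for x in df.stack().index]).tolist()
print(out)
['0-2', '1-2', '2-2', '2-4', '4-5']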
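
The same normalization can also be applied to a list of strings that has already been built, such as the `out` list in the question. The sketch below uses only the standard library; `normalize` is a hypothetical helper name, and it assumes every label is a non-negative integer (a negative label would break the split on `-`). It also avoids handing a plain list to `pd.unique`, which recent pandas versions warn about, as far as I can tell.

```python
out = ['34-35', '35-34', '41-42', '42-41', '46-47', '47-46',
       '59-63', '63-59', '75-76', '76-75', '87-88', '88-87']


def normalize(pair: str) -> str:
    """Order the two integer labels of 'a-b' so the smaller comes first."""
    a, b = sorted(int(part) for part in pair.split('-'))
    return f'{a}-{b}'


# dict.fromkeys keeps the first occurrence of each key and preserves order.
unique = list(dict.fromkeys(normalize(p) for p in out))
print(unique)
```

Sorting the labels as integers rather than as strings matters once they have different lengths: as strings, '10' sorts before '9'.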

The same with numpy.where:

ro, co = np.where(df.notna())
out = pd.unique(['-'.join(map(str, sorted(x)))
                 for x in zip(df.index[ro], df.columns[co])]).tolist()
print(out)
['0-2', '1-2', '2-2', '2-4', '4-5']
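
If the mirrored pairs come from a symmetric frame (square, the same labels on both axes, and a value at (i, j) whenever there is one at (j, i)), which the question's update suggests but does not state, another option is to keep only the positions on or above the diagonal. This is a sketch under that assumption and not part of the original answer. The frame `sym` is made-up illustration data.

```python
import numpy as np
import pandas as pd

# Hypothetical 4x4 symmetric frame: values at (0, 1)/(1, 0) and (2, 3)/(3, 2).
sym = pd.DataFrame(np.nan, index=range(4), columns=range(4))
sym.loc[0, 1] = sym.loc[1, 0] = 5.0
sym.loc[2, 3] = sym.loc[3, 2] = 8.0

ro, co = np.where(sym.notna())
keep = ro <= co  # True for positions on or above the diagonal
pairs = [f"{i}-{c}" for i, c in zip(sym.index[ro[keep]], sym.columns[co[keep]])]
print(pairs)
```

On a frame that is not symmetric this would silently drop any entry below the diagonal that has no mirror, so the sort-and-unique approach is the safer default.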
