February 8, 2023, 20:33:36

pandas and NumPy: read a CSV and convert it from a 2D array to 1D, ignoring the diagonal values

Question

My CSV file looks like this:

    0  |0.1|0.2|0.4|
    0.1|0  |0.5|0.6|
    0.2|0.5|0  |0.9|
    0.4|0.6|0.9|0  |

I am trying to read it row by row, ignoring the diagonal values, and write it out as one long column like this:

    0.1
    0.2
    0.4
    0.1
    0.5
    0.6
    0.2
    0.5
    0.9
    .... 

I use this method:

    import numpy as np
    import pandas as pd
    
    
    data = pd.read_csv(r"C:\Users\soso-\Desktop\SVM\DataSet\chem_Jacarrd_sim.csv")
    row_vector = np.array(data)
    result = row_vector.ravel()
    result.reshape(299756,1)
    df = pd.DataFrame({'chem':result})
    df.to_csv("my2.csv")

However, the output skips the first row and still includes the zeros, as shown below. How can I fix it?

    0.1
    0
    0.5
    0.6
    0.2
    0.5
    0
    0.9
    ....

Answer 1

Score: 0

For the data you have:

    0  |0.1|0.2|0.4
    0.1|0  |0.5|0.6
    0.2|0.5|0  |0.9
    0.4|0.6|0.9|0

which I saved as `ffff.csv`, you need to do the following:

    import numpy as np
    import pandas as pd

    # The file has no header row and uses "|" as the separator
    data = pd.read_csv("ffff.csv", sep="|", header=None)
    print(data)
    row_vector = np.array(data)

    # Create a new mask with the correct shape
    mask = np.zeros(row_vector.shape, dtype=bool)
    # Mark the diagonal entries as masked
    mask[np.arange(row_vector.shape[0]), np.arange(row_vector.shape[0])] = True

    # Mask the diagonal, then keep only the unmasked values as a 1D array
    result = np.ma.array(row_vector, mask=mask)
    result = result.compressed()

    df = pd.DataFrame({'chem': result})
    df.to_csv("my2.csv", index=False)
    print(df)


which returns:

        chem
    0    0.1
    1    0.2
    2    0.4
    3    0.1
    4    0.5
    5    0.6
    6    0.2
    7    0.5
    8    0.9
    9    0.4
    10   0.6
    11   0.9
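
To illustrate why the original attempt lost the first row: `pd.read_csv` treats the first line as column names unless `header=None` is passed. The sketch below uses an in-memory stand-in for the file (the `csv_text` string is hypothetical and written without the padding spaces shown in the question) and assumes the real file ends each line with `|`, as the question's sample suggests. In that case pandas reads an extra, empty column, which is dropped here before flattening.

```python
import io

import numpy as np
import pandas as pd

# Hypothetical in-memory stand-in for the CSV file, keeping the
# trailing "|" from the question's sample.
csv_text = (
    "0|0.1|0.2|0.4|\n"
    "0.1|0|0.5|0.6|\n"
    "0.2|0.5|0|0.9|\n"
    "0.4|0.6|0.9|0|\n"
)

# Default behaviour: the first line is consumed as the header,
# so only three data rows remain.
default_read = pd.read_csv(io.StringIO(csv_text), sep="|")

# header=None keeps the first line as data.
data = pd.read_csv(io.StringIO(csv_text), sep="|", header=None)

# The trailing separator yields a column that is entirely NaN; drop it.
data = data.dropna(axis=1, how="all")

matrix = data.to_numpy(dtype=float)

# Keep everything except the diagonal, in row-by-row order.
off_diagonal = matrix[~np.eye(matrix.shape[0], dtype=bool)]
print(off_diagonal)
```

If the file has no trailing separator, the `dropna` step simply leaves the data unchanged.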


Answer 2

Score: 0

This one is a bit shorter, assuming you have a 2D NumPy array:

```python
import numpy as np

arr = np.random.rand(3, 3)
# array([[0.12964821, 0.92124532, 0.72456772],
#        [0.26063188, 0.1486612 , 0.45312145],
#        [0.04165099, 0.31071689, 0.26935581]])

# ~np.eye(n, dtype=bool) is True everywhere except on the diagonal
arr_out = arr[np.where(~np.eye(arr.shape[0], dtype=bool))]
# array([0.92124532, 0.72456772, 0.26063188, 0.45312145, 0.04165099,
#        0.31071689])
```
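
As a rough sketch of how this shorter approach could be applied to the matrix from the question, the example below types the 4x4 sample in by hand and writes the flattened values as a single `chem` column. It writes to an in-memory buffer purely for illustration; in real use a file path would go there instead. Note also that `reshape` returns a new array rather than modifying the original, so a bare `result.reshape(299756,1)` as in the question has no effect unless its result is assigned. Incidentally, 299756 = 548 × 547, which would be consistent with a 548 × 548 matrix, though the question does not state the size.

```python
import io

import numpy as np
import pandas as pd

# The 4x4 sample matrix from the question, entered by hand for illustration.
matrix = np.array([
    [0.0, 0.1, 0.2, 0.4],
    [0.1, 0.0, 0.5, 0.6],
    [0.2, 0.5, 0.0, 0.9],
    [0.4, 0.6, 0.9, 0.0],
])

n = matrix.shape[0]
keep = ~np.eye(n, dtype=bool)  # True everywhere except the diagonal
flat = matrix[keep]            # 1D, in row-by-row order

# reshape returns a new array; assign the result if a column shape is needed.
column = flat.reshape(-1, 1)

# Write to an in-memory buffer here; pass a file path in real use.
buffer = io.StringIO()
pd.DataFrame({"chem": flat}).to_csv(buffer, index=False)
csv_output = buffer.getvalue()
print(csv_output)
```

For an n × n matrix this leaves n × (n − 1) values, which is a quick sanity check on the output length.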
