April 20, 2023 00:43:12

Adding leading zeros to a column in csv file

# Question

I have a CSV file with 3 columns of data. I want to format the third column by adding leading zeros and write the result to a new file.

A	B	C
9	1	33
8	1	82
6	1	5
3	1	481

I want the result to look like this:

A	B	C
9	1	0033
8	1	0082
6	1	0005
3	1	0481

I am fairly new to coding so any help would be greatly appreciated!

# Answer 1
**Score**: 2

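In a pure Python approach, I would use [`writer`][1] and [`zfill`][2] this way:

    import csv

    # Read every row and zero-pad the third column to 4 characters.
    with open("file.csv", "r", newline="") as in_file:
        reader = csv.reader(in_file)
        header = next(reader)  # keep the header row unchanged
        rows = [[r[0], r[1], r[2].zfill(4)] for r in reader]

    # newline="" lets the csv module control the line endings.
    with open("output.csv", "w", newline="") as out_file:
        writer = csv.writer(out_file)
        writer.writerow(header)
        writer.writerows(rows)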
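If the file is actually tab-delimited, as the sample in the question appears to be, the same idea still applies. Here is a minimal sketch, assuming a tab delimiter and using a hypothetical helper named `pad_csv_column` (it is not part of the `csv` module) that streams rows instead of collecting them in a list:

```python
import csv
import io


def pad_csv_column(src, dst, column=2, width=4, delimiter=","):
    """Copy CSV rows from src to dst, zero-padding one column (illustrative helper)."""
    reader = csv.reader(src, delimiter=delimiter)
    writer = csv.writer(dst, delimiter=delimiter, lineterminator="\n")
    writer.writerow(next(reader))  # copy the header row unchanged
    for row in reader:
        if len(row) > column:  # leave blank or short rows untouched
            row[column] = row[column].zfill(width)  # "5" -> "0005"
        writer.writerow(row)


# In-memory stand-in for the input file, using the data from the question.
sample = "A\tB\tC\n9\t1\t33\n8\t1\t82\n6\t1\t5\n3\t1\t481\n"
out = io.StringIO()
pad_csv_column(io.StringIO(sample), out, delimiter="\t")
result = out.getvalue()
print(result)
```

With real files, `src` and `dst` would be the handles returned by `open(..., newline="")`, as in the answer's code.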

With pandas, you can try this:

    # pip install pandas
    import pandas as pd

    (pd.read_csv("file.csv", sep=",")
        .assign(C=lambda x: x["C"].astype(str).str.zfill(4))
        .to_csv("output.csv", index=False, sep=","))  # <- adjust the output separator here

Output (*output.csv*):

    A,B,C
    9,1,0033
    8,1,0082
    6,1,0005
    3,1,0481
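One caveat, offered as a side note rather than as part of the answer: when the padded file is read back with pandas, the default type inference parses `C` as integers again, so the zeros disappear. Here is a small sketch, using an in-memory copy of the output above, that keeps the column as text through the `dtype` argument of `read_csv`:

```python
import io

import pandas as pd

# In-memory stand-in for output.csv (same content as shown above).
csv_text = "A,B,C\n9,1,0033\n8,1,0082\n6,1,0005\n3,1,0481\n"

# Default inference treats C as numeric, so the padding is lost.
inferred = pd.read_csv(io.StringIO(csv_text))

# Forcing C to str keeps the values exactly as written in the file.
as_text = pd.read_csv(io.StringIO(csv_text), dtype={"C": str})

print(inferred["C"].tolist())  # [33, 82, 5, 481]
print(as_text["C"].tolist())   # ['0033', '0082', '0005', '0481']
```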


  [1]: https://docs.python.org/3/library/csv.html#csv.writer
  [2]: https://docs.python.org/3/library/stdtypes.html#str.zfill




# Answer 2
**Score**: 1

This is a very naive method, but I hope it helps. Let's start with the basics.

    import pandas as pd

    data = {
        "a": [9, 8, 6, 3],
        "b": [1, 1, 1, 1],
        "c": [33, 82, 5, 481]
    }
    # Load the data into a DataFrame object:
    df = pd.DataFrame(data)

To loop through the elements of a column, you can use:

    for x in df.c:
        print(x)  # prints 33, 82, 5, 481, one per line

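Now, if you want to add a fixed number of zeros to the column, you can:

1. Create an empty list.
2. Append every column value, prefixed with zeros, to the list.
3. Replace the column's values with the list.

In code:

    n = []
    for x in df.c:
        n.append("00" + str(x))  # prefix each value with two zeros
    print(n)  # ['0033', '0082', '005', '00481'] (widths differ)
    df.c = n  # replace the column's values with the list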

A shorter way is to use a list comprehension:

    n = ["00" + str(x) for x in df.c]
    print(n)  # ['0033', '0082', '005', '00481']
    df.c = n

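However, since the last column in your question is always 4 characters wide, I used this small piece of logic to pad each value accordingly:

    # Start from the original integer column; rebuild df first if you
    # already ran one of the "00" examples above.
    n = []
    for x in df.c:
        if len(str(x)) == 0:
            n.append("0000" + str(x))
        elif len(str(x)) == 1:
            n.append("000" + str(x))
        elif len(str(x)) == 2:
            n.append("00" + str(x))
        elif len(str(x)) == 3:
            n.append("0" + str(x))
        else:
            n.append(str(x))  # already 4 or more characters

    print(n)  # ['0033', '0082', '0005', '0481']

    df.c = n
    print(df)
    #    a  b     c
    # 0  9  1  0033
    # 1  8  1  0082
    # 2  6  1  0005
    # 3  3  1  0481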
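The branch chain above can be collapsed into a single expression. Here is a minimal sketch, assuming the values are non-negative integers as in the question; `values` and `width` are illustrative names, not taken from the answer:

```python
values = [33, 82, 5, 481]  # the integers from column c
width = 4                  # desired number of characters

# Three equivalent ways to left-pad with zeros up to `width`.
with_zfill = [str(x).zfill(width) for x in values]
with_rjust = [str(x).rjust(width, "0") for x in values]
with_format = [f"{x:0{width}d}" for x in values]

print(with_zfill)  # ['0033', '0082', '0005', '0481']
assert with_zfill == with_rjust == with_format
```

Values that are already `width` characters or longer come back unchanged, which matches the final `else` branch. Assigning `with_zfill` to `df.c` would then update the column just as `df.c = n` does.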

To save it to a file, you can use:

    df.to_csv("new_data.csv")  # pass index=False to leave out the row-index column

Hope this helps!

