2020年1月3日 19:11:15go评论103阅读模式

英文:

Extract two columns sorted from CSV

问题

import csv
import operator
with open('bad_voice_ports.csv') as csvfile:
    readCSV = csv.reader(csvfile, delimiter=',')
    sortedlist = sorted(readCSV, key=operator.itemgetter(1)) 
    dslam_ani_count = {}
    for row in sortedlist:
        dslam_name = row[1]
        ani = row[4][2::]
        print(dslam_name, ani)
        f = open("bad_voice_portsnew20200103SORTED.csv", "a+")
        f.write(dslam_name + ',' + ani + '\n')
        f.close()
        # Count how many times the same value appears in the 1st column
        if dslam_name in dslam_ani_count:
            dslam_ani_count[dslam_name] += 1
        else:
            dslam_ani_count[dslam_name] = 1
    # Print the count of each Dslam_Name
    for dslam_name, count in dslam_ani_count.items():
        print(f"{dslam_name} appears {count} times in the 1st column")

英文:

I have a large csv file, containing multiple values, in the form

Date,Dslam_Name,Card,Port,Ani,DownStream,UpStream,Status
2020-01-03 07:10:01,aart-m1-m1,204,57,302xxxxxxxxx,0,0,down

I want to extract the Dslam_Name and Ani values, sort them by Dslam_name and write them to a new csv in two different columns.

So far my code is as follows:

import csv
import operator
with open(&#39;bad_voice_ports.csv&#39;) as csvfile:
    readCSV = csv.reader(csvfile, delimiter=&#39;,&#39;)
    sortedlist = sorted(readCSV, key=operator.itemgetter(1)) 
    for row in sortedlist:
        bad_port = row[1][:4],row[4][2::]
        print(bad_port)
        f = open(&quot;bad_voice_portsnew20200103SORTED.csv&quot;,&quot;a+&quot;)
        f.write(row[1][:4] + &quot; &quot; + row[4][2::] + &#39;\n&#39;)
        f.close()

But my Dslam_Name and Ani values are kept in the same column.

As a next step I would like to count how many times the same value appears in the 1st column.

答案1

得分: 1

你正在强制它们成为单列。将它们合并成一个字符串意味着Python不再将它们视为分开的。

但是，尝试改用以下代码：

import csv
import operator
with open('bad_voice_ports.csv') as readfile, open('bad_voice_portsnew20200103SORTED.csv', 'w') as writefile:
    readCSV = csv.reader(readfile)
    writeCSV = csv.writer(writefile)
    for row in sorted(readCSV, key=operator.itemgetter(1)):
        bad_port = row[1][:4], row[4][2::]
        print(bad_port)
        writeCSV.writerow(bad_port)

如果您想要包括每个键出现的次数，您也可以轻松地在程序中包含它。我建议稍微重构以分开读取和写入。

import csv
import operator
from collections import Counter
with open('bad_voice_ports.csv') as readfile:
    readCSV = csv.reader(readfile)
    rows = []
    counts = Counter()
    for row in readCSV:
        rows.append([row[1][:4], row[4][2::]])
        counts[row[1][:4]] += 1
with open('bad_voice_portsnew20200103SORTED.csv', 'w') as writefile:
    writeCSV = csv.writer(writefile)
    for row in sorted(rows):
        print(row)
        writeCSV.writerow([counts[row[0]]] + row)

我建议从CSV文件中完全删除标题行；如果您想要保留它，将第一行丢弃（或分开并添加回去）应该是一个容易的更改。
（另外，硬编码输入和输出文件名存在问题；也许可以让程序从 sys.argv[1:] 中读取它们。）

英文:

You are forcing them to be a single column. Joining the two into a single string means Python no longer regards them as separate.

But try this instead:

import csv
import operator
with open(&#39;bad_voice_ports.csv&#39;) as readfile, open(&#39;bad_voice_portsnew20200103SORTED.csv&#39;, &#39;w&#39;) as writefile:
    readCSV = csv.reader(readfile)
    writeCSV = csv.writer(writefile)
    for row in sorted(readCSV, key=operator.itemgetter(1)):
        bad_port = row[1][:4],row[4][2::]
        print(bad_port)
        writeCSV.writerow(bad_port)

If you want to include the number of times each key occurred, you can easily include that in the program, too. I would refactor slightly to separate the reading and the writing.

import csv
import operator
from collections import Counter
with open(&#39;bad_voice_ports.csv&#39;) as readfile:
    readCSV = csv.reader(readfile)
    rows = []
    counts = Counter()
    for row in readCSV:
        rows.append([row[1][:4], row[4][2::]])
        counts[row[1][:4]] += 1
with open(&#39;bad_voice_portsnew20200103SORTED.csv&#39;, &#39;w&#39;) as writefile:
    writeCSV = csv.writer(writefile)
    for row in sorted(rows):
        print(row)
        writeCSV.writerow([counts[row[0]]] + row)

I would recommend to remove the header line from the CSV file entirely; throwing away (or separating out and prepending back) the first line should be an easy change if you want to keep it.
(Also, hard-coding input and output file names is problematic; maybe have the program read them from sys.argv[1:] instead.)

答案2

得分: 0

我的建议相当简单。正如我在先前的评论中所述，Python 中有关于 CSV 读写的良好文档，您可以在这里找到：https://realpython.com/python-csv/

例如，要从 CSV 中读取所需的列，您可以简单地执行以下操作：

>>> file = open('some.csv', mode='r')
>>> csv_reader = csv.DictReader(file)
>>> for line in csv_reader:
...     print(line["Dslam_Name"] + " " + line["Ani"])
...

这将返回：

aart-m1-m1 302xxxxxxxxx

现在您可以轻松创建一个变量，将列值存储在其中，然后稍后将它们写入文件，或者在读取行的同时打开一个新文件并将列值写入其中。希望这对您有所帮助。

英文:

So my suggestion is failry simple. As i stated in a previous comment there is good documentation on CSV read and write in python here: https://realpython.com/python-csv/

As per an example, to read from a csv the columns you need you can simply do this:

&gt;&gt;&gt; file = open(&#39;some.csv&#39;, mode=&#39;r&#39;)
&gt;&gt;&gt; csv_reader = csv.DictReader(file)
&gt;&gt;&gt; for line in csv_reader:
...     print(line[&quot;Dslam_Name&quot;] + &quot; &quot; + line[&quot;Ani&quot;])
...

This would return:

aart-m1-m1 302xxxxxxxxx

Now you can just as easilly create a variable and store the column values there and later write them to a file or just open up a new file wile reading lines and writing the column values in there. I hope this helps you.

答案3

得分: 0

在@tripleee和@marxmacher的帮助下，我的最终代码如下：

import csv
import operator
from collections import Counter
with open('bad_voice_ports.csv') as csv_file:
    readCSV = csv.reader(csv_file, delimiter=',')
    sortedlist = sorted(readCSV, key=operator.itemgetter(1))
    line_count = 0
    rows = []
    counts = Counter()
    for row in sortedlist:
        Dslam = row[1][:4]
        Ani = row[4][2:]
        if line_count == 0:
            print(row[1], row[4])
            line_count += 1
        else:
            rows.append([row[1][:4], row[4][2::]])
            counts[row[1][:4]] += 1
            print(Dslam, Ani)
            line_count += 1
    for row in sorted(rows):
        f = open("bad_voice_portsnew202001061917.xls","a+")
        f.write(row[0] + '\t' + row[1] + '\t' + str(counts[row[0]]) + '\n')
        f.close()
    print('Total of Bad ports =', str(line_count-1))

通过这种方式，可以从初始的 CSV 文件中提取所需的值/列，并生成一个新的 xls 文件，其中所需的值存储在不同的列中，同时计算每个键的总值，以及条目的总数。

感谢所有的帮助，请随时提出任何改进建议！

英文:

After the help from @tripleee and @marxmacher my final code is

import csv
import operator
from collections import Counter
with open(&#39;bad_voice_ports.csv&#39;) as csv_file:
    readCSV = csv.reader(csv_file, delimiter=&#39;,&#39;)
    sortedlist = sorted(readCSV, key=operator.itemgetter(1))
    line_count = 0
    rows = []
    counts = Counter()
    for row in sortedlist:
        Dslam = row[1][:4]
        Ani = row[4][2:]
        if line_count == 0:
            print(row[1], row[4])
            line_count += 1
        else:
            rows.append([row[1][:4], row[4][2::]])
            counts[row[1][:4]] += 1
            print(Dslam, Ani)
            line_count += 1
    for row in sorted(rows):
        f = open(&quot;bad_voice_portsnew202001061917.xls&quot;,&quot;a+&quot;)
        f.write(row[0] + &#39;\t&#39; + row[1] + &#39;\t&#39; + str(counts[row[0]]) + &#39;\n&#39;)
        f.close()
    print(&#39;Total of Bad ports =&#39;, str(line_count-1))

As with this way the desired values/columns are extracted from the initial csv file and a new xls file is generated with the desired values stored in different columns and the total values per key are counted, along with the total of entries.

Thanks for all the help, please feel free for any improvement suggestions!

答案4

得分: -1

你可以使用 sorted：

import csv
_h, *data = csv.reader(open('filename.csv'))
with open('new_csv.csv', 'w') as f:
  write = csv.writer(f)
  csv.writerows([_h, *sorted([(i[1], i[4]) for i in data], key=lambda x:x[0])])

英文:

You can use sorted:

import csv
_h, *data = csv.reader(open(&#39;filename.csv&#39;))
with open(&#39;new_csv.csv&#39;, &#39;w&#39;) as f:
  write = csv.writer(f)
  csv.writerows([_h, *sorted([(i[1], i[4]) for i in data], key=lambda x:x[0])])

通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库，让每个人都能够通过互相帮助和分享经验来进步。

从CSV中提取两列并排序。

问题

答案1

答案2

答案3

答案4

尝试读取CSV文件时，但扩展名似乎不可见。

没有合作伙伴API，是否有发送Viber品牌/商务消息的其他方法？

如何在Python中定义自定义可调用类型

Show multiple details in console, but not in textarea. How to show specific lines in textarea

如何在Playwright视觉比较中屏蔽多个定位器？

在C++中，可以使用可变模板参数来检索类型的内部类型。

selenium.common.exceptions.StaleElementReferenceException: Message: stale element reference: stale element not found

Creating and opening a URL to log in to Website via Basic Auth with Robot Framework/Selenium (Python)

AG Grid 在上下文菜单中以大文本形式打开

What's the correct way to type hint an empty list as a literal in python?

如何在Highcharts Gantt中更改本地化的星期名称

如何在同一个流中使用多个过滤器和映射函数？

如何使用Map/Set来将代码优化到O(n)？

.NET MAUI Android在GitHub Actions上构建失败，错误代码为1。