April 4, 2023, 13:23:18

Looking for conditional string-appended values in csv.reader

Question

I have a vendor payables aging report I'm trying to automate which is provided as a .csv file exported from a financial system. In the report, a line called 'functional amount not invoiced' is listed, followed by a $xx.xx amount for each vendor on the list. Below is an example of the report output (with numbers changed):

1000000 Vendor1 USD PO Number 1/1/1900
Item1, Description 
100 Each $1.00
INV000000 1/1/1900 000 Each 100 0 $1.00 $24.00
0 0  $24.00
INV000001 1/1/1900 000 Each 50 0 $1.00 $10.50
0 0  $10.50
-------------------
Functional Amount Not Invoiced: $250.00
Amount Not Invoiced Less Returned: $250.00
1000001 Vendor2 USD PO2061994 6/2/2015
Item2, Description 30 Each $38.00
INV000002 7/23/2015 000 Each 9 0 $38.00 $342.00
0 0  $342.00
INV000003 7/23/2015 000 Each 7 0 $38.00 $266.00
0 0  $266.00
-------------------
Functional Amount Not Invoiced: $346,955.00
Amount Not Invoiced Less Returned: $1,245.00

I would like to know how I can parse a .csv file for all instances of 'Functional Amount Not Invoiced' greater than or equal to $10,000.00 and, in those cases, return the first two strings of that vendor's header line (in the case above, I would return 1000001 Vendor2). Here's my code so far:

import csv

companyList = {'1000000': 'Vendor1', ...}
with open('Vendor Report.csv', mode='r', encoding='latin1') as file:
    csvreader = csv.reader(file)
    for row in csvreader:
        print(' '.join(row))
        if 'Functional Amount Not Invoiced:' in row:
            ...

I've gotten to the ... part, and I know the logic is: if the amount after the string is at least $10,000.00, find the vendor ID and vendor name and return them. The goal is to have every vendor at or above $10,000.00 appended automatically to a list. My expected output would be as follows:

Vendor ID Vendor Name $346,955.00
...

Answer 1

Score: 1

IIUC, here is one option with pandas, using read_fwf and str.extract:

# pip install pandas
import pandas as pd

MIN_AMOUNT = 10000

df = pd.read_fwf("input.csv", header=None)

# Vendor header ("1000001 Vendor2"), forward-filled onto the rows below it
vendor_vals = df[0].str.extract(r"(\d+ [a-zA-Z]+\d+)", expand=False).ffill()

# Amount after the label; "\$" matches a literal dollar sign
fani_vals = (df.pop(0).str.extract(r"Functional Amount Not Invoiced: \$(.*)",
                expand=False).replace({",": ""}, regex=True).astype(float))

companyList = (
                df.assign(VENDOR = vendor_vals, FANI = fani_vals).dropna()
                  .loc[lambda df_: df_["FANI"].ge(MIN_AMOUNT)].to_dict("list")
               )

Output:

>>> print(companyList)
{'VENDOR': ['1000001 Vendor2'], 'FANI': [346955.0]}
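
One detail in a pattern like this is easy to get wrong: in a regular expression an unescaped `$` is an end-of-string anchor, so matching the literal dollar sign in the report needs `\$`. Here is a minimal sketch with the standard `re` module, using a sample line from the question. The helper name `extract_amount` is made up for illustration and is not part of the answer's code.

```python
import re

# Hypothetical helper for illustration; "\$" matches a literal dollar sign.
AMOUNT_RE = re.compile(r"Functional Amount Not Invoiced: \$(.*)")

def extract_amount(line):
    """Return the amount on a 'Functional Amount Not Invoiced' line, else None."""
    match = AMOUNT_RE.search(line)
    if match is None:
        return None
    # Drop only the thousands separators; float() handles the decimal part.
    return float(match.group(1).replace(",", ""))

line = "Functional Amount Not Invoiced: $346,955.00"
print(extract_amount(line))  # 346955.0

# With an unescaped "$" the pattern demands end-of-string before the amount,
# so it never matches this line.
print(re.search(r"Functional Amount Not Invoiced: $(.*)", line))  # None
```

Removing only the commas also avoids mangling amounts such as $10.05, which a broader "strip `.0`" substitution would turn into 105.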

Update:

If you need a DataFrame (for example, to write a .csv), use this:

df = pd.read_fwf("input.csv", header=None)
out = (
        df.join(df[0].str.extract(r"(\d+) ([a-zA-Z]+\d+)")
                .rename(columns={0: "VENDOR_ID", 1: "VENDOR_NAME"}).ffill())
          .assign(FANI = lambda df_: df_.pop(0).str.extract(r"Functional Amount Not Invoiced: \$(.*)",
                expand=False).replace({",": ""}, regex=True).astype(float))
          .dropna().loc[lambda df_: df_["FANI"].ge(MIN_AMOUNT)].reset_index(drop=True)
       )

Output:

>>> print(out)
  VENDOR_ID VENDOR_NAME      FANI
0   1000001     Vendor2  346955.0
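
For comparison with the line-by-line approach in the question, the same idea can be sketched with only the standard library: remember the most recent vendor header, and when a 'Functional Amount Not Invoiced' line appears, compare its amount with the threshold. This is a rough sketch, not a tested solution for the real report. The function name `vendors_over` and both regular expressions are assumptions based on the sample lines in the question (a numeric ID, a vendor name without spaces, then a three-letter currency code).

```python
import re
from decimal import Decimal

# Assumed line shapes, inferred from the sample report in the question.
VENDOR_RE = re.compile(r"^(\d+) (\S+) [A-Z]{3} ")
AMOUNT_RE = re.compile(r"^Functional Amount Not Invoiced: \$([\d,]+(?:\.\d+)?)")

def vendors_over(lines, minimum=Decimal("10000")):
    """Collect (vendor_id, vendor_name, amount) for totals >= minimum."""
    results = []
    current = None  # most recent vendor header seen so far
    for line in lines:
        line = line.strip()
        header = VENDOR_RE.match(line)
        if header:
            current = header.groups()
            continue
        total = AMOUNT_RE.match(line)
        if total and current is not None:
            amount = Decimal(total.group(1).replace(",", ""))
            if amount >= minimum:
                results.append((*current, amount))
    return results

# A shortened copy of the sample report from the question.
sample = [
    "1000000 Vendor1 USD PO Number 1/1/1900",
    "INV000000 1/1/1900 000 Each 100 0 $1.00 $24.00",
    "Functional Amount Not Invoiced: $250.00",
    "Amount Not Invoiced Less Returned: $250.00",
    "1000001 Vendor2 USD PO2061994 6/2/2015",
    "INV000002 7/23/2015 000 Each 9 0 $38.00 $342.00",
    "Functional Amount Not Invoiced: $346,955.00",
    "Amount Not Invoiced Less Returned: $1,245.00",
]

for vendor_id, vendor_name, amount in vendors_over(sample):
    print(f"{vendor_id} {vendor_name} ${amount:,.2f}")
# 1000001 Vendor2 $346,955.00
```

To run it against the real export, pass the open file object instead of the list, for example `with open('Vendor Report.csv', encoding='latin1') as file: vendors_over(file)`. If vendor names can contain spaces, the header pattern would need to be adjusted.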
