2023-02-27 02:40:51

How to iterate a function inside a list in which you want to separate the different items

Question

In the code below I download a file from a URL, open it, and get rows and columns of data. What I want is a loop (or something similar) that splits every row into its individual items.

Data downloaded:

['amount,duration,rate,down_payment\n', '100000,36,0.08,20000\n', '200000,12,0.1,\n', '628400,120,0.12,100000\n', '4637400,240,0.06,\n', '42900,90,0.07,8900\n', '916000,16,0.13,\n', '45230,48,0.08,4300\n', '991360,99,0.08,\n', '423000,27,0.09,47200']

The code:

import os
import urllib.request
url1 = 'https://gist.githubusercontent.com/aakashns/257f6e6c8719c17d0e498ea287d1a386/raw/7def9ef4234ddf0bc82f855ad67dac8b971852ef/loans1.txt'
os.makedirs('./data', exist_ok=True)  # Create a folder called 'data'
urllib.request.urlretrieve(url1, './data/datos1.txt')
with open('./data/datos1.txt') as file1:
    file1_lines = file1.readlines()
    print("OUR DATA\n")
    print(file1_lines)
    # ------INDIVIDUALIZED HEADER-----
    print("\nIndividualized header:\n")
    def titles(header_title):
        return header_title.strip().split(',')
    headers = titles(file1_lines[0])
    print(headers)
    # ------INDIVIDUALIZED VALUES-----
    print("\nindividualized values from file1:\n")
    def parse_values(data1):
        values = []
        for i in data1.strip().split(','):
            if i == '':
                values.append(0.0)  # Treat an empty field as 0.0
            else:
                try:
                    values.append(float(i))
                except ValueError:
                    values.append(i)  # Keep non-numeric text as-is
        return values
    # Only rows 1 to 3 are parsed here, one hard-coded index at a time
    for item in (1, 2, 3):
        new_data = parse_values(file1_lines[item])
        print(new_data)

I tried to write a loop that processes all the data at once (except the header), but it split everything into individual characters, for example [1234,678] --> [1,2,3,4,6,7,8].

So I thought of writing a final loop that calls the parse_values function repeatedly, but it does not work, maybe because the value between the brackets in file1_lines[] has to be a number. I don't know what else I can do.
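
The failing loop is not shown, so the following is only a guess at the cause: iterating over a single line (a string) yields one character at a time, whereas iterating over the list of lines yields whole lines that can then be split on commas. The sample values are copied from the data above; the variable names are made up for illustration.

```python
# One line of the file is a plain string.
line = '100000,36,0.08,20000\n'
# The file as returned by readlines() is a list of such strings.
lines = ['100000,36,0.08,20000\n', '200000,12,0.1,\n']

# Looping over a string visits characters, which matches the reported symptom.
chars = [c for c in line.strip()]

# Splitting the string on commas gives the fields instead.
fields = line.strip().split(',')

# Looping over the list visits whole lines, so each one can be split in turn.
per_line = [l.strip().split(',') for l in lines]

print(chars[:6])
print(fields)
print(per_line)
```

If the loop variable already holds a line, indexing file1_lines again inside the loop is not needed.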

Answer 1

Score: 1

I'm not sure exactly what is going wrong in your code, though it seems overcomplicated for this task.

The code below should accomplish what you want:

import os
import urllib.request
url1 = 'https://gist.githubusercontent.com/aakashns/257f6e6c8719c17d0e498ea287d1a386/raw/7def9ef4234ddf0bc82f855ad67dac8b971852ef/loans1.txt'
os.makedirs('./data', exist_ok=True)  # Create a folder called 'data'
urllib.request.urlretrieve(url1, './data/datos1.txt')
output = []
with open('./data/datos1.txt') as file1:
    for line in file1:
        entry = line.strip().split(',')
        output.append(entry)
        # Convert entry values to floats where possible. entry is the same
        # list object that was appended, so output sees the converted values.
        for i in range(len(entry)):
            try:
                entry[i] = float(entry[i])
            except ValueError:
                pass  # Leave non-numeric values as strings

We're just looping through the lines of the file, creating an entry for each one by splitting the line on ',', and then trying to convert each string value in the entry to a float. Values that cannot be converted, such as the header names and empty fields, are left as strings.
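
As a rough sketch of the same idea applied to the question's own parse_values function, the loop can slice off the header and visit every remaining line, so no hard-coded indexes are needed. The lines below are a shortened copy of the downloaded data, and mapping an empty field to 0.0 is the question's choice rather than something the answer's code does.

```python
# Shortened sample copied from the question's downloaded data.
file1_lines = [
    'amount,duration,rate,down_payment\n',
    '100000,36,0.08,20000\n',
    '200000,12,0.1,\n',
    '423000,27,0.09,47200',
]


def parse_values(line):
    """Split one comma-separated line and convert each field to a float."""
    values = []
    for field in line.strip().split(','):
        if field == '':
            values.append(0.0)  # assumption: an empty field means 0.0
        else:
            try:
                values.append(float(field))
            except ValueError:
                values.append(field)  # keep non-numeric text unchanged
    return values


# file1_lines[1:] skips the header and covers all remaining lines,
# however many there are.
data = [parse_values(line) for line in file1_lines[1:]]

for row in data:
    print(row)
```

The slice works for any number of rows, which is what a fixed tuple such as (1, 2, 3) cannot do.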

It is unusual to keep the header row at the top of what is otherwise a data array. Usually a data array holds only data of a single type.
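
One way to keep the header apart from the numeric rows, sketched here with the standard csv module rather than the manual split used above, is to read the first row separately. The read_loans name and the in-memory SAMPLE text are made up for illustration, and treating an empty field as 0.0 is an assumption carried over from the question.

```python
import csv
import io

# Stand-in for the downloaded file, using a few rows from the question.
SAMPLE = (
    "amount,duration,rate,down_payment\n"
    "100000,36,0.08,20000\n"
    "200000,12,0.1,\n"
    "423000,27,0.09,47200"
)


def read_loans(stream):
    """Return (headers, rows), with headers as strings and rows as floats."""
    reader = csv.reader(stream)
    headers = next(reader)  # the first row holds the column names
    rows = []
    for fields in reader:
        # assumption: an empty field is treated as 0.0
        rows.append([float(f) if f else 0.0 for f in fields])
    return headers, rows


headers, rows = read_loans(io.StringIO(SAMPLE))
print(headers)
for row in rows:
    print(row)
```

With a real file, the same function could be called on the object returned by open('./data/datos1.txt', newline='').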
