June 1, 2023, 19:16:06

Problem reading a CSV file whose extension does not appear

Question


I'm trying to open a CSV file from the Spotify Sequential Skip Prediction Challenge dataset. The file seems to have some problem with its columns and rows that I can't figure out how to fix. The most I could do was open the table while ignoring some errors, but the result is still very buggy.

The link is this:

Link for the Spotify Sequential Skip Prediction Challenge

The file I'm trying to access is this:

Training_Set_And_Track_Features_Mini (17.2 MB)

I used this code here, which is what I managed to do to open it:

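import pandas as pd
# Path to the downloaded file (note: it has no extension)
path_file = '/content/drive/MyDrive/TESTE TCC/training mini/16772e7f-7871-4d42-a44f-5f399f40fd94_training_set_track_features_mini'
# Open it as CSV, skipping malformed lines
# (error_bad_lines was removed in pandas 2.0; on_bad_lines='skip' is its replacement)
data = pd.read_csv(path_file, encoding='latin1', error_bad_lines=False)
# Display the data
data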

But even so, the result is still very buggy, and the columns all look wrong.

Answer 1

Score: 2


This is a tar.gz archive, which you need to extract before you can use it.

To do that, run this shell command:

tar -xzf 16772e7f-7871-4d42-a44f-5f399f40fd94_training_set_track_features_mini.tar.tar

This will create a data folder that contains the CSV files.
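
The same extraction can also be done from Python, which may be convenient inside a notebook. This is only a minimal sketch using the standard-library tarfile module; the helper name extract_archive and its dest_dir argument are made up for illustration, and the archive path is whatever your downloaded file is actually called. Only extract archives from a source you trust.

```python
import tarfile
from pathlib import Path


def extract_archive(archive_path, dest_dir="."):
    """Extract a (possibly compressed) tar archive and return its member names.

    Hypothetical helper for illustration; not part of the original answer.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    # Mode "r:*" lets tarfile detect the compression by itself
    # (gzip, bz2, xz, or none), so the file name's extension does not matter.
    with tarfile.open(archive_path, mode="r:*") as archive:
        names = archive.getnames()
        archive.extractall(dest)
    return names
```

Calling extract_archive(path_file) with the path from the question should then produce the same folder layout as the tar command, assuming the download really is the tar.gz archive.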

Then, from your Python script or notebook:

import pandas as pd
df_features = pd.read_csv('data/track_features/tf_mini.csv')
df_train = pd.read_csv('data/training_set/log_mini.csv')
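
Since the downloaded file shows no extension, it can help to confirm what it really is before handing it to pd.read_csv. Every gzip stream starts with the two bytes 0x1f 0x8b, so a quick check is to read those bytes. A small sketch; the function name looks_like_gzip is hypothetical:

```python
GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream (RFC 1952)


def looks_like_gzip(path):
    """Return True if the file at `path` starts with the gzip signature."""
    with open(path, "rb") as f:
        return f.read(2) == GZIP_MAGIC
```

If this returns True for the downloaded file, it is compressed binary data rather than plain CSV text, which would explain the broken rows and odd columns seen when reading it directly.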
