Problem reading a CSV file whose extension does not appear
Question
I'm trying to open a CSV file from the Spotify database to work with its Sequential Skip Prediction dataset. However, the columns and rows come out wrong and I can't figure out how to fix them. The most I could do was open the table while ignoring some errors, but the result is still badly garbled.
The link is this:
Link for Spotify Sequential Skip Prediction Challenge
The file I'm trying to access is this:
Training_Set_And_Track_Features_Mini (17.2 MB)
I used this code, which is as far as I got in opening it:
import pandas as pd
# Path to the CSV file
file_path = '/content/drive/MyDrive/TESTE TCC/training mini/16772e7f-7871-4d42-a44f-5f399f40fd94_training_set_track_features_mini'
# Open CSV
data = pd.read_csv(file_path, encoding='latin1', error_bad_lines=False)
# display the data
data
Even so, the output is still garbled and the columns all look wrong.
Answer 1
Score: 2
This is a tar.gz archive, not a plain CSV file, so you need to extract it before using it. Reading it directly makes pandas parse the compressed bytes as text, which is why the columns look wrong.
To extract it, run this shell command (adjust the file name to match your download):
tar -xzf 16772e7f-7871-4d42-a44f-5f399f40fd94_training_set_track_features_mini.tar.tar
This creates a data folder that contains the CSV files.
Then, from your Python script or notebook:
import pandas as pd
df_features = pd.read_csv('data/track_features/tf_mini.csv')
df_train = pd.read_csv('data/training_set/log_mini.csv')
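If running a shell command is inconvenient (for example inside a notebook), the same extraction can probably be done from Python with the standard-library tarfile module. The following is a minimal sketch, not part of the original answer: the helper name extract_csvs and the destination folder are assumptions made for illustration.

```python
import tarfile
from pathlib import Path


def extract_csvs(archive_path, dest_dir):
    """Extract a tar archive (compressed or not) and return the CSV files found.

    Hypothetical helper for illustration; the name is not from the original post.
    """
    archive_path = Path(archive_path)
    dest_dir = Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)

    # is_tarfile inspects the file contents, so a missing or misleading
    # extension does not matter.
    if not tarfile.is_tarfile(archive_path):
        raise ValueError(f"{archive_path} is not a tar archive")

    # Mode "r:*" lets tarfile detect the compression (gzip, bz2, xz, or none).
    with tarfile.open(archive_path, mode="r:*") as archive:
        if hasattr(tarfile, "data_filter"):
            # Newer Python versions provide a filter that rejects unsafe members.
            archive.extractall(dest_dir, filter="data")
        else:
            archive.extractall(dest_dir)

    # Collect every CSV file that ended up under the destination folder.
    return sorted(dest_dir.rglob("*.csv"))
```

Calling extract_csvs(file_path, 'extracted') with the path from the question should return the paths of the extracted CSV files, each of which can then be passed to pd.read_csv. If the function raises ValueError, the download is not a tar archive and needs to be inspected some other way.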