Transforming annotated csv (influxdb) to normal csv file using python script
Question
I have a CSV file that was downloaded from InfluxDB UI. I want to extract useful data from the downloaded file. A snippet of the downloaded file is as follows:
#group FALSE FALSE TRUE TRUE FALSE FALSE TRUE TRUE TRUE TRUE TRUE
#datatype string long dateTime:RFC3339 dateTime:RFC3339 dateTime:RFC3339 double string string string string string
#default mean
result table _start _stop _time _value _field _measurement smart_module serial type
0 2023-03-31T08:12:40.697076925Z 2023-03-31T09:12:40.697076925Z 2023-03-31T08:20:00Z 0 sm_alarm system_test 8 2.14301E+11 sm_extended
0 2023-03-31T08:12:40.697076925Z 2023-03-31T09:12:40.697076925Z 2023-03-31T08:40:00Z 0 sm_alarm system_test 8 2.14301E+11 sm_extended
0 2023-03-31T08:12:40.697076925Z 2023-03-31T09:12:40.697076925Z 2023-03-31T09:00:00Z 0 sm_alarm system_test 8 2.14301E+11 sm_extended
0 2023-03-31T08:12:40.697076925Z 2023-03-31T09:12:40.697076925Z 2023-03-31T09:12:40.697076925Z 0 sm_alarm system_test 8 2.14301E+11 sm_extended
I'd like to have the output CSV as follows:
_time sm_alarm next_column next_column ....... ...........
2023-03-29T08:41:15Z 0
Please note that sm_alarm is only one field among 9 others (all listed under _field).
I tried the following script, but it did not solve my problem.
import csv

# Specify the input and output file names
input_file = 'influx.csv'
output_file = 'output.csv'

try:
    # Open the input file for reading
    with open(input_file, 'r') as csv_file:
        # Create a CSV reader object
        csv_reader = csv.reader(csv_file)
        # Skip the first row (header)
        next(csv_reader)
        # Open the output file for writing
        with open(output_file, 'w', newline='') as output_csv:
            # Create a CSV writer object
            csv_writer = csv.writer(output_csv)
            # Write the header row
            csv_writer.writerow(['_time', '_field', '_value'])
            # Iterate over the input file and write the rows to the output file
            for row in csv_reader:
                # Check if the row is not empty
                if row:
                    # Split the fields
                    fields = row[0].split(',')
                    # Write the row to the output file
                    csv_writer.writerow(fields)
    print(f'{input_file} converted to {output_file} successfully!')
except FileNotFoundError:
    print(f'Error: File {input_file} not found.')
except Exception as e:
    print(f'Error: {e}')
Thank you.
Answer 1
Score: 1
The format of your expected output is ambiguous.
But as a starting point, you can straighten your file with read_csv from pandas this way:
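import pandas as pd

# The fourth line holds the column names; the first one ("result") has no value in the data rows
with open("influx.csv", "r") as csv_file:
    headers = csv_file.readlines()[3].strip().split()[1:]

# Skip the annotation and header rows, split on whitespace, then drop the leading "table" column
df = pd.read_csv("influx.csv", header=None, skiprows=4, sep=r"\s+",
                 engine="python", names=headers).iloc[:, 1:]
#df.to_csv("output.csv", index=False, sep=",") # <- uncomment this line to make a real csv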
Output:
print(df)
_start _stop _time _value _field _measurement smart_module serial type
0 2023-03-31T08:12:40.697076925Z 2023-03-31T09:12:40.697076925Z 2023-03-31T08:20:00Z 0 sm_alarm system_test 8 2.143010e+11 sm_extended
1 2023-03-31T08:12:40.697076925Z 2023-03-31T09:12:40.697076925Z 2023-03-31T08:40:00Z 0 sm_alarm system_test 8 2.143010e+11 sm_extended
2 2023-03-31T08:12:40.697076925Z 2023-03-31T09:12:40.697076925Z 2023-03-31T09:00:00Z 0 sm_alarm system_test 8 2.143010e+11 sm_extended
3 2023-03-31T08:12:40.697076925Z 2023-03-31T09:12:40.697076925Z 2023-03-31T09:12:40.697076925Z 0 sm_alarm system_test 8 2.143010e+11 sm_extended
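The snippet in the question is shown with whitespace between columns, which is why the code above splits on whitespace. If the downloaded file is instead the comma-delimited form of InfluxDB annotated CSV, a possibly simpler route is to let read_csv skip the annotation rows through its comment parameter. The sketch below is only an illustration: SAMPLE_CSV is a hypothetical, shortened sample whose layout (annotation rows starting with "#", an empty leading column) is an assumption, not the asker's actual file.

```python
import io

import pandas as pd

# Hypothetical, shortened comma-delimited sample (assumed layout, not the real file).
SAMPLE_CSV = """\
#group,false,false,true,true,false,false,true
#datatype,string,long,dateTime:RFC3339,dateTime:RFC3339,dateTime:RFC3339,double,string
#default,mean,,,,,,
,result,table,_start,_stop,_time,_value,_field
,,0,2023-03-31T08:12:40Z,2023-03-31T09:12:40Z,2023-03-31T08:20:00Z,0,sm_alarm
,,0,2023-03-31T08:12:40Z,2023-03-31T09:12:40Z,2023-03-31T08:40:00Z,0,sm_alarm
"""

# Lines starting with "#" are treated as comments and skipped, so the first
# remaining line (",result,table,...") becomes the header row.
df = pd.read_csv(io.StringIO(SAMPLE_CSV), comment="#")

# Keep only the columns that matter for the requested output.
df = df[["_time", "_field", "_value"]]
print(df)
```

With a real file, io.StringIO(SAMPLE_CSV) would be replaced by the file path. This assumes the export contains a single header block; a file with several tables, each with its own annotations, would need extra handling.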
If you share a clear expected output, I'll update my answer accordingly.
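One plausible reading of the requested layout is one row per _time and one column per _field value. The following is a minimal sketch under that assumption, not a confirmed answer to the question. SAMPLE and the second field sm_temp are invented for illustration, and pivot_fields is a hypothetical helper name; the parsing step mirrors the whitespace-based approach above.

```python
import io

import pandas as pd

# Hypothetical sample in the whitespace-separated layout from the question.
# The sm_temp rows are invented so that the pivot has more than one column.
SAMPLE = """\
#group FALSE FALSE TRUE TRUE FALSE FALSE TRUE TRUE TRUE TRUE TRUE
#datatype string long dateTime:RFC3339 dateTime:RFC3339 dateTime:RFC3339 double string string string string string
#default mean
result table _start _stop _time _value _field _measurement smart_module serial type
0 2023-03-31T08:12:40Z 2023-03-31T09:12:40Z 2023-03-31T08:20:00Z 0 sm_alarm system_test 8 2.14301E+11 sm_extended
0 2023-03-31T08:12:40Z 2023-03-31T09:12:40Z 2023-03-31T08:40:00Z 0 sm_alarm system_test 8 2.14301E+11 sm_extended
1 2023-03-31T08:12:40Z 2023-03-31T09:12:40Z 2023-03-31T08:20:00Z 21.5 sm_temp system_test 8 2.14301E+11 sm_extended
1 2023-03-31T08:12:40Z 2023-03-31T09:12:40Z 2023-03-31T08:40:00Z 22.0 sm_temp system_test 8 2.14301E+11 sm_extended
"""


def pivot_fields(text: str) -> pd.DataFrame:
    """Parse the whitespace-separated export and return one column per _field."""
    # The fourth line holds the column names; "result" has no value in the
    # data rows, so it is dropped to keep names and values aligned.
    headers = text.splitlines()[3].split()[1:]
    df = pd.read_csv(io.StringIO(text), header=None, skiprows=4,
                     sep=r"\s+", engine="python", names=headers)
    # Long to wide: each distinct _field becomes its own column, indexed by _time.
    # aggfunc="first" keeps the first value if a (_time, _field) pair repeats.
    wide = df.pivot_table(index="_time", columns="_field",
                          values="_value", aggfunc="first")
    wide.columns.name = None
    return wide.reset_index()


wide = pivot_fields(SAMPLE)
print(wide)
# wide.to_csv("output.csv", index=False)  # uncomment to write the result to disk
```

If the same timestamp can legitimately carry several values for one field (for example across different serial numbers), the extra tag columns would have to be added to the index, otherwise only the first value survives.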