Transforming annotated csv (influxdb) to normal csv file using python script

huangapple go评论95阅读模式
英文:

Transforming annotated csv (influxdb) to normal csv file using python script

问题

以下是您要翻译的内容:

  1. import csv
  2. # Specify the input and output file names
  3. input_file = 'influx.csv'
  4. output_file = 'output.csv'
  5. try:
  6. # Open the input file for reading
  7. with open(input_file, 'r') as csv_file:
  8. # Create a CSV reader object
  9. csv_reader = csv.reader(csv_file)
  10. # Skip the first row (header)
  11. next(csv_reader)
  12. # Open the output file for writing
  13. with open(output_file, 'w', newline='') as output_csv:
  14. # Create a CSV writer object
  15. csv_writer = csv.writer(output_csv)
  16. # Write the header row
  17. csv_writer.writerow(['_time', '_field', '_value'])
  18. # Iterate over the input file and write the rows to the output file
  19. for row in csv_reader:
  20. # Check if the row is not empty
  21. if row:
  22. # Split the fields
  23. fields = row[0].split(',')
  24. # Write the row to the output file
  25. csv_writer.writerow(fields)
  26. print(f'{input_file} converted to {output_file} successfully!')
  27. except FileNotFoundError:
  28. print(f'Error: File {input_file} not found.')
  29. except Exception as e:
  30. print(f'Error: {e}')

如果您需要任何其他翻译,请告诉我。

英文:

I have a CSV file that was downloaded from InfluxDB UI. I want to extract useful data from the downloaded file. A snippet of the downloaded file is as follows:

  1. #group FALSE FALSE TRUE TRUE FALSE FALSE TRUE TRUE TRUE TRUE TRUE
  2. #datatype string long dateTime:RFC3339 dateTime:RFC3339 dateTime:RFC3339 double string string string string string
  3. #default mean
  4. result table _start _stop _time _value _field _measurement smart_module serial type
  5. 0 2023-03-31T08:12:40.697076925Z 2023-03-31T09:12:40.697076925Z 2023-03-31T08:20:00Z 0 sm_alarm system_test 8 2.14301E+11 sm_extended
  6. 0 2023-03-31T08:12:40.697076925Z 2023-03-31T09:12:40.697076925Z 2023-03-31T08:40:00Z 0 sm_alarm system_test 8 2.14301E+11 sm_extended
  7. 0 2023-03-31T08:12:40.697076925Z 2023-03-31T09:12:40.697076925Z 2023-03-31T09:00:00Z 0 sm_alarm system_test 8 2.14301E+11 sm_extended
  8. 0 2023-03-31T08:12:40.697076925Z 2023-03-31T09:12:40.697076925Z 2023-03-31T09:12:40.697076925Z 0 sm_alarm system_test 8 2.14301E+11 sm_extended

I'd like to have the output CSV as follows:

  1. _time sm_alarm next_column next_column ....... ...........
  2. 2023-03-29T08:41:15Z 0

Please note that sm_alarm is only one field among 9 others (that are under _filed).

I tried to do with the following script, but could not solve my problem.

  1. import csv
  2. # Specify the input and output file names
  3. input_file = 'influx.csv'
  4. output_file = 'output.csv'
  5. try:
  6. # Open the input file for reading
  7. with open(input_file, 'r') as csv_file:
  8. # Create a CSV reader object
  9. csv_reader = csv.reader(csv_file)
  10. # Skip the first row (header)
  11. next(csv_reader)
  12. # Open the output file for writing
  13. with open(output_file, 'w', newline='') as output_csv:
  14. # Create a CSV writer object
  15. csv_writer = csv.writer(output_csv)
  16. # Write the header row
  17. csv_writer.writerow(['_time', '_field', '_value'])
  18. # Iterate over the input file and write the rows to the output file
  19. for row in csv_reader:
  20. # Check if the row is not empty
  21. if row:
  22. # Split the fields
  23. fields = row[0].split(',')
  24. # Write the row to the output file
  25. csv_writer.writerow(fields)
  26. print(f'{input_file} converted to {output_file} successfully!')
  27. except FileNotFoundError:
  28. print(f'Error: File {input_file} not found.')
  29. except Exception as e:
  30. print(f'Error: {e}')

Thank you.

答案1

得分: 1

以下是翻译好的部分:

  1. import pandas as pd
  2. with open("influx.csv", "r") as csv_file:
  3. headers = csv_file.readlines()[3].strip().split()[1:]
  4. df = pd.read_csv("influx.csv", header=None, skiprows=4, sep="\s+",
  5. engine="python", names=headers).iloc[:, 1:]
  6. #print(df)
  1. print(df)
  2. _start _stop _time _value _field _measurement smart_module serial type
  3. 0 2023-03-31T08:12:40.697076925Z 2023-03-31T09:12:40.697076925Z 2023-03-31T08:20:00Z 0 sm_alarm system_test 8 2.143010e+11 sm_extended
  4. 1 2023-03-31T08:12:40.697076925Z 2023-03-31T09:12:40.697076925Z 2023-03-31T08:40:00Z 0 sm_alarm system_test 8 2.143010e+11 sm_extended
  5. 2 2023-03-31T08:12:40.697076925Z 2023-03-31T09:12:40.697076925Z 2023-03-31T09:00:00Z 0 sm_alarm system_test 8 2.143010e+11 sm_extended
  6. 3 2023-03-31T08:12:40.697076925Z 2023-03-31T09:12:40.697076925Z 2023-03-31T09:12:40.697076925Z 0 sm_alarm system_test 8 2.143010e+11 sm_extended
英文:

The format of your expected output is ambiguous and not fully clear.
But as a starting point, you can straighten your file with read_csv from [tag:pandas] this way :

  1. import pandas as pd
  2. with open("influx.csv", "r") as csv_file:
  3. headers = csv_file.readlines()[3].strip().split()[1:]
  4. df = pd.read_csv("influx.csv", header=None, skiprows=4, sep="\s+",
  5. engine="python", names=headers).iloc[:, 1:]
  6. #df.to_csv("output.csv", index=False, sep=",") # <- uncomment this line to make a real csv

Output :

  1. print(df)
  2. _start _stop _time _value _field _measurement smart_module serial type
  3. 0 2023-03-31T08:12:40.697076925Z 2023-03-31T09:12:40.697076925Z 2023-03-31T08:20:00Z 0 sm_alarm system_test 8 2.143010e+11 sm_extended
  4. 1 2023-03-31T08:12:40.697076925Z 2023-03-31T09:12:40.697076925Z 2023-03-31T08:40:00Z 0 sm_alarm system_test 8 2.143010e+11 sm_extended
  5. 2 2023-03-31T08:12:40.697076925Z 2023-03-31T09:12:40.697076925Z 2023-03-31T09:00:00Z 0 sm_alarm system_test 8 2.143010e+11 sm_extended
  6. 3 2023-03-31T08:12:40.697076925Z 2023-03-31T09:12:40.697076925Z 2023-03-31T09:12:40.697076925Z 0 sm_alarm system_test 8 2.143010e+11 sm_extended

If you share a clear expected ouptut, I'll update my answer accordingly.

huangapple
  • 本文由 发表于 2023年3月31日 20:20:44
  • 转载请务必保留本文链接:https://go.coder-hub.com/75898467.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定