Plotting events as a single bar barplot

Question


I have done some network testing and ended up with a CSV file that records the connection dropouts in this format:

    seqstart,seqend,start date,start time,end date,end time,latency(ms)
    23,58,20/02/2023,17:38:12.524622,20/02/2023,17:38:30.024620,17.499998
    83,144,20/02/2023,17:38:42.524619,20/02/2023,17:39:13.024569,30.49995
    177,187,20/02/2023,17:39:29.524621,20/02/2023,17:39:34.524625,5.000004
    188,217,20/02/2023,17:39:35.024591,20/02/2023,17:39:49.524621,14.50003
    4011,4044,20/02/2023,18:11:26.524624,20/02/2023,18:11:43.024625,16.500001
    4131,4163,20/02/2023,18:12:26.524627,20/02/2023,18:12:42.524625,15.999998
    4191,4223,20/02/2023,18:12:56.524627,20/02/2023,18:13:12.524627,16.0
    4461,4523,20/02/2023,18:15:11.524626,20/02/2023,18:15:42.524626,31.0
    16671,16733,20/02/2023,19:56:56.524634,20/02/2023,19:57:27.524628,30.999994

I want to illustrate this in a format similar to this:

[image]

With help from ChatGPT I managed to get this far:

[image]

This is the code that produced it:

    import pandas as pd
    import matplotlib.pyplot as plt

    # Load the CSV file into a pandas DataFrame
    df = pd.read_csv('filename.csv')

    # Combine start-date and start-time into a single datetime column
    df['start'] = pd.to_datetime(df['start date'] + ' ' + df['start time'], format='%d/%m/%Y %H:%M:%S.%f')

    # Combine end-date and end-time into a single datetime column
    df['end'] = pd.to_datetime(df['end date'] + ' ' + df['end time'], format='%d/%m/%Y %H:%M:%S.%f')

    # Calculate the duration of each event in minutes
    df['duration'] = (df['end'] - df['start']).dt.total_seconds() / 60

    # Sort the events by start time
    df = df.sort_values(by='start')

    # Create a horizontal bar chart with start time on the y-axis and duration on the x-axis
    plt.barh(df['start'], df['duration'], height=0.5)

    # Set the y-axis label to be the start date
    plt.ylabel('Start Date')

    # Set the x-axis label to be the duration in minutes
    plt.xlabel('Duration (minutes)')

    # Show the plot
    plt.show()

How can I alter this code to get the illustration shown at the top? I tried asking ChatGPT further, but it did not give me a working result.

Optionally: the data was collected over a period of days, so to make the plot more readable it might be better to show only 6 or 12 hours on the x-axis and then start a new row, as shown below:

[image]
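The wrapping described here comes down to turning each timestamp into a row number and an offset within that row. A minimal sketch, assuming a hypothetical helper `wrap_position` and a chosen `origin` timestamp for the left edge of the first row (neither appears in the original code):

```python
import pandas as pd


def wrap_position(ts, origin, hours_per_row=12):
    """Return (row, offset_hours) for a timestamp on a wrapped timeline.

    `origin` is the timestamp at the left edge of row 0 (an assumption of
    this sketch); `hours_per_row` is the width of one row, e.g. 6 or 12.
    """
    elapsed_hours = (ts - origin) / pd.Timedelta(hours=1)
    row, offset = divmod(elapsed_hours, hours_per_row)
    return int(row), offset


origin = pd.Timestamp("2023-02-20 00:00:00")
row, offset = wrap_position(pd.Timestamp("2023-02-20 18:00:00"), origin)
print(row, offset)
```

Each dropout could then be drawn on the row given by its start time, at the horizontal position given by the offset.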

Answer 1

Score: 1


This assumes you manage to rebuild your dataframe so that each day's row alternates between the duration without dropouts and the duration with dropouts. Days with fewer dropouts than the day with the most dropouts are padded with np.nan values. Note that all durations in a row should add up to 24 h.

    import pandas as pd
    import numpy as np
    from itertools import cycle
    import matplotlib.pyplot as plt

    # One row per day; columns alternate between hours without and with dropouts
    df = pd.DataFrame([
        [10, 2, 1, 1, 10],
        [10, 3, 11, np.nan, np.nan],
    ], index=["day1", "day2"])

    # One colour per column, alternating green (no dropout) and red (dropout)
    color_cycles = cycle(["green", "red"])
    colors = [next(color_cycles) for _ in range(df.shape[1])]

    df.plot.barh(stacked=True, color=colors)
    plt.legend([])  # hide the legend
    plt.show()

[image]
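The answer leaves the rebuilding step open. One possible way to derive the alternating table from the event intervals is sketched below. The helper name `alternating_durations` is hypothetical, and the sketch assumes datetime columns `start` and `end` as built in the question's code, events that do not overlap, and events that start and end on the same calendar day.

```python
import pandas as pd


def alternating_durations(events: pd.DataFrame) -> pd.DataFrame:
    """Build one row per day of alternating up/down durations in hours.

    Assumes datetime columns `start` and `end`, no overlapping events,
    and no event that crosses midnight.
    """
    one_hour = pd.Timedelta(hours=1)
    events = events.sort_values("start")
    days = events["start"].dt.normalize()  # midnight of each event's day
    rows = {}
    for day, group in events.groupby(days):
        cursor = day  # start each day at midnight
        durations = []
        for start, end in zip(group["start"], group["end"]):
            durations.append((start - cursor) / one_hour)  # time without dropout
            durations.append((end - start) / one_hour)     # time with dropout
            cursor = end
        # remaining dropout-free time until the next midnight
        durations.append((day + pd.Timedelta(days=1) - cursor) / one_hour)
        rows[day.date().isoformat()] = pd.Series(durations)
    # Series of different lengths are padded with NaN when combined
    return pd.DataFrame(rows).T


events = pd.DataFrame({
    "start": pd.to_datetime(["2023-02-20 06:00", "2023-02-20 12:00", "2023-02-21 10:00"]),
    "end": pd.to_datetime(["2023-02-20 08:00", "2023-02-20 13:00", "2023-02-21 11:00"]),
})
table = alternating_durations(events)
print(table)
```

The resulting table has the same shape as the hand-written example above, so it can be passed to `plot.barh(stacked=True, ...)` with one colour per column.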

Answer 2

Score: 1


This is a version that works with images.

Please note: there is a minimum error width, because your errors are so short that they would otherwise hardly ever be visible. Adjust it to your needs. Also, if an error starts on day one and ends on day two, it is only shown for day one.

[image]

    import pandas as pd
    from datetime import timedelta
    import numpy as np
    import matplotlib.pyplot as plt

    df = pd.read_csv(
        "data.csv",
        parse_dates=["start date", "end date"],
        dayfirst=True,  # dates in the CSV are dd/mm/yyyy
    )

    # add new column for start
    df["start minute of the day"] = df["start time"].apply(
        lambda x: timedelta(
            hours=int(x.split(":")[0]),
            minutes=int(x.split(":")[1]),
            seconds=float(x.split(":")[2])
        ).total_seconds() / 60
    )

    # add new column for end
    df["end minute of the day"] = df["end time"].apply(
        lambda x: timedelta(
            hours=int(x.split(":")[0]),
            minutes=int(x.split(":")[1]),
            seconds=float(x.split(":")[2])
        ).total_seconds() / 60
    )

    # list of (ordered) distinct days
    days = list(sorted(df["start date"].unique()))

    # create an image
    bar_height = 20  # px
    minute_width = 1  # px
    color_green = np.array([0, 191, 84])
    color_red = np.array([255, 20, 64])
    minimum_error_width = 50  # reduce to your need
    img_width = minute_width * 24 * 60  # one full day
    img_height = bar_height * len(days)
    img = np.ones((img_height, img_width, 3), np.uint8)

    # fill all green
    img = img * color_green

    for img_row_idx, day in enumerate(days):
        df_filtered = df[df['start date'] == day]
        for _, row in df_filtered.iterrows():
            # if you choose other intervals than minutes, adjust round accordingly
            start_index = int(row["start minute of the day"])
            end_index = int(row["end minute of the day"])
            end_index = min(max(end_index, start_index + minimum_error_width), img_width)
            # fill red
            img[
                img_row_idx * bar_height:(img_row_idx + 1) * bar_height,
                start_index:end_index
            ] = color_red

    plt.imshow(img)
    plt.yticks(
        ticks=np.linspace(bar_height / 2, bar_height * len(days) - bar_height / 2, num=len(days)),
        labels=[str(d)[:10] for d in days]
    )
    plt.xlabel("minute of day")
    plt.ylabel("day")
    plt.savefig("output.jpg")
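The limitation noted above (an error that crosses midnight is only drawn on its first day) could be worked around by splitting such intervals before plotting. A minimal sketch, assuming full start and end timestamps are available; the helper `split_at_midnight` is hypothetical and not part of the answer's code:

```python
import pandas as pd


def split_at_midnight(start, end):
    """Split the interval [start, end) into pieces that each stay within one day."""
    pieces = []
    cursor = start
    while cursor < end:
        next_midnight = cursor.normalize() + pd.Timedelta(days=1)
        piece_end = min(end, next_midnight)
        pieces.append((cursor, piece_end))
        cursor = piece_end
    return pieces


pieces = split_at_midnight(
    pd.Timestamp("2023-02-20 23:50:00"),
    pd.Timestamp("2023-02-21 00:10:00"),
)
for piece_start, piece_end in pieces:
    print(piece_start, piece_end)
```

Each piece could then be treated as its own row of the dataframe, so that the second piece is drawn on the following day's bar.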

Posted by huangapple on 2023-03-09 21:47:28
Link to this post: https://go.coder-hub.com/75685479.html