2023-07-11 01:13:45

Bokeh: unable to display data with a month axis

Question

I use SQL and Bokeh for data visualization. Here's the context: I need to draw a line chart of the occurrences of words related to job names. The goal is monthly granularity, so I can follow the evolution over time.

I first tested a daily granularity with the following code:

# Working case
import psycopg2
from bokeh.plotting import figure, show
import numpy as np
from bokeh.palettes import Category10_10 as palette
import itertools
# Database connection
conn = psycopg2.connect(database="data", user="postgres", password="privatepassword", host="localhost", port="5432")
cursor = conn.cursor()
cursor.execute("SELECT date_of_search, job_search, COUNT(*) FROM occurrences GROUP BY date_of_search, job_search ORDER BY date_of_search DESC")
results = cursor.fetchall()
cursor.close()
conn.close()
# Data preparation
dates = [np.datetime64(row[0]) for row in results]
job_searches = [row[1] for row in results]
counts = [row[2] for row in results]
# Chart options
plot = figure(x_axis_type="datetime", title="occurrences evolution over time",
              x_axis_label="date", y_axis_label="occurrences",
              sizing_mode="stretch_width",
              height=700)
# Trying color palette
colors = itertools.cycle(palette) 
for job_search in list(set(job_searches)):
    job_dates = [date for date, job, count in zip(dates, job_searches, counts) if job == job_search]
    job_counts = [count for date, job, count in zip(dates, job_searches, counts) if job == job_search]
    plot.line(job_dates, job_counts, line_width=3, legend_label=job_search, color=next(colors))

The result works perfectly:

So I tried changing the granularity in my SQL query, but the X-axis became unusable. I tried changing the date extraction in the SQL query in various ways, and I also tried converting the X-axis values to months via .astype('datetime64[M]'):

job_months = np.unique([date.astype('datetime64[M]') for date, job, count in zip(dates, job_searches, counts) if job == job_search])

Unfortunately, nothing works. Here's the (almost identical) non-functional code:

# Not working case
import psycopg2
from bokeh.plotting import figure, show
from bokeh.palettes import Category10_10 as palette
import numpy as np
import itertools
# Database connection
conn = psycopg2.connect(database="data", user="postgres", password="privatepassword", host="localhost", port="5432")
cursor = conn.cursor()
cursor.execute("""
               SELECT DATE(DATE_TRUNC('month', date_of_search)), job_search, COUNT(*)
               FROM occurrences
               WHERE DATE_TRUNC('month', date_of_search) != DATE_TRUNC('month', current_date)
               GROUP BY DATE_TRUNC('month', date_of_search), job_search
               ORDER BY DATE_TRUNC('month', date_of_search) DESC
               """)
results = cursor.fetchall()
cursor.close()
conn.close()
# Data preparation
dates = [np.datetime64(row[0]) for row in results]
job_searches = [row[1] for row in results]
counts = [row[2] for row in results]
   
# Chart options
plot = figure(x_axis_type="datetime", title="occurrences evolution over time",
              x_axis_label="date", y_axis_label="occurrences",
              sizing_mode="stretch_width",
              height=700)
# Trying color palette
colors = itertools.cycle(palette) 
for job_search in list(set(job_searches)):
    job_dates = [date for date, job, count in zip(dates, job_searches, counts) if job == job_search]
    job_counts = [count for date, job, count in zip(dates, job_searches, counts) if job == job_search]
    plot.line(job_dates, job_counts, line_width=3, legend_label=job_search, color=next(colors))

This is the only result I can get:

Do you have any idea? I'm stuck.

The WHERE DATE_TRUNC('month', date_of_search) != DATE_TRUNC('month', current_date) clause in the SQL statement is there because I don't want data for the unfinished month (always the current one).
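
For anyone who wants to check the monthly bucketing outside the database, here is a minimal sketch of the same idea in plain Python. It assumes rows shaped like (date, job_search) pairs. The helper name monthly_counts and the sample rows are made up for illustration and are not part of the original code.

```python
from collections import Counter
from datetime import date


def monthly_counts(rows, today):
    """Count rows per (first day of month, job), skipping the current month.

    Mirrors DATE_TRUNC('month', ...) plus the WHERE filter from the query.
    `rows` is an iterable of (date, job_search) pairs (hypothetical shape).
    """
    current_month = today.replace(day=1)
    counts = Counter()
    for day, job in rows:
        month = day.replace(day=1)  # truncate to the first day of the month
        if month != current_month:  # drop the unfinished month
            counts[(month, job)] += 1
    return counts


# Made-up sample data, for illustration only
rows = [
    (date(2023, 6, 3), "Data Analyst"),
    (date(2023, 6, 17), "Data Analyst"),
    (date(2023, 6, 17), "Data Engineer"),
    (date(2023, 7, 2), "Data Analyst"),
]
result = monthly_counts(rows, today=date(2023, 7, 11))
for (month, job), n in sorted(result.items()):
    print(month, job, n)
```

With this sample, every surviving row falls in the same month, so each job ends up with exactly one (month, count) pair.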

Answer 1

Score: 0

OK, I figured it out.

It's simply because there was data for only one month, so each line had a single point... I tried adding some data for July to check, like this:

(SELECT DATE(DATE_TRUNC('month', date_of_search)), job_search, COUNT(*)
 FROM occurrences
 GROUP BY DATE_TRUNC('month', date_of_search), job_search
 ORDER BY DATE_TRUNC('month', date_of_search) DESC)
UNION
(SELECT DATE('2023-07-01') AS date, 'Data Analyst' AS job_search, '487' AS count)
UNION
(SELECT DATE('2023-07-01') AS date, 'Data Engineer' AS job_search, '1202' AS count)

And it works...

My bad. Maybe this topic will help someone someday.
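
To make this easier to spot next time, one option is to group the rows per job before plotting and flag any series with fewer than two distinct dates, since a line needs at least two points to show a visible segment. This is a minimal sketch, assuming rows shaped like the (date, job_search, count) tuples returned by cursor.fetchall(). The helper names group_series and single_point_jobs are hypothetical, and the June counts are made up (the July ones come from the query above).

```python
from collections import defaultdict
from datetime import date


def group_series(rows):
    """Group (date, job, count) rows into {job: [(date, count), ...]}, sorted by date."""
    series = defaultdict(list)
    for day, job, count in rows:
        series[job].append((day, count))
    return {job: sorted(points) for job, points in series.items()}


def single_point_jobs(series):
    """Return the jobs whose series has fewer than two distinct dates."""
    return sorted(
        job for job, points in series.items()
        if len({day for day, _ in points}) < 2
    )


# One month of data only: June counts are invented for illustration
rows = [
    (date(2023, 6, 1), "Data Analyst", 512),
    (date(2023, 6, 1), "Data Engineer", 1130),
]
flagged = single_point_jobs(group_series(rows))
print(flagged)

# Same data plus the July rows added in the test query
rows_with_july = rows + [
    (date(2023, 7, 1), "Data Analyst", 487),
    (date(2023, 7, 1), "Data Engineer", 1202),
]
flagged_after = single_point_jobs(group_series(rows_with_july))
print(flagged_after)
```

If a series is flagged, drawing markers in addition to the line (for example with plot.scatter(xs, ys, size=8)) should keep the single data point visible until more months accumulate.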
