How to convert T-SQL CASE WHEN STATEMENT to Databricks sparkSQL
Question
I am trying to write a typical T-SQL CASE WHEN statement in Spark SQL.
The T-SQL CASE WHEN statement looks as follows:
SELECT
*
,CASE
WHEN DatesTestv3.as_of_date = EOMONTH(DATEADD(MONTH, -1, SYSDATETIME())) THEN 'LM'
END AS 'Last Month End'
from dbo.DatesTestv3
My attempt at converting the above T-SQL to Spark SQL is as follows:
spark.sql("select *, case when as_of_date = last_day(add_months(current_date(),-1)) then 'LM' END AS 'Last Month End' from table")
However, I'm getting a ParseException.
I'm sure it's probably a simple syntax issue.
Any thoughts on how to convert the T-SQL to Spark SQL?
Answer 1
Score: 1
As far as I know the same CASE statement works in PySpark; I tried it in my environment.
I created a table with the following columns:
id, as_of_date, last_month_end
from pyspark.sql.functions import to_date
spark.sql("CREATE TABLE data020 (id INT, as_of_date DATE, last_month_end STRING)")
data = [
(1, "2023-06-01", "MA"),
(2, "2023-06-01", "MA"),
(3, "2023-06-01", "MA"),
(4, "2023-06-01", "MA"),
(5, "2023-06-01", "MA"),
(6, "2023-06-01", "MA"),
(7, "2023-06-01", "MA"),
(8, "2023-06-01", "MA"),
(9, "2023-06-01", "MA"),
(10, "2023-06-01", "MA")
]
df = spark.createDataFrame(data, ["id", "as_of_date", "last_month_end"])
df = df.withColumn("as_of_date", to_date(df.as_of_date))
df.write.mode("overwrite").insertInto("data020")
I tried the CASE statement in two different ways, shown below.
from pyspark.sql.functions import last_day, date_sub, current_date, when
data = spark.table("data020")
data = data.withColumn("Last Month End", when(data.as_of_date == last_day(date_sub(current_date(), 1)), "LM").otherwise(None))
data.display()
And
result = spark.sql("""
SELECT *,
CASE
WHEN as_of_date = last_day(date_add(current_date(), -1)) THEN 'LM'
ELSE NULL
END AS `Last Month End`
FROM data020
""")
result.display()
You can also try the date_trunc function.