"WITH x AS" ParseException error in a Databricks notebook

Question

I'm trying to JOIN two tables in a Databricks notebook. The first line of the SQL statement raises an error.

I can't determine why. The docs I've read say it's typically due to a typo, but that is not the case for me (at least not that I can see).

Error in SQL statement: ParseException: 
no viable alternative at input 'WITH mgt '(line 1, pos 8)

== SQL ==
WITH xxx AS(
--------^^^

Answer 1

Score: 1

This can be caused by characters that look like an ordinary space (0x20) but are actually something different. Unicode has quite a few of them, and stray formatting, especially from copy-paste, can corrupt a SQL query string.

For example:

val sql = "WITH mgt AS(select 1) select * from mgt"
spark.sql(sql)

org.apache.spark.sql.catalyst.parser.ParseException:
no viable alternative at input 'WITH mgt '(line 1, pos 8)

== SQL ==
WITH mgt AS(select 1) select * from mgt
--------^^^

Why did that occur? We can find out by looking at the exact byte representation of the SQL string:

sql.getBytes("utf-8").map("%02X".format(_)).mkString
57495448206D6774E2808041532873656C6563742031292073656C656374202A2066726F6D206D6774
                ^^^^^^

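The marked sequence, 0xE28080, is the UTF-8 encoding of U+2000 (En Quad), not a space. This also matches the error position: pos 8 is the character right after "WITH mgt". You can delete it and type an ordinary space to fix the query.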
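
If reading the hex dump by eye is tedious, the same check can be automated. The following is only a minimal sketch: findNonAscii and normalizeSpaces are hypothetical helper names, not part of Spark or Databricks, and the sketch assumes the offending character is a Unicode space separator (category Zs), as En Quad is.

```scala
// Hypothetical helper, not part of Spark: lists every non-ASCII character
// in a query together with its index and code point.
def findNonAscii(sql: String): Seq[(Int, String)] =
  sql.zipWithIndex.collect {
    case (c, i) if c > 127 => (i, f"U+${c.toInt}%04X")
  }

// Hypothetical helper: replaces every Unicode space separator (category Zs,
// which includes U+2000 En Quad and U+00A0 no-break space) with a plain space.
def normalizeSpaces(sql: String): String =
  sql.map(c => if (Character.getType(c) == Character.SPACE_SEPARATOR) ' ' else c)

// The query from the example above, with the En Quad written as an escape
// so that it is visible in the source.
val broken = "WITH mgt\u2000AS(select 1) select * from mgt"

val suspects = findNonAscii(broken)  // contains (8, "U+2000")
val fixed = normalizeSpaces(broken)  // safe to pass to spark.sql
```

Note that normalizeSpaces also rewrites lookalike spaces inside string literals in the query, so retyping the single character by hand may be the safer fix when the query contains such literals.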
