如何判断Snowflake中的表列名是否带有引号(”columnname”)创建?

huangapple go评论65阅读模式
英文:

How to tell if table column name was created with quotes ("columnname") in Snowflake?

问题

某些 Snowflake 表中的列是带引号创建的。我需要确定哪些是带引号创建的,哪些不是。

我查看了 INFORMATION_SCHEMA.COLUMNS,但无法从那里区分它们。

我应该在哪里查询以确定这种区别?

CREATE OR REPLACE TABLE TestSchemaTable1 (
    COLUMN1 TEXT,
    "column8" TEXT
);
英文:

Some of the columns in our Snowflake tables were created with quotes. I need to determine which were created with quotes and which ones were not.

I looked at INFORMATION_SCHEMA.COLUMNS, but I cannot tell any difference from here.

Where can I query to determine the difference?

CREATE OR REPLACE TABLE TestSchemaTable1 (
    COLUMN1 TEXT,
    "column8" TEXT
);

答案1

得分: 2

标识符要求

未引用的对象标识符:

  • 以字母 (A-Z, a-z) 或下划线 (“_”) 开头。

  • 只包含字母、下划线、十进制数字 (0-9) 和美元符号 (“$”)。

  • 以大写字符存储和解析 (例如,id 存储和解析为 ID)。

如果您在标识符周围加双引号 (例如,“My identifier with blanks and punctuation.”),则以下规则适用:

  • 在存储和解析标识符时保留标识符的大小写 (例如,“id” 存储和解析为 id)。

  • 标识符可以包含并以ASCII、扩展ASCII和非ASCII字符开头。

可以通过分析查询配置文件 SYSTEM$EXPLAIN_PLAN_JSON 来获取引用信息:

SET table_name = 'TestSchemaTable1';


SELECT REPLACE(s.value::TEXT, '"s~~".') AS column_name
      ,column_name RLIKE '^".*"$' AS is_quoted
FROM (
 SELECT SYSTEM$EXPLAIN_PLAN_JSON('SELECT * FROM '||$table_name||' AS "s~~" LIMIT 0')
) AS cte(c)
,TABLE(FLATTEN(INPUT=>PARSE_JSON(cte.c), PATH=>'Operations[0][0].expressions')) AS s;

对于输入表:

CREATE OR REPLACE TABLE TestSchemaTable1 (
    COLUMN1 TEXT,
    column2 TEXT,
    COLumn3 TEXT,
    "COLumn4" TEXT,
    "column8" TEXT,
    "COLUM-9" TEXT
);

输出:

COLUMN_NAME IS_QUOTED
COLUMN1 FALSE
COLUMN2 FALSE
COLUMN3 FALSE
"COLumn4" TRUE
"column8" TRUE
"COLUM-9" TRUE

编辑方法2:

使用Snowpark/_internal/utils中的正则表达式模式:

SNOWFLAKE_CASE_INSENSITIVE_QUOTED_ID_PATTERN = r'("([A-Z_][A-Z0-9_$]{0,255})")';

完整查询:

SELECT COLUMN_NAME, COLUMN_NAME NOT RLIKE '([A-Z_][A-Z0-9_$]{0,255})' AS IS_QUOTED
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME ILIKE 'TestSchemaTable1'
ORDER BY ORDINAL_POSITION;

/*
COLUMN_NAME IS_QUOTED
COLUMN1 FALSE
COLUMN2 FALSE
COLUMN3 FALSE
COLumn4 TRUE
column8 TRUE
COLUM-9 TRUE
*/
英文:

Identifier Requirements

> Unquoted object identifiers:
>
> - Start with a letter (A-Z, a-z) or an underscore (“_”).
>
> - Contain only letters, underscores, decimal digits (0-9), and dollar signs (“$”).
>
> - Are stored and resolved as uppercase characters (e.g. id is stored and resolved as ID).
>
> If you put double quotes around an identifier (e.g. “My identifier
> with blanks and punctuation.”), the following rules apply:
>
> - The case of the identifier is preserved when storing and resolving the identifier (e.g. "id" is stored and resolved as id).
>
> - The identifier can contain and start with ASCII, extended ASCII, and non-ASCII characters.

It is possible to retrieve information about quoting by analyzing query profile SYSTEM$EXPLAIN_PLAN_JSON:

SET table_name = 'TestSchemaTable1';


SELECT REPLACE(s.value::TEXT, '"s~~".') AS column_name
      ,column_name RLIKE '^".*"$' AS is_quoted
FROM (
 SELECT SYSTEM$EXPLAIN_PLAN_JSON('SELECT * FROM '||$table_name||' AS "s~~" LIMIT 0')
) AS cte(c)
,TABLE(FLATTEN(INPUT=>PARSE_JSON(cte.c), PATH=>'Operations[0][0].expressions')) AS s;

For input table:

CREATE OR REPLACE TABLE TestSchemaTable1 (
    COLUMN1 TEXT,
    column2 TEXT,
    COLumn3 TEXT,
    "COLumn4" TEXT,
    "column8" TEXT,
    "COLUM-9" TEXT
);

Output:

COLUMN_NAME IS_QUOTED
COLUMN1 FALSE
COLUMN2 FALSE
COLUMN3 FALSE
"COLumn4" TRUE
"column8" TRUE
"COLUM-9" TRUE

EDIT Approach 2:

Using the regexp pattern from Snowpark/_internal/utils:

SNOWFLAKE_CASE_INSENSITIVE_QUOTED_ID_PATTERN = r'("([A-Z_][A-Z0-9_$]{0,255})")'

Full query:

SELECT COLUMN_NAME, COLUMN_NAME NOT RLIKE '([A-Z_][A-Z0-9_$]{0,255})' AS IS_QUOTED
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME ILIKE 'TestSchemaTable1'
ORDER BY ORDINAL_POSITION;

/*
COLUMN_NAME	IS_QUOTED
COLUMN1	FALSE
COLUMN2	FALSE
COLUMN3	FALSE
COLumn4	TRUE
column8	TRUE
COLUM-9	TRUE
*/

答案2

得分: 1

以下是您要翻译的内容:

不管您是否使用引号创建列名,都不重要,除非列名不全为大写字母,或者列名包括其他不是下划线或美元符号的符号。Snowflake在没有双引号且双引号中有小写字母的情况下不区分大小写。

如果您使用双引号并且它们全是大写,那么该列名仍将不区分大小写。

我相信您实际上是在询问哪些列名区分大小写,或者是否有非美元符号或下划线符号。

以下是一个示例:

CREATE OR REPLACE TABLE test_names (
    lower_case_no_quotes VARCHAR,
    UPPER_CASE_NO_QUOTES VARCHAR,
    MixedCaseNoQuotes VARCHAR,
    "lower_case_quotes" VARCHAR,
    "UPPER_CASE_QUOTES" VARCHAR,
    "MixedCaseQuotes" VARCHAR,
    "column8" VARCHAR,
    "COLUMN-9" VARCHAR,
    "SYMBOL$#COLUMN" VARCHAR,
    OTHERCOLUMN1234 VARCHAR,
    dollar$ign VARCHAR,
    "WHITE SPACE" VARCHAR
);

要查询该表,您需要在某些列上加引号,但不是所有列:

SELECT 
     lower_case_no_quotes
    ,upper_case_no_quotes
    ,mixedcasenoquotes
    ,"lower_case_quotes" -- 您需要包括引号
    ,upper_case_quotes -- 由于在创建时全为大写,因此不需要引号
    ,"MixedCaseQuotes" -- 您需要包括引号
    ,"column8" -- 您需要包括引号
    ,"COLUMN-9" -- 需要引号
    ,"SYMBOL$#COLUMN" -- 需要引号
    ,othercolumn1234 -- 不需要引号
    ,dollar$ign -- 不需要引号
    ,"WHITE SPACE"
FROM test_names
;

您可以使用以下查询从信息模式中找到所有区分大小写的列的列表:

SELECT *
FROM information_schema.columns
WHERE column_name != UPPER(column_name)
   OR regexp_like(column_name, '.*[^a-zA-Z0-9$_].*')
;

在上面的示例中,您将只看到返回6个列:

  • lower_case_quotes
  • MixedCaseQuotes
  • column8
  • COLUMN-9
  • SYMBOL$#COLUMN
  • WHITE SPACE
英文:

It doesn't really matter if you created column names with quotes or not, UNLESS the column names are not all uppercase OR the column name include other symbols that are not underscores or dollar signs. Snowflake is case-insensitive unless there are double-quotes AND there are lowercase letters in those double-quotes.

If you use double quotes and they are all uppercase then that column name will still be case-insensitive.

I believe you are truly asking, which column names are case-sensitive OR if there are non-dollar sign or underscore symbols.

Here is an example:

CREATE OR REPLACE TABLE test_names (
    lower_case_no_quotes VARCHAR,
    UPPER_CASE_NO_QUOTES VARCHAR,
    MixedCaseNoQuotes VARCHAR,
    "lower_case_quotes" VARCHAR,
    "UPPER_CASE_QUOTES" VARCHAR,
    "MixedCaseQuotes" VARCHAR,
    "column8" VARCHAR,
    "COLUMN-9" VARCHAR,
    "SYMBOL$#COLUMN" VARCHAR,
    OTHERCOLUMN1234 VARCHAR,
    dollar$ign VARCHAR,
    "WHITE SPACE" VARCHAR
);

To be able to query that table you need to include quotes on some of those columns but not all:

SELECT 
     lower_case_no_quotes
    ,upper_case_no_quotes
    ,mixedcasenoquotes
    ,"lower_case_quotes" -- You need to include quotes
    ,upper_case_quotes -- No quotes needed since it was fully uppercase when created
    ,"MixedCaseQuotes" -- You need to include quotes
    ,"column8" -- You need to include quotes
    ,"COLUMN-9" -- Needs Quotes
    ,"SYMBOL$#COLUMN" -- Needs Quotes
    ,othercolumn1234 -- Doesn't need quotes
    ,dollar$ign -- Doesn't need quotes
    ,"WHITE SPACE"
FROM test_names
;

You can find a list of all case-sensitive columns from the information schema using a query like this:

SELECT *
FROM information_schema.columns
WHERE column_name != UPPER(column_name)
   OR regexp_like(column_name, '.*[^a-zA-Z0-9$_].*')
;

In the example above, you would only see 6 columns return:

  • lower_case_quotes
  • MixedCaseQuotes
  • column8
  • COLUMN-9
  • SYMBOL$#COLUMN
  • WHITE SPACE

huangapple
  • 本文由 发表于 2023年4月13日 23:12:40
  • 转载请务必保留本文链接:https://go.coder-hub.com/76007090.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定