如何在Azure Databricks中搜索以“XYZ”开头的所有列/字段名称。

huangapple go评论52阅读模式
英文:

How do I search for all the columns/field names starting with "XYZ" in Azure Databricks

问题

我想对包含"XYZ"的所有字段/列名进行一次大规模搜索。

我尝试了以下SQL,但出现了错误。

SELECT
    table_name,
    column_name
FROM information_schema.columns
WHERE column_name LIKE '%XYZ%'
ORDER BY table_name, column_name;

错误信息显示:"表或视图未找到:information_schema.columns;第4行,位置5"。

英文:

I would like to do a big search on all field/columns names that contain "XYZ".

I tried below sql but it's giving me an error.

SELECT

table_name

,column_name

FROM information_schema.columns

WHERE column_name like '%account%'

order by table_name, column_name

ERROR states "Table or view not found: information_schema.columns; line 4, pos 5"

答案1

得分: 1

  • information_schema.columns 在 Databricks SQL 中不受支持。没有内置视图可用于获取包括列在内的表的完整详细信息。可以使用 SHOW TABLES(需要指定数据库) 和 SHOW COLUMNS(需要指定表名)。

  • 您可能需要使用 Pyspark 的功能来获取所需的结果。首先使用以下代码获取所有表和它们的列的详细信息:

db_tables = spark.sql(f"SHOW TABLES in default")
from pyspark.sql.functions import *

final_df = None
for row in db_tables.collect():
    if final_df is None:
        final_df = spark.sql(f"DESCRIBE TABLE {row.database}.{row.tableName}") \
            .withColumn('database', lit(f'{row.database}')) \
            .withColumn('tablename', lit(f'{row.tableName}')) \
            .select('database', 'tablename', 'col_name')
    else:
        final_df = final_df.union(spark.sql(f"DESCRIBE TABLE {row.database}.{row.tableName}") \
            .withColumn('database', lit(f'{row.database}')) \
            .withColumn('tablename', lit(f'{row.tableName}')) \
            .select('database', 'tablename', 'col_name'))
# display(final_df)

final_df.createOrReplaceTempView('req')
  • 创建一个视图,然后应用以下查询:
%sql

SELECT tablename, col_name FROM req WHERE col_name like '%id%' order by tablename, col_name

【图片链接】:https://i.stack.imgur.com/sn8d6.png

【图片链接】:https://i.stack.imgur.com/0JT8L.png

英文:
  • information_schema.columns is not supported in Databricks SQL. There are no in-built views available to get the complete details of tables along with columns. There is SHOW TABLES (database needs to be given) and SHOW COLUMNS (table name needs to be given).
  • You might have to use Pyspark capabilities to get the required result. First use the following code to get the details of all tables and respective columns:
db_tables = spark.sql(f"SHOW TABLES in default")
from pyspark.sql.functions import *


final_df = None
for row in db_tables.collect():
    if(final_df is None):
        final_df = spark.sql(f"DESCRIBE TABLE {row.database}.{row.tableName}")\
        .withColumn('database',lit(f'{row.database}'))\
        .withColumn('tablename',lit(f'{row.tableName}'))\
        .select('database','tablename','col_name')
    else:
        final_df = final_df.union(spark.sql(f"DESCRIBE TABLE {row.database}.{row.tableName}")\
        .withColumn('database',lit(f'{row.database}'))\
        .withColumn('tablename',lit(f'{row.tableName}'))\
        .select('database','tablename','col_name'))
#display(final_df)

final_df.createOrReplaceTempView('req')

如何在Azure Databricks中搜索以“XYZ”开头的所有列/字段名称。

<br>

  • Create a view and then apply the following query:
%sql

SELECT tablename,col_name FROM req WHERE col_name like &#39;%id%&#39; order by tablename, col_name

如何在Azure Databricks中搜索以“XYZ”开头的所有列/字段名称。

huangapple
  • 本文由 发表于 2023年2月14日 02:07:12
  • 转载请务必保留本文链接:https://go.coder-hub.com/75439671.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定