How do I search for all the columns/field names starting with "XYZ" in Azure Databricks
Question
I would like to do a big search on all field/column names that contain "XYZ".
I tried the SQL below, but it's giving me an error.
SELECT
    table_name,
    column_name
FROM information_schema.columns
WHERE column_name LIKE '%account%'
ORDER BY table_name, column_name
The error states: "Table or view not found: information_schema.columns; line 4, pos 5"
Answer 1
Score: 1
- information_schema.columns is not supported in Databricks SQL. There is no built-in view that returns the complete details of tables together with their columns; only SHOW TABLES (which requires a database) and SHOW COLUMNS (which requires a table name) are available.
- You might have to use PySpark to get the required result. First, use the following code to collect all tables and their columns:
 
from pyspark.sql.functions import lit

# List every table in the target database
db_tables = spark.sql("SHOW TABLES IN default")

# Build one DataFrame of (database, tablename, col_name) across all tables
final_df = None
for row in db_tables.collect():
    # DESCRIBE TABLE returns one row per column: col_name, data_type, comment
    described = spark.sql(f"DESCRIBE TABLE {row.database}.{row.tableName}") \
        .withColumn('database', lit(row.database)) \
        .withColumn('tablename', lit(row.tableName)) \
        .select('database', 'tablename', 'col_name')
    final_df = described if final_df is None else final_df.union(described)

# display(final_df)
final_df.createOrReplaceTempView('req')
- Create a view and then apply the following query:
 
%sql
SELECT tablename, col_name FROM req WHERE col_name LIKE '%id%' ORDER BY tablename, col_name
[Image link]: https://i.stack.imgur.com/sn8d6.png
[Image link]: https://i.stack.imgur.com/0JT8L.png
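The loop above only scans the `default` database. As a minimal sketch, it could be extended to every database visible to the session by iterating over SHOW DATABASES first (assuming a live SparkSession named `spark`, as in a Databricks notebook; the helper name `all_table_refs` is made up for illustration):

```python
# Sketch: enumerate (database, table) pairs across all databases, not just
# `default`. Assumes the `spark` object passed in behaves like a SparkSession.

def all_table_refs(spark):
    """Yield (database, table) pairs for every database visible to Spark."""
    for db in spark.sql("SHOW DATABASES").collect():
        # Spark 3.x names the result column `namespace`;
        # older versions used `databaseName`
        name = getattr(db, "namespace", None) or getattr(db, "databaseName")
        for row in spark.sql(f"SHOW TABLES IN {name}").collect():
            yield name, row.tableName
```

Each yielded pair can then be fed into the same DESCRIBE TABLE loop in place of the single `SHOW TABLES IN default` result.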