How do I search for all the columns/field names starting with "XYZ" in Azure Databricks
Question

I would like to do a big search on all field/column names that contain "XYZ".
I tried the SQL below, but it gives me an error.
SELECT
    table_name,
    column_name
FROM information_schema.columns
WHERE column_name LIKE '%XYZ%'
ORDER BY table_name, column_name;
The error states: "Table or view not found: information_schema.columns; line 4, pos 5".
Answer 1
Score: 1
- information_schema.columns is not supported in Databricks SQL. There is no built-in view that returns the complete details of tables together with their columns. What is available is SHOW TABLES (a database must be given) and SHOW COLUMNS (a table name must be given).
- You may have to use PySpark to get the required result. First, use the following code to get the details of all tables and their respective columns:
from pyspark.sql.functions import lit

# List all tables in the target database, then DESCRIBE each one and
# union the results into a single (database, tablename, col_name) DataFrame.
db_tables = spark.sql("SHOW TABLES IN default")

final_df = None
for row in db_tables.collect():
    cols_df = spark.sql(f"DESCRIBE TABLE {row.database}.{row.tableName}") \
        .withColumn('database', lit(row.database)) \
        .withColumn('tablename', lit(row.tableName)) \
        .select('database', 'tablename', 'col_name')
    final_df = cols_df if final_df is None else final_df.union(cols_df)

# display(final_df)
final_df.createOrReplaceTempView('req')
- This registers a temporary view; now apply the following query against it:
%sql
SELECT tablename, col_name FROM req WHERE col_name LIKE '%id%' ORDER BY tablename, col_name
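If looping over DESCRIBE TABLE feels heavyweight, the same enumeration can also be done through PySpark's spark.catalog API. The sketch below is an alternative, not the answer's method: the helper name find_columns and the plain substring filter are illustrative, and a live SparkSession must be passed in.

```python
def find_columns(spark, database: str, needle: str):
    """Return sorted (table, column) pairs in `database` whose column
    name contains `needle`. `spark` is an active SparkSession."""
    hits = []
    for tbl in spark.catalog.listTables(database):
        for col in spark.catalog.listColumns(tbl.name, database):
            if needle in col.name:
                hits.append((tbl.name, col.name))
    return sorted(hits)
```

Usage would look like find_columns(spark, "default", "XYZ"). Note this filter is case-sensitive, matching the default behavior of LIKE in Spark SQL.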
[Image link]: https://i.stack.imgur.com/sn8d6.png
[Image link]: https://i.stack.imgur.com/0JT8L.png
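As a side note, the LIKE '%id%' predicate in the final query is plain pattern matching. If the column list has already been collected to the driver, the same filter can be emulated in pure Python. This is a minimal sketch under assumptions: the helper sql_like is illustrative, supports only the % and _ wildcards, and relies on Python 3.7+ re.escape behavior.

```python
import re

def sql_like(pattern: str, value: str) -> bool:
    """Emulate SQL LIKE: % matches any run of characters, _ matches one."""
    regex = "^" + re.escape(pattern).replace("%", ".*").replace("_", ".") + "$"
    return re.match(regex, value) is not None

# Filter collected (table, column) pairs the way LIKE '%XYZ%' would.
columns = [("orders", "XYZ_id"), ("orders", "amount"), ("users", "my_XYZ")]
matches = [(t, c) for t, c in columns if sql_like("%XYZ%", c)]
```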
Comments