How to count the columns from tables in Databricks in SQL
Question
Please, how can I count the number of columns from tables in a SQL notebook in Databricks? The information_schema.columns is not working.
I was trying to run the query below:
select 'data_beat' Table_Name
,max(_createTime_) as Last_Create_Date
,max(_upt_) as Last_Update_Date
,(select count(*)
from information_schema.columns
where Table_name = 'data_beat'
) as Column_count
,(select count(*)
from data_lineage.field_dependencies
where source like 'ett.data_beat%'
) as Count_to_L1
from ett.data_beat
Expected Results are:
| Table_Name | Last_Update_Date | Last_Create_Date | Columns_Count | Count_to_L1 |
| -------- | -------- | -------- | -------- | -------- |
| ett.data_beat | 5/10/2023 | 5/10/2023 | 51 | 12 |
Answer 1
Score: 0
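I tried this in my environment and got the same error.
According to the Databricks documentation, and as far as I understand, INFORMATION_SCHEMA is only available in catalogs managed by Unity Catalog, and it only describes the tables in those catalogs. That is probably why INFORMATION_SCHEMA is not recognized here.
So make sure your tables are in a Unity Catalog catalog if you want to use INFORMATION_SCHEMA.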
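If the table does live in Unity Catalog, the count could be fetched roughly like this. It is only a sketch: the helper `column_count_query` and the catalog name `main` are hypothetical, and it assumes object names are stored in lowercase, which is how Unity Catalog records them as far as I know. The names are interpolated directly into the SQL text, so use it only with trusted identifiers.

```python
def column_count_query(catalog: str, schema: str, table: str) -> str:
    """Build a query that counts a table's columns via INFORMATION_SCHEMA.

    Hypothetical helper for illustration; it only builds the SQL text.
    """
    # Assumption: Unity Catalog stores identifiers in lowercase.
    catalog, schema, table = catalog.lower(), schema.lower(), table.lower()
    return (
        "SELECT count(*) AS column_count "
        f"FROM {catalog}.information_schema.columns "
        f"WHERE table_schema = '{schema}' AND table_name = '{table}'"
    )


if __name__ == "__main__":
    # 'main' is a placeholder catalog name, not taken from the question.
    print(column_count_query("main", "ett", "data_beat"))
```

In a notebook cell the result would be read with something like `spark.sql(column_count_query("main", "ett", "data_beat")).first()["column_count"]`, where `spark` is the session the notebook already provides. Filtering on `table_schema` as well as `table_name` avoids counting columns of same-named tables in other schemas.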
Another workaround is to run a little PySpark code between the SQL cells, store the column details in a temporary view, and count the rows of that view.
My sample:
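%python
# SHOW COLUMNS returns one row per column of the table.
df = spark.sql("SHOW COLUMNS FROM sample1")
# Expose the result to SQL cells as a temporary view.
df.createOrReplaceTempView("columns_sample1")
%sql
-- The row count of the view is the column count of sample1.
select count(*) as count from columns_sample1;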
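If only the number is needed on the Python side, a shorter variant might be to read the column list straight from the DataFrame. This is a minimal sketch, not part of the sample above: `count_columns` is a hypothetical helper, and `spark` is assumed to be the session that the Databricks notebook provides.

```python
def count_columns(spark, table_name: str) -> int:
    """Return the number of columns of a table (hypothetical helper).

    `spark` is expected to be a SparkSession; `spark.table(...)` returns a
    DataFrame whose `.columns` attribute is a list of column names.
    """
    return len(spark.table(table_name).columns)


# Usage inside a notebook cell (needs a live Spark session, so left as a comment):
# n = count_columns(spark, "sample1")
```

This only reads the table's schema rather than its rows, but unlike the temporary view it does not make the count available to a later SQL cell.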