执行在HIVE中的查询(多个/子查询)出现了”Vertex Failed”错误。

huangapple go评论69阅读模式
英文:

Executing query (multiple/ subquery) in HIVE is giving Vertex Failed

问题

我对HIVE和SQL查询有基本的了解。因此,我有多个单独的查询或者可以说子查询的查询。但是,如果我同时运行5-6个查询,它可以正常工作,但如果我运行整个查询(99个单独的查询),它会给我一个"Vertex Failed Error"错误。

查询:

Select "TABLE_NAME_1" as TBL_NAME, (Select count(*) from a_t1 where BDATE='2023-05-31') as a, (Select count(*) from b_t1 where BDATE='2023-05-31') as b, (Select count(*) from c_t1 where BDATE='2023-05-31') as c UNION ALL
Select "TABLE_NAME_2" as TBL_NAME, (Select count(*) from a_t2 where BDATE='2023-05-31') as a, (Select count(*) from b_t2 where BDATE='2023-05-31') as b, (Select count(*) from c_t2 where BDATE='2023-05-31') as c UNION ALL
Select "TABLE_NAME_3" as TBL_NAME, (Select count(*) from a_t3 where BDATE='2023-05-31') as a, (Select count(*) from b_t3 where BDATE='2023-05-31') as b, (Select count(*) from c_t3 where BDATE='2023-05-31') as c UNION ALL
Select "TABLE_NAME_4" as TBL_NAME, (Select count(*) from a_t4 where BDATE='2023-05-31') as a, (Select count(*) from b_t4 where BDATE='2023-05-31') as b, (Select count(*) from c_t4 where BDATE='2023-05-31') as c UNION ALL
Select "TABLE_NAME_5" as TBL_NAME, (Select count(*) from a_t5 where BDATE='2023-05-31') as a, (Select count(*) from b_t5 where BDATE='2023-05-31') as b, (Select count(*) from c_t5 where BDATE='2023-05-31') as c UNION ALL
Select "TABLE_NAME_6" as TBL_NAME, (Select count(*) from a_t6 where BDATE='2023-05-31') as a, (Select count(*) from b_t6 where BDATE='2023-05-31') as b, (Select count(*) from c_t6 where BDATE='2023-05-31') as c UNION ALL
.
.
.
.
.
Select "TABLE_NAME_33" as TBL_NAME, (Select count(*) from a_t33 where BDATE='2023-05-31') as a, (Select count(*) from b_t33 where BDATE='2023-05-31') as b, (Select count(*) from c_t33 where BDATE='2023-05-31') as c

不确定这是否是HIVE数据库的限制,或者我是否遗漏了什么。

注意:我还使用了以下设置:

set tez.queue.name = APPNAME;
set hive.execution.engine = tez;

已编辑
我想要的输出是:

TBL_NAME     a    b   c
TABLE_NAME1  10   20  15
TABLE_NAME2  10   20  15
TABLE_NAME3  10   20  15
.
.
.
TABLE_NAME33  10   20  15

谢谢。

英文:

I have a basic understanding of HIVE and SQL Query. So the query that i have has multiple individual query or you can say sub query. However, if i run 5-6 query together it is working fine but if i run the whole query (99 individual query) it is giving me Vertex Failed Error

Query:

Select "TABLE_NAME_2" as TBL_NAME,	(Select count(*)  from a_t2 where BDATE='2023-05-31') as  a,	(Select count(*)  from b_t2 where BDATE='2023-05-31') as  b,	(Select count(*)  from c_t2 where BDATE='2023-05-31') as  c 	UNION ALL
Select "TABLE_NAME_3" as TBL_NAME,	(Select count(*)  from a_t3 where BDATE='2023-05-31') as  a,	(Select count(*)  from b_t3 where BDATE='2023-05-31') as  b,	(Select count(*)  from c_t3 where BDATE='2023-05-31') as  c 	UNION ALL
Select "TABLE_NAME_4" as TBL_NAME,	(Select count(*)  from a_t4 where BDATE='2023-05-31') as  a,	(Select count(*)  from b_t4 where BDATE='2023-05-31') as  b,	(Select count(*)  from c_t4 where BDATE='2023-05-31') as  c 	UNION ALL
Select "TABLE_NAME_5" as TBL_NAME,	(Select count(*)  from a_t5 where BDATE='2023-05-31') as  a,	(Select count(*)  from b_t5 where BDATE='2023-05-31') as  b,	(Select count(*)  from c_t5 where BDATE='2023-05-31') as  c 	UNION ALL
Select "TABLE_NAME_6" as TBL_NAME,	(Select count(*)  from a_t6 where BDATE='2023-05-31') as  a,	(Select count(*)  from b_t6 where BDATE='2023-05-31') as  b,	(Select count(*)  from c_t6 where BDATE='2023-05-31') as  c 	UNION ALL
.
.
.
.
.
Select "TABLE_NAME_33" as TBL_NAME,	(Select count(*)  from a_t33 where BDATE='2023-05-31') as  a,	(Select count(*)  from b_t33 where BDATE='2023-05-31') as  b,	(Select count(*)  from c_t33 where BDATE='2023-05-31') as  c

Not sure if this is the limitation of HIVE DB or if i am missing something.

Note: I have also used

set tez.queue.name = APPNAME; 
set hive.execution.engine = tez; 

EDITED
Output i want is:

TBL_NAME     a    b   c
TABLE_NAME1  10   20  15
TABLE_NAME2  10   20  15
TABLE_NAME3  10   20  15
.
.
.
TABLE_NAME33  10   20  15

Thanks

答案1

得分: 2

我可以给你一些解决方案。就像 @nbk 说的那样,语法是错误的,这种内联 SQL 在 Hive 中不起作用。但这将是一个昂贵的解决方案 - 尝试在这个表中寻找一些分区,并查看是否可以统计分区行数。

以下是您可以尝试的内容 -

WITH 
TABLE_NAME_2_a as Select "TABLE_NAME_2" as TBL_NAME,  SUM(case when BDATE='2023-05-31' then 1 else 0 end) as  a from a_t2 ,
TABLE_NAME_2_b as Select "TABLE_NAME_2" as TBL_NAME,  SUM(case when BDATE='2023-05-31' then 1 else 0 end) as  a from b_t2 ,
TABLE_NAME_2_c as Select "TABLE_NAME_2" as TBL_NAME,  SUM(case when BDATE='2023-05-31' then 1 else 0 end) as  a from c_t2 ,
TABLE_NAME_3_a as Select "TABLE_NAME_3" as TBL_NAME,  SUM(case when BDATE='2023-05-31' then 1 else 0 end) as  a from a_t3 ,
TABLE_NAME_3_b as Select "TABLE_NAME_3" as TBL_NAME,  SUM(case when BDATE='2023-05-31' then 1 else 0 end) as  a from b_t3 ,
TABLE_NAME_3_c as Select "TABLE_NAME_3" as TBL_NAME,  SUM(case when BDATE='2023-05-31' then 1 else 0 end) as  a from c_t3 

--UNION for Table 2
select 
TABLE_NAME_2_a.TBL_NAME, TABLE_NAME_2_a.a, TABLE_NAME_2_b.b,TABLE_NAME_2_c.c
from 
TABLE_NAME_2_a,TABLE_NAME_2_b,TABLE_NAME_2_c 
where 
TABLE_NAME_2_a.TBL_NAME=TABLE_NAME_2_b.TBL_NAME and TABLE_NAME_2_a.TBL_NAME=TABLE_NAME_2_c.TBL_NAME 
union all 
--UNION for Table 3
select 
TABLE_NAME_3_a.TBL_NAME, TABLE_NAME_3_a.a, TABLE_NAME_3_b.b,TABLE_NAME_3_c.c
from       
TABLE_NAME_3_a,TABLE_NAME_3_b,TABLE_NAME_3_c 
where      
TABLE_NAME_3_a.TBL_NAME=TABLE_NAME_3_b.TBL_NAME and TABLE_NAME_3_a.TBL_NAME=TABLE_NAME_3_c.TBL_NAME 
英文:

i can give you some solution. Like @nbk said syntax is wrong and this kind of inline SQL doesnt wotk in hive. But this is going to be an expensive solution - seek for some partitions in this table and see if you can count partitioned rows.

Here is something you can try -

WITH 
TABLE_NAME_2_a as Select "TABLE_NAME_2" as TBL_NAME,  SUM(case when BDATE='2023-05-31' then 1 else 0 end) as  a from a_t2 ,
TABLE_NAME_2_b as Select "TABLE_NAME_2" as TBL_NAME,  SUM(case when BDATE='2023-05-31' then 1 else 0 end) as  a from b_t2 ,
TABLE_NAME_2_c as Select "TABLE_NAME_2" as TBL_NAME,  SUM(case when BDATE='2023-05-31' then 1 else 0 end) as  a from c_t2 ,
TABLE_NAME_3_a as Select "TABLE_NAME_3" as TBL_NAME,  SUM(case when BDATE='2023-05-31' then 1 else 0 end) as  a from a_t3 ,
TABLE_NAME_3_b as Select "TABLE_NAME_3" as TBL_NAME,  SUM(case when BDATE='2023-05-31' then 1 else 0 end) as  a from b_t3 ,
TABLE_NAME_3_c as Select "TABLE_NAME_3" as TBL_NAME,  SUM(case when BDATE='2023-05-31' then 1 else 0 end) as  a from c_t3 

--UNION for Table 2
select 
TABLE_NAME_2_a.TBL_NAME, TABLE_NAME_2_a.a, TABLE_NAME_2_b.b,TABLE_NAME_2_c.c
from 
TABLE_NAME_2_a,TABLE_NAME_2_b,TABLE_NAME_2_c 
where 
TABLE_NAME_2_a.TBL_NAME=TABLE_NAME_2_b.TBL_NAME and TABLE_NAME_2_a.TBL_NAME=TABLE_NAME_2_c.TBL_NAME 
union all 
--UNION for Table 3
select 
TABLE_NAME_3_a.TBL_NAME, TABLE_NAME_3_a.a, TABLE_NAME_3_b.b,TABLE_NAME_3_c.c
from       
TABLE_NAME_3_a,TABLE_NAME_3_b,TABLE_NAME_3_c 
where      
TABLE_NAME_3_a.TBL_NAME=TABLE_NAME_3_b.TBL_NAME and TABLE_NAME_3_a.TBL_NAME=TABLE_NAME_3_c.TBL_NAME 
 


</details>



huangapple
  • 本文由 发表于 2023年6月8日 23:54:50
  • 转载请务必保留本文链接:https://go.coder-hub.com/76433681.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定