PLSQL Execute Immediate动态聚合查询和批量收集

huangapple go评论116阅读模式
英文:

PLSQL Execute Immediate Dynamic Aggregate Query and Bulk Collect

问题

我需要提取表格中每列的人口数量(每个列中的计数)。我正在编写一个PLSQL脚本,该脚本将创建一个动态查询并打印结果。由于我的选择查询包括聚合列,我不确定如何定义TYPE。是否应使用游标或执行immediate和bulk collect。

DECLARE
   V_COLUMN_NAME ALL_TAB_COLS.COLUMN_NAME%type;
   V_SELECT_POPULATION_QUERY varchar2(2000) := 'SELECT ';
   V_TABLE_NAME varchar2(200) := 'EMPLOYEE'; // 将来将是表的列表
   CURSOR TAB_COLS IS SELECT COLUMN_NAME FROM ALL_TAB_COLS WHERE TABLE_NAME = V_TABLE_NAME;

BEGIN
   OPEN TAB_COLS;
   LOOP
      FETCH TAB_COLS INTO V_COLUMN_NAME;
      EXIT WHEN TAB_COLS%notfound;
      V_SELECT_POPULATION_QUERY := V_SELECT_POPULATION_QUERY || 'COUNT('|| V_COLUMN_NAME||'),';
   END LOOP;
  V_SELECT_POPULATION_QUERY:= SUBSTR(V_SELECT_POPULATION_QUERY, 0,LENGTH( V_SELECT_POPULATION_QUERY )-1);
  V_SELECT_POPULATION_QUERY:= V_SELECT_POPULATION_QUERY || ' FROM '|| V_TABLE_NAME; 
  

 //需要一种执行动态查询(变量V_SELECT_POPULATION_QUERY)并打印结果的方法。
 //示例动态查询1SELECT COUNT(EMPLOYEE_ID),COUNT(NAME),COUNT(BASE_URL),COUNT(INPUT_FORM),COUNT(BASE_URL_ASUCASE),COUNT(IS_CUSTOM) FROM EMPLOYEE
 //示例动态查询2SELECT COUNT(DEPT_ID),COUNT(NAME),COUNT(BASE_URL) FROM DEPT
END;
英文:

I need to fetch the column's population (count of each column in a table) for many tables. I'm writing a PLSQL script which will create a dynamic query and want to print the result. Since my select query includes aggregate columns, I'm unsure how to define TYPE. Whether to use a cursor or execute immediate & bulk collect.

DECLARE
   V_COLUMN_NAME ALL_TAB_COLS.COLUMN_NAME%type;
   V_SELECT_POPULATION_QUERY varchar2(2000) := 'SELECT ';
   V_TABLE_NAME varchar2(200) := 'EMPLOYEE'; // Will be list of tables in future
   CURSOR TAB_COLS IS SELECT COLUMN_NAME FROM ALL_TAB_COLS WHERE TABLE_NAME = V_TABLE_NAME;
  
BEGIN
   OPEN TAB_COLS;
   LOOP
	  FETCH TAB_COLS INTO V_COLUMN_NAME;
	  EXIT WHEN TAB_COLS%notfound;
	  V_SELECT_POPULATION_QUERY := V_SELECT_POPULATION_QUERY ||'COUNT('|| V_COLUMN_NAME||'),';
   END LOOP;
  V_SELECT_POPULATION_QUERY:= SUBSTR(V_SELECT_POPULATION_QUERY, '0',LENGTH( V_SELECT_POPULATION_QUERY )-1);
  V_SELECT_POPULATION_QUERY:= V_SELECT_POPULATION_QUERY || ' FROM '|| V_TABLE_NAME; 
  

 //NEED A WAY TO EXECUTE DYNAMIC QUERY (variable V_SELECT_POPULATION_QUERY) and print the results.
 //SAMPLE DYNAMIC QUERY1 : SELECT COUNT(EMPLOYEE_ID),COUNT(NAME),COUNT(BASE_URL),COUNT(INPUT_FORM),COUNT(BASE_URL_ASUCASE),COUNT(IS_CUSTOM) FROM EMPLOYEE 
 //SAMPLE DYNAMIC QUERY 2: SELECT COUNT(DEPT_ID),COUNT(NAME),COUNT(BASE_URL) FROM DEPT
END;

答案1

得分: 0

这里是一个选项。

首先创建类型:

SQL> create or replace type t_row as object (table_name varchar2(30), column_name varchar2(30), cnt number);
2 /

类型已创建。

SQL> create or replace type t_tab as table of t_row;
2 /

类型已创建。

函数接受表名作为参数;然后在循环中计算该表中的每个列,将结果存储在一个返回值为return的集合中:

SQL> create or replace function f_test (par_table_name in varchar2)
2 return t_tab
3 is
4 l_str varchar2(1000);
5 l_cnt number;
6 l_tab t_tab := t_tab();
7 begin
8 for cur_r in (select table_name, column_name, column_id
9 from user_tab_columns
10 where table_name = dbms_assert.sql_object_name(par_table_name)
11 )
12 loop
13 l_str := 'select count(' || cur_r.column_name || ') from ' ||cur_r.table_name;
14 execute immediate l_str into l_cnt;
15
16 l_tab.extend;
17 l_tab(cur_r.column_id) := t_row(cur_r.table_name, cur_r.column_name, l_cnt);
18 end loop;
19
20 return l_tab;
21 end;
22 /

函数已创建。

测试:

SQL> select * from table(f_test('EMP'));

TABLE_NAME COLUMN_NAME CNT


EMP EMPNO 14
EMP ENAME 14
EMP JOB 14
EMP MGR 13
EMP HIREDATE 14
EMP SAL 14
EMP COMM 4
EMP DEPTNO 14

8 rows selected.

SQL>;


你在代码中的注释中提到将来会使用“表的列表”。好的,函数仍然只接受一个参数,但是你可以传递任意数量的表。

例如:

SQL> select f.*
2 from user_tables t cross join table(f_test(t.table_name)) f
3 where t.table_name in ('DEPT', 'ORDERS');

TABLE_NAME COLUMN_NAME CNT


DEPT DEPTNO 4
DEPT DNAME 4
DEPT LOC 4
ORDERS STATE 3

SQL>;

英文:

Here's one option.

Create types first:

SQL> create or replace type t_row as object (table_name varchar2(30), column_name varchar2(30), cnt number);
  2  /

Type created.

SQL> create or replace type t_tab as table of t_row;
  2  /

Type created.

Function accepts table name as a parameter; then - in a loop - counts each column in that table, stores result into a collection which is return value:

SQL> create or replace function f_test (par_table_name in varchar2)
  2    return t_tab
  3  is
  4    l_str varchar2(1000);
  5    l_cnt number;
  6    l_tab t_tab := t_tab();
  7  begin
  8    for cur_r in (select table_name, column_name, column_id
  9                  from user_tab_columns
 10                  where table_name = dbms_assert.sql_object_name(par_table_name)
 11                 )
 12    loop
 13      l_str := 'select count(' || cur_r.column_name || ') from ' ||cur_r.table_name;
 14      execute immediate l_str into l_cnt;
 15
 16      l_tab.extend;
 17      l_tab(cur_r.column_id) := t_row(cur_r.table_name, cur_r.column_name, l_cnt);
 18    end loop;
 19
 20    return l_tab;
 21  end;
 22  /

Function created.

Testing:

SQL> select * from table(f_test('EMP'));

TABLE_NAME                     COLUMN_NAME                           CNT
------------------------------ ------------------------------ ----------
EMP                            EMPNO                                  14
EMP                            ENAME                                  14
EMP                            JOB                                    14
EMP                            MGR                                    13
EMP                            HIREDATE                               14
EMP                            SAL                                    14
EMP                            COMM                                    4
EMP                            DEPTNO                                 14

8 rows selected.

SQL>

You said (as comment in code you posted) that you'll use "list of tables" in the future. OK; function still accepts only one parameter, but it is you who should pass as many tables as you want.

For example:

SQL> select f.*
  2  from user_tables t cross join table(f_test(t.table_name)) f
  3  where t.table_name in ('DEPT', 'ORDERS');

TABLE_NAME                     COLUMN_NAME                           CNT
------------------------------ ------------------------------ ----------
DEPT                           DEPTNO                                  4
DEPT                           DNAME                                   4
DEPT                           LOC                                     4
ORDERS                         STATE                                   3

SQL>

答案2

得分: 0

以下是您要翻译的内容:

"The problem you're going to have with this is the varying column list you want returned each time. So let's say you fix your immediate code concern with EXECUTE IMMEDIATE:

EXECUTE IMMEDIATE V_SELECT_POPULATION_QUERY INTO var_employee_id,var_name,var_base_url,var_input_form,var_base_url_asucase,var_is_custom;

But then for the next one, you would need a different call:

EXECUTE IMMEDIATE V_SELECT_POPULATION_QUERY INTO var_dept_id,var_name,var_base_url;

Clearly you can't do this in a loop, since while the SQL statement itself is dynamic, the variables you fetch into are not. Reused dynamic SQL is expected to return the same structure in all cases, but you are changing the structure each time.

The most straight forward way to fix this is to query each column separately rather than trying to do all the columns in the table at once. That way, you select every column into a single number variable.

EXECUTE IMMEDIATE V_SELECT_POPULATION_QUERY INTO var_column_count; -- one column only

But let's say that is a performance burden to you. If you really must combine all the columns in a single pass, then you have two approaches.

  1. Determine the maximum number of columns you will ever have to query (hundreds, perhaps?). Create a record variable with that many number columns. While assembling your COUNT(column_name) list, count how many of these you append and then append dummy NULL columns however many times is needed to fill out the number of columns in you record type.

OR

  1. Use a double nested dynamic SQL approach, where the entire PL/SQL block containing the dynamic SQL is itself dynamic SQL. Obviously you need the results to do something, so you'd have to process the results inside that block. Let's say you just need to enter them into a table.

     create table my_results (column_name varchar2(128),count_result integer);
     /
     DECLARE
       var_column_list varchar2(32767);
       var_variable_list varchar2(32767);
       var_plsql_block varchar2(32767);
       TYPE columns_tabtype IS TABLE OF varchar2(128);
       tab_columns columns_tabtype := columns_tabtype();
       var_table_name varchar2(128) := 'EMPLOYEE';
     BEGIN
       FOR rec_column IN (SELECT column_name,ROWNUM seq
                            FROM user_tab_columns
                           WHERE table_Name = var_table_name)
       LOOP
         var_column_list := var_column_list || 'COUNT('||rec_column.column_name||'),';
         var_variable_list := var_variable_list || 'C'||rec_column.seq||',';
         tab_columns.EXTEND;
         tab_columns(tab_columns.LAST) := rec_column.column_name;
       END LOOP;
    
       var_plsql_block := '
         DECLARE
           '||REPLACE(var_variable_list, ',', ' number;'||CHR(10))||'
         BEGIN
           EXECUTE IMMEDIATE ''SELECT '||RTRIM(var_column_list,',')||' FROM '||var_table_name||''' INTO '||RTRIM(var_variable_list,',')||';';
    
           FOR j IN tab_columns.FIRST .. tab_columns.LAST
           LOOP
             var_plsql_block := var_plsql_block ||
               'INSERT INTO my_results (column_name,count_result) VALUES ('''||tab_columns(j)||''',c'||j||');';
           END LOOP;
    
           var_plsql_block := var_plsql_block ||'
         END;
       ';
    
       --dbms_output.put_line(var_plsql_block); -- enable for debugging
       EXECUTE IMMEDIATE var_plsql_block;
     END;
     /
    
     SELECT *
       FROM my_results
    

Having answered your question, I would end by asking whether you are approaching this problem correctly. You are essentially counting the number of non-NULL rows for each column. If that's all you need, then simply gather statistics on the table and you have your answers in user_tab_columns (the row_count from user_tables minus the num_nulls from user_tab_columns). That's a lot easier than you doing it yourself.

If you are instead trying to get the number of distinct values, then you'd have to add DISTINCT to each of your COUNT() functions, but again, it is easier to get that from num_distinct in user_tab_columns than to gather this info yourself. I've never in all my years had to do what you're trying to do, so maybe this is a problem you don't really need to solve?"

英文:

The problem you're going to have with this is the varying column list you want returned each time. So let's say you fix your immediate code concern with EXECUTE IMMEDIATE:

EXECUTE IMMEDIATE V_SELECT_POPULATION_QUERY INTO var_employee_id,var_name,var_base_url,var_input_form,var_base_url_asucase,var_is_custom;

But then for the next one, you would need a different call:

EXECUTE IMMEDIATE V_SELECT_POPULATION_QUERY INTO var_dept_id,var_name,var_base_url;

Clearly you can't do this in a loop, since while the SQL statement itself is dynamic, the variables you fetch into are not. Reused dynamic SQL is expected to return the same structure in all cases, but you are changing the structure each time.

The most straight forward way to fix this is to query each column separately rather than trying to do all the columns in the table at once. That way, you select every column into a single number variable.

EXECUTE IMMEDIATE V_SELECT_POPULATION_QUERY INTO var_column_count; -- one column only

But let's say that is a performance burden to you. If you really must combine all the columns in a single pass, then you have two approaches.

  1. Determine the maximum number of columns you will ever have to query (hundreds, perhaps?). Create a record variable with that many number columns. While assembling your COUNT(column_name) list, count how many of these you append and then append dummy NULL columns however many times is needed to fill out the number of columns in you record type.

OR

  1. Use a double nested dynamic SQL approach, where the entire PL/SQL block containing the dynamic SQL is itself dynamic SQL. Obviously you need the results to do something, so you'd have to process the results inside that block. Let's say you just need to enter them into a table.

     create table my_results (column_name varchar2(128),count_result integer);
     /
     DECLARE
       var_column_list varchar2(32767);
       var_variable_list varchar2(32767);
       var_plsql_block varchar2(32767);
       TYPE columns_tabtype IS TABLE OF varchar2(128);
       tab_columns columns_tabtype := columns_tabtype();
       var_table_name varchar2(128) := 'EMPLOYEE';
     BEGIN
       FOR rec_column IN (SELECT column_name,ROWNUM seq
                            FROM user_tab_columns
                           WHERE table_Name = var_table_name)
       LOOP
         var_column_list := var_column_list || 'COUNT('||rec_column.column_name||'),';
         var_variable_list := var_variable_list || 'C'||rec_column.seq||',';
         tab_columns.EXTEND;
         tab_columns(tab_columns.LAST) := rec_column.column_name;
       END LOOP;
    
       var_plsql_block := '
         DECLARE
           '||REPLACE(var_variable_list,',',' number;'||CHR(10))||'
         BEGIN
           EXECUTE IMMEDIATE ''SELECT '||RTRIM(var_column_list,',')||' FROM '||var_table_name||''' INTO '||RTRIM(var_variable_list,',')||';';
    
           FOR j IN tab_columns.FIRST .. tab_columns.LAST
           LOOP
             var_plsql_block := var_plsql_block ||'
               INSERT INTO my_results (column_name,count_result) VALUES ('''||tab_columns(j)||''',c'||j||');';
           END LOOP;
    
           var_plsql_block := var_plsql_block ||'
         END;
       ';
    
       --dbms_output.put_line(var_plsql_block); -- enable for debugging
       EXECUTE IMMEDIATE var_plsql_block;
     END;
     /
    
     SELECT *
       FROM my_results
    

Having answered your question, I would end by asking whether you are approaching this problem correctly. You are essentially counting the number of non-NULL rows for each column. If that's all you need, then simply gather statistics on the table and you have your answers in user_tab_columns (the row_count from user_tables minus the num_nulls from user_tab_columns). That's a lot easier than you doing it yourself.

If you are instead trying to get the number of distinct values, then you'd have to add DISTINCT to each of your COUNT() functions, but again, it is easier to get that from num_distinct in user_tab_columns than to gather this info yourself. I've never in all my years had to do what you're trying to do, so maybe this is a problem you don't really need to solve?

答案3

得分: 0

为了统计表中的每列,您无需任何PL/SQL脚本。Oracle已经在对象统计信息中为您进行了计数 - 要么使用抽样近似值(对于大型表来说完全可以接受),要么精确扫描整个表

从技术上讲,您需要表的行数,以及每个列中的NULL值的数量,然后将它们相减。

小例子

create table tab as 
select 1 col1, 0 col2, 'X' col3 from dual union all
select 1, null, 'Y' from dual union all
select 3, null, null from dual;

select count(col1), count(col2), count(col3) from tab;
COUNT(COL1) COUNT(COL2) COUNT(COL3)
----------- ----------- -----------
          3         1         2

使用100%的精确(无抽样)方式收集统计信息:

BEGIN 
   dbms_stats.gather_table_stats(ownname=>user, tabname=>'TAB', estimate_percent=>100);
END;
/

现在,使用数据字典视图来获取表NUM_ROWS和列NUM_NULLS,然后相减以获得结果。

select t.TABLE_NAME, t.NUM_ROWS, c.COLUMN_NAME, c.NUM_DISTINCT, c.NUM_NULLS, t.NUM_ROWS -  c.NUM_NULLS CNT_NOT_NULL
from user_tab_columns c
join user_tables t on t.table_name = c.table_name
where t.table_name = 'TAB';

TABLE_NAME NUM_ROWS COLUMN_NAME NUM_DISTINCT NUM_NULLS CNT_NOT_NULL
---------- -------- ----------- ------------ --------- ------------
TAB              3 COL1                    2         0              3
TAB              3 COL2                    1         2              1
TAB              3 COL3                    2         1              2

请注意,作为奖励,您还会获得每个列的不同值数量。您可以使用抽样来以更短的时间获得估算值,或者您可以启用并行选项来收集统计信息。

英文:

TO count each column in a table you do not need any PL/SQL script. Oracle is counting it for you already in the object statistics - either approximately with sampling - which can be perferctly OK for large tables or exactly with a scan of the whole table.

Technically you need the row count of a table and for each column the number of NULL values that you will subtract.

Small Example

create table tab as 
select 1 col1, 0 col2, 'X' col3 from dual union all
select 1,null, 'Y' from dual union all
select 3, null,null from dual;

select count(col1), count(col2), count(col3) from tab;
COUNT(COL1) COUNT(COL2) COUNT(COL3)
----------- ----------- -----------
          3           1           2

Gather statistics with 100% percent (exact, no sampling)

BEGIN 
   dbms_stats.gather_table_stats(ownname=>user, tabname=>'TAB',estimate_percent => 100);
END;
/ 

Now use the dictionary view to get the table NUM_ROWS and columns NUM_NULLS and subtract them to get the result.

select t.TABLE_NAME, t.NUM_ROWS, c.COLUMN_NAME, c.NUM_DISTINCT, c.NUM_NULLS, t.NUM_ROWS -  c.NUM_NULLS CNT_NOT_NULL
from user_tab_columns c
join user_tables t on t.table_name = c.table_name
where t.table_name = 'TAB';

TABLE_NAME   NUM_ROWS COLUMN_NAM NUM_DISTINCT  NUM_NULLS CNT_NOT_NULL
---------- ---------- ---------- ------------ ---------- ------------
TAB                 3 COL1                  2          0            3
TAB                 3 COL2                  1          2            1
TAB                 3 COL3                  2          1            2

Note that as a bonus you get also the number of distinct values of each column. You may use sampling to get estimete values in less time or you may activate a parallel option to gather statistics.

huangapple
  • 本文由 发表于 2023年2月27日 17:37:40
  • 转载请务必保留本文链接:https://go.coder-hub.com/75578789.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定