fastest way to read and iterate over the rows of a table, and read/call a function with corresponding arguments and raw data from each row?

huangapple go评论75阅读模式
英文:

fastest way to read and iterate over the rows of a table, and read/call a function with corresponding arguments and raw data from each row?

问题

我有两张表格。

第一张表格是一个名为tab的表格,其中包含了function_name和args列,如下所示:

tab:flip `function`root`args!(`symbol$();();());
insert[`tab](`.qa.test1;"ES";`startDate`endDate`roots`rollPerct`expDateThreshold`expDateThresholdExpiry!(2010.01.01d;2010.12.31;`ES;0.8;10;1));
insert[`tab](`.qa.test1;"NQ;SI";`startDate`endDate`roots`rollPerct`expDateThreshold`expDateThresholdExpiry!(2010.01.01d;2013.12.31;`NQ`SI;0.8;20;1));
insert[`tab](`.qa.test1;"ALL";`startDate`endDate`roots`rollPerct`expDateThreshold`expDateThresholdExpiry!(2010.01.01d;2013.12.31;`HQ`ZK`NA`PL`BZ;0.8;12;2));

第二张表格是一个名为raw的原始数据表格,如下所示:

//创建示例数据
raw:([]date:`date$();sym:`symbol$();name:`symbol$();expiration_date:`date$());
//修改数据以模拟要求
{`raw insert (x;`ESH10;`someName1;2010.01.10)} each (2010.01.01 + til 31);
{`raw insert (x;`ESH11;`someName2;2010.01.15)} each (2010.01.01 + til 31);
{`raw insert (x;`ESH12;`someName3;2010.01.20)} each (2010.01.01 + til 31);
{`raw insert (x;`ESH13;`someName4;2010.01.16)} each (2010.01.01 + til 31);
{`raw insert (x;`ESH14;`someName5;2010.01.13)} each (2010.01.01 + til 31);
{`raw insert (x;`ESH15;`someName6;2010.01.28)} each (2010.01.01 + til 31);

现在,我想要迭代表格tab的每一行,调用function列中编写的函数,同时将args和raw表格作为输入。类似于以下方式:

.qa.test1[(`startDate`endDate`roots`rollPerct`expDateThreshold`expDateThresholdExpiry!(2010.01.01d;2010.12.31d;`ES;0.8;10;1));raw]
.qa.test1[(`startDate`endDate`roots`rollPerct`expDateThreshold`expDateThresholdExpiry!(2010.01.01d;2013.12.31d;`NQ`SI;0.8;20;1));raw]
.qa.test1[(`startDate`endDate`roots`rollPerct`expDateThreshold`expDateThresholdExpiry!(2010.01.01d;2013.12.31d;`HQ`ZK`NA`PL`BZ;0.8;12;2));raw]

我提出的解决方案如下,虽然能够工作,但速度很慢。是否有更好的方法在kdb+中完成这个任务?

.qa.get_tab_main:{[raw]
    // raw是具有原始数据的表格,.qa.test1接受两个参数.qa.test1[args;raw]
    // 遍历表格行并将结果追加到tab表格中
    cnt:(count tab);
    ino:0;
    res1::flip `date`sym`name`expiration_date!(`date$();`symbol$();`symbol$();`date$()); //从.qa.test1得到最终输出表的元数据
    while[ino<cnt;
     data:exec .[first function;(args;raw)] from tab where i=ino;
     upsert[`tab;(select from data)];
     ino:ino+1
     ];
    res:`date xasc res1;
    delete res1 from `.;
    res
};

请注意,我处理的数据非常庞大,包含8000000行。

【翻译完毕】

英文:

I have 2 tables with me.

1st table is tab having funtion_name and args column like below

fastest way to read and iterate over the rows of a table, and read/call a function with corresponding arguments and raw data from each row?

tab:flip `function`root`args!(`symbol$();();());
insert[`tab](`.qa.test1;&quot;ES&quot;;`startDate`endDate`roots`rollPerct`expDateThreshold`expDateThresholdExpiry!(2010.01.01d;2010.12.31;`ES;0.8;10;1));
insert[`tab](`.qa.test1;&quot;NQ;SI&quot;;`startDate`endDate`roots`rollPerct`expDateThreshold`expDateThresholdExpiry!(2010.01.01d;2013.12.31;`NQ`SI;0.8;20;1));
insert[`tab](`.qa.test1;&quot;ALL&quot;;`startDate`endDate`roots`rollPerct`expDateThreshold`expDateThresholdExpiry!(2010.01.01d;2013.12.31;`HQ`ZK`NA`PL`BZ;0.8;12;2));

2nd table is raw having raw data like below

//create sample data
raw:([]date:`date$();sym:`symbol$();name:`symbol$();expiration_date:`date$());
//modify data to simulate requirement 
{`raw insert (x;`ESH10;`someName1;2010.01.10)} each (2010.01.01 + til 31);
{`raw insert (x;`ESH11;`someName2;2010.01.15)} each (2010.01.01 + til 31);
{`raw insert (x;`ESH12;`someName3;2010.01.20)} each (2010.01.01 + til 31);
{`raw insert (x;`ESH13;`someName4;2010.01.16)} each (2010.01.01 + til 31);
{`raw insert (x;`ESH14;`someName5;2010.01.13)} each (2010.01.01 + til 31);
{`raw insert (x;`ESH15;`someName6;2010.01.28)} each (2010.01.01 + til 31);

Now I want Iterate each row of table tab to call the function written in the function column along with the args + raw table as input
some like

.qa.test1[(`startDate`endDate`roots`rollPerct`expDateThreshold`expDateThresholdExpiry!(2010.01.01d;2010.12.31d;`ES;0.8;10;1));raw]
.qa.test1[(`startDate`endDate`roots`rollPerct`expDateThreshold`expDateThresholdExpiry!(2010.01.01d;2013.12.31d;`NQ`SI;0.8;20;1));raw]
.qa.test1[(`startDate`endDate`roots`rollPerct`expDateThreshold`expDateThresholdExpiry!(2010.01.01d;2013.12.31d;`HQ`ZK`NA`PL`BZ;0.8;12;2));raw]

The solution I came up with is below, its working but its very slow.

Is there a Better Way of doing this in kdb+?

Note: The data I am dealing with is huge having 8000000 rows.

//solution which is slow
.qa.get_tab_main:{[raw]
//raw is the table having raw data, .qa.test1 accepts 2 args .qa.test1[args;raw] 
    //iterate over the table rows and append the result to tab table
    cnt:(count tab);
    ino:0;
    res1::flip `date`sym`name`expiration_date!(`date$();`symbol$();`symbol$();`date$()); //meta of final output table from .qa.test1
    while[ino&lt;cnt;
     data:exec .[first function;(args;raw)] from tab where i=ino;
     upsert[`tab;(select from data)];
     ino:ino+1
     ];
    res:`date xasc res1;
    delete res1 from `.;
    res
    };

答案1

得分: 2

q){x[`function] .(x[`args];raw)}each tab

将结果捕获到一个单一的表格中:

q)res:raze {x[`function] .(x[`args];raw)} each tab

根据函数内部涉及的逻辑,您可以尝试使用peach来实现多核加速。

https://code.kx.com/q/basics/peach/


<details>
<summary>英文:</summary>

https://code.kx.com/q/ref/apply/

```q
q){x[`function] .(x[`args];raw)}each tab

To capture results in a single table:

q)res:raze {x[`function] .(x[`args];raw)} each tab

Depending on logic involved inside your functions you may be able to use peach for multicore speedup.

https://code.kx.com/q/basics/peach/

答案2

得分: 1

使用qsql:

q)从tab中执行exec value each(function,'(enlist each args),\:enlist raw) from tab

如果您想要单个表输出,可以使用raze

英文:

Using qsql:

q)exec value each(function,&#39;(enlist each args),\:enlist raw) from tab

You can raze the result if you want a single table output.

huangapple
  • 本文由 发表于 2023年6月16日 00:40:47
  • 转载请务必保留本文链接:https://go.coder-hub.com/76483818.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定