从另一个字符串列中获取数字,创建单独的列。

huangapple go评论65阅读模式
英文:

create separate column by getting Numbers from the another string column in kdb+q

问题

我是kdb新手,想请教以下问题。假设我有一个表格:

t:([] price:100 200; Text:("Selling price 100 and purchase price 200 threshold 250"; "Selling price 200 and purchase price 300"))

从文本中"Selling price 100 and purchase price 200 threshold is 250",我想要创建两个新的列,一个是销售价格(Selling price),另一个是购买价格(Purchase price)。

我一直在尝试首先处理字符串,我可以得到索引,但在正确获取数字方面遇到了困难。不确定这是否是一个高效的解决方案。

b:"Selling price 100 and purchase price 200 threshold 250";

q)" " vs string (first ss[b;"Selling price "])+count "Selling price "

q)"14"

从另一个字符串列中获取数字,创建单独的列。

英文:

Hi I'm new to kdb and I wanted help with below question
Suppose I have a table

t:([] price:100 200; Text:("Selling price 100 and purchase price 200 threshold 250";"Selling price 200 and purchase price 300"))

from the text "Selling price 100 and purchase price 200 threshold is 250"
I want to have two new columns with selling price and purchase price.

从另一个字符串列中获取数字,创建单独的列。

I have been trying to work on strings first and I could get the index but having trouble with getting numbers correctly. not sure if that would be an efficient solution either.

b:"Selling price 100 and purchase price 200 threshold 250"

q)" " vs string (first ss[b;"Selling price "])+count "Selling price "

q)"14"

答案1

得分: 2

假设Text列的结构始终相同:

q)t,'flip`SellingPrice`PurchasePrice!("  J   J";" ")0:t`Text
price Text                                                     SellingPrice PurchasePrice
-----------------------------------------------------------------------------------------
100   "销售价格 100 和采购价格 200 阈值 250" 100          200
200   "销售价格 200 和采购价格 300"               200          300
英文:

Assuming the Text column will always have the same structure:

q)t,'flip`SellingPrice`PurchasePrice!("  J   J";" ")0:t`Text
price Text                                                     SellingPrice PurchasePrice
-----------------------------------------------------------------------------------------
100   "Selling price 100 and purchase price 200 threshold 250" 100          200
200   "Selling price 200 and purchase price 300"               200          300

答案2

得分: 1

基本版本,假设每段文本中的顺序都相同:

```q
t,'exec {`SellingPrice`PurchasePrice!2#{x where not null x}"J"$" " vs @[x;where not x in .Q.n;:;" "]} each Text from t
price Text                                                     SellingPrice PurchasePrice
-----------------------------------------------------------------------------------------
100   "Selling price 100 and purchase price 200 threshold 250" 100          200
200   "Selling price 200 and purchase price 300"               200          300

<details>
<summary>英文:</summary>

Basic version assuming the order is the same in each piece of text:

```q
t,&#39;exec {`SellingPrice`PurchasePrice!2#{x where not null x}&quot;J&quot;$&quot; &quot; vs @[x;where not x in .Q.n;:;&quot; &quot;]} each Text from t
price Text                                                     SellingPrice PurchasePrice
-----------------------------------------------------------------------------------------
100   &quot;Selling price 100 and purchase price 200 threshold 250&quot; 100          200
200   &quot;Selling price 200 and purchase price 300&quot;               200          300

答案3

得分: 1

你可以通过创建一个函数来进行一般化,该函数接受一个字符串和一个字符串模式列表,然后返回在字符串中出现任何模式后的第一个单词。

q) findFirstWordAfterPatterns: { [txt; patterns] : { first " " vs (1 + count[string y `pattern] + y `indexes) _ x }[txt;] each ungroup flip `pattern`indexes! flip  (`$patterns) ,' enlist each ss[txt;] each patterns; };
q) findFirstWordAfterPatterns["foo bar hello world foo bar"; ("foo"; "hello")]
"bar"
"bar"
"world"

然后,你可以在表上运行一个更新命令,通过调用这个函数对Text列添加两个新列:

q)buyPatterns: ("purchase price"; "buy price");
q)sellPatterns: enlist "Selling price"
q) t:([] price:100 200 300; Text:("Selling price 100 and purchase price 200 threshold 250";"Selling price 200 and purchase price 300"; "blah blah blah purchase price 100 blah blah blah buy price 200 blah blah blah purchase price 300 blah blah blah"))
q)update SellingPrices:findFirstWordAfterPatterns[; sellPatterns]'[Text], PurchasePrices:findFirstWordAfterPatterns[; buyPatterns]'[Text] from t
price Text                                                                                                              SellingPrices PurchasePrices
---------------------------------------------------------------------------------------------------------------------------------------------------------
100   "Selling price 100 and purchase price 200 threshold 250"                                                          ,"100"        ,"200"
200   "Selling price 200 and purchase price 300"                                                                        ,"200"        ,"300"
300   "blah blah blah purchase price 100 blah blah blah buy price 200 blah blah blah purchase price 300 blah blah blah" ()            ("100";"300";"200")

如果你只想要第一个出现的情况,并且要将它们从字符串转换为数字,你可以使用first each"F"$来调整更新语句(你也可以使用"I"$来转换为整数):

q)update SellingPrices:first each "F"$findFirstWordAfterPatterns[; sellPatterns]'[Text], PurchasePrices:first each "F"$findFirstWordAfterPatterns[; buyPatterns]'[Text] from t
price Text                                                                                                              SellingPrices PurchasePrices
----------------------------------------------------------------------------------------------------------------------------------------------------
100   "Selling price 100 and purchase price 200 threshold 250"                                                          100           200
200   "Selling price 200 and purchase price 300"                                                                        200           300
300   "blah blah blah purchase price 100 blah blah blah buy price 200 blah blah blah purchase price 300 blah blah blah"               100
英文:

you can generalize this a bit by creating a function that takes a string and a list of string patterns and returns the first word following the occurrence of any of the patterns in the string

q) findFirstWordAfterPatterns: { [txt; patterns] : { first &quot; &quot; vs (1 + count[string y `pattern] + y `indexes) _ x }[txt;] each ungroup flip `pattern`indexes! flip  (`$patterns) ,&#39; enlist each ss[txt;] each patterns; };
q) findFirstWordAfterPatterns[&quot;foo bar hello world foo bar&quot;; (&quot;foo&quot;; &quot;hello&quot;)]
&quot;bar&quot;
&quot;bar&quot;
&quot;world&quot;

you can then run an update command on your table to add two new columns that are populated by calling this function against the Text column:

q)buyPatterns:  (&quot;purchase price&quot;; &quot;buy price&quot;);
q)sellPatterns: enlist &quot;Selling price&quot;
q) t:([] price:100 200 300; Text:(&quot;Selling price 100 and purchase price 200 threshold 250&quot;;&quot;Selling price 200 and purchase price 300&quot;; &quot;blah blah blah purchase price 100 blah blah blah buy price 200 blah blah blah purchase price 300 blah blah blah&quot;))
q)update SellingPrices:findFirstWordAfterPatterns[; sellPatterns]&#39;[Text],    PurchasePrices:findFirstWordAfterPatterns[; buyPatterns]&#39;[Text] from t
price Text                                                                                                              SellingPrices PurchasePrices
---------------------------------------------------------------------------------------------------------------------------------------------------------
100   &quot;Selling price 100 and purchase price 200 threshold 250&quot;                                                          ,&quot;100&quot;        ,&quot;200&quot;
200   &quot;Selling price 200 and purchase price 300&quot;                                                                        ,&quot;200&quot;        ,&quot;300&quot;
300   &quot;blah blah blah purchase price 100 blah blah blah buy price 200 blah blah blah purchase price 300 blah blah blah&quot; ()            (&quot;100&quot;;&quot;300&quot;;&quot;200&quot;)

if you only want the first such occurrence, and you wants them converted from string to number, you can tweak the update statement using first each and "F"$ to convert to float (you could instead use "I"$ to convert to integer):

q)update SellingPrices:first each &quot;F&quot;$findFirstWordAfterPatterns[; sellPatterns]&#39;[Text], PurchasePrices:first each &quot;F&quot;$findFirstWordAfterPatterns[; buyPatterns]&#39;[Text] from t
price Text                                                                                                              SellingPrices PurchasePrices
----------------------------------------------------------------------------------------------------------------------------------------------------
100   &quot;Selling price 100 and purchase price 200 threshold 250&quot;                                                          100           200
200   &quot;Selling price 200 and purchase price 300&quot;                                                                        200           300
300   &quot;blah blah blah purchase price 100 blah blah blah buy price 200 blah blah blah purchase price 300 blah blah blah&quot;               100

huangapple
  • 本文由 发表于 2023年7月12日 20:32:27
  • 转载请务必保留本文链接:https://go.coder-hub.com/76670580.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定