英文:
create separate column by getting Numbers from the another string column in kdb+q
问题
我是kdb新手,想请教以下问题。假设我有一个表格:
t:([] price:100 200; Text:("Selling price 100 and purchase price 200 threshold 250"; "Selling price 200 and purchase price 300"))
从文本中"Selling price 100 and purchase price 200 threshold is 250",我想要创建两个新的列,一个是销售价格(Selling price),另一个是购买价格(Purchase price)。
我一直在尝试首先处理字符串,我可以得到索引,但在正确获取数字方面遇到了困难。不确定这是否是一个高效的解决方案。
b:"Selling price 100 and purchase price 200 threshold 250";
q)" " vs string (first ss[b;"Selling price "])+count "Selling price "
q)"14"
英文:
Hi I'm new to kdb and I wanted help with below question
Suppose I have a table
t:([] price:100 200; Text:("Selling price 100 and purchase price 200 threshold 250";"Selling price 200 and purchase price 300"))
from the text "Selling price 100 and purchase price 200 threshold is 250"
I want to have two new columns with selling price and purchase price.
I have been trying to work on strings first and I could get the index but having trouble with getting numbers correctly. not sure if that would be an efficient solution either.
b:"Selling price 100 and purchase price 200 threshold 250"
q)" " vs string (first ss[b;"Selling price "])+count "Selling price "
q)"14"
答案1
得分: 2
假设Text
列的结构始终相同:
q)t,'flip`SellingPrice`PurchasePrice!(" J J";" ")0:t`Text
price Text SellingPrice PurchasePrice
-----------------------------------------------------------------------------------------
100 "销售价格 100 和采购价格 200 阈值 250" 100 200
200 "销售价格 200 和采购价格 300" 200 300
英文:
Assuming the Text
column will always have the same structure:
q)t,'flip`SellingPrice`PurchasePrice!(" J J";" ")0:t`Text
price Text SellingPrice PurchasePrice
-----------------------------------------------------------------------------------------
100 "Selling price 100 and purchase price 200 threshold 250" 100 200
200 "Selling price 200 and purchase price 300" 200 300
答案2
得分: 1
基本版本,假设每段文本中的顺序都相同:
```q
t,'exec {`SellingPrice`PurchasePrice!2#{x where not null x}"J"$" " vs @[x;where not x in .Q.n;:;" "]} each Text from t
price Text SellingPrice PurchasePrice
-----------------------------------------------------------------------------------------
100 "Selling price 100 and purchase price 200 threshold 250" 100 200
200 "Selling price 200 and purchase price 300" 200 300
<details>
<summary>英文:</summary>
Basic version assuming the order is the same in each piece of text:
```q
t,'exec {`SellingPrice`PurchasePrice!2#{x where not null x}"J"$" " vs @[x;where not x in .Q.n;:;" "]} each Text from t
price Text SellingPrice PurchasePrice
-----------------------------------------------------------------------------------------
100 "Selling price 100 and purchase price 200 threshold 250" 100 200
200 "Selling price 200 and purchase price 300" 200 300
答案3
得分: 1
你可以通过创建一个函数来进行一般化,该函数接受一个字符串和一个字符串模式列表,然后返回在字符串中出现任何模式后的第一个单词。
q) findFirstWordAfterPatterns: { [txt; patterns] : { first " " vs (1 + count[string y `pattern] + y `indexes) _ x }[txt;] each ungroup flip `pattern`indexes! flip (`$patterns) ,' enlist each ss[txt;] each patterns; };
q) findFirstWordAfterPatterns["foo bar hello world foo bar"; ("foo"; "hello")]
"bar"
"bar"
"world"
然后,你可以在表上运行一个更新命令,通过调用这个函数对Text列添加两个新列:
q)buyPatterns: ("purchase price"; "buy price");
q)sellPatterns: enlist "Selling price"
q) t:([] price:100 200 300; Text:("Selling price 100 and purchase price 200 threshold 250";"Selling price 200 and purchase price 300"; "blah blah blah purchase price 100 blah blah blah buy price 200 blah blah blah purchase price 300 blah blah blah"))
q)update SellingPrices:findFirstWordAfterPatterns[; sellPatterns]'[Text], PurchasePrices:findFirstWordAfterPatterns[; buyPatterns]'[Text] from t
price Text SellingPrices PurchasePrices
---------------------------------------------------------------------------------------------------------------------------------------------------------
100 "Selling price 100 and purchase price 200 threshold 250" ,"100" ,"200"
200 "Selling price 200 and purchase price 300" ,"200" ,"300"
300 "blah blah blah purchase price 100 blah blah blah buy price 200 blah blah blah purchase price 300 blah blah blah" () ("100";"300";"200")
如果你只想要第一个出现的情况,并且要将它们从字符串转换为数字,你可以使用first each
和"F"$
来调整更新语句(你也可以使用"I"$
来转换为整数):
q)update SellingPrices:first each "F"$findFirstWordAfterPatterns[; sellPatterns]'[Text], PurchasePrices:first each "F"$findFirstWordAfterPatterns[; buyPatterns]'[Text] from t
price Text SellingPrices PurchasePrices
----------------------------------------------------------------------------------------------------------------------------------------------------
100 "Selling price 100 and purchase price 200 threshold 250" 100 200
200 "Selling price 200 and purchase price 300" 200 300
300 "blah blah blah purchase price 100 blah blah blah buy price 200 blah blah blah purchase price 300 blah blah blah" 100
英文:
you can generalize this a bit by creating a function that takes a string and a list of string patterns and returns the first word following the occurrence of any of the patterns in the string
q) findFirstWordAfterPatterns: { [txt; patterns] : { first " " vs (1 + count[string y `pattern] + y `indexes) _ x }[txt;] each ungroup flip `pattern`indexes! flip (`$patterns) ,' enlist each ss[txt;] each patterns; };
q) findFirstWordAfterPatterns["foo bar hello world foo bar"; ("foo"; "hello")]
"bar"
"bar"
"world"
you can then run an update command on your table to add two new columns that are populated by calling this function against the Text column:
q)buyPatterns: ("purchase price"; "buy price");
q)sellPatterns: enlist "Selling price"
q) t:([] price:100 200 300; Text:("Selling price 100 and purchase price 200 threshold 250";"Selling price 200 and purchase price 300"; "blah blah blah purchase price 100 blah blah blah buy price 200 blah blah blah purchase price 300 blah blah blah"))
q)update SellingPrices:findFirstWordAfterPatterns[; sellPatterns]'[Text], PurchasePrices:findFirstWordAfterPatterns[; buyPatterns]'[Text] from t
price Text SellingPrices PurchasePrices
---------------------------------------------------------------------------------------------------------------------------------------------------------
100 "Selling price 100 and purchase price 200 threshold 250" ,"100" ,"200"
200 "Selling price 200 and purchase price 300" ,"200" ,"300"
300 "blah blah blah purchase price 100 blah blah blah buy price 200 blah blah blah purchase price 300 blah blah blah" () ("100";"300";"200")
if you only want the first such occurrence, and you wants them converted from string to number, you can tweak the update statement using first each and "F"$ to convert to float (you could instead use "I"$ to convert to integer):
q)update SellingPrices:first each "F"$findFirstWordAfterPatterns[; sellPatterns]'[Text], PurchasePrices:first each "F"$findFirstWordAfterPatterns[; buyPatterns]'[Text] from t
price Text SellingPrices PurchasePrices
----------------------------------------------------------------------------------------------------------------------------------------------------
100 "Selling price 100 and purchase price 200 threshold 250" 100 200
200 "Selling price 200 and purchase price 300" 200 300
300 "blah blah blah purchase price 100 blah blah blah buy price 200 blah blah blah purchase price 300 blah blah blah" 100
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。
评论