June 12, 2023, 19:08:59

How to extract the first letter of each word in the "sym" column and create a new column called "derived" with unique values

Question

I have a table of events containing roughly 5k unique events. I want to generate my own short symbology for these events. The newly generated derived_sym column should be unique: for example, if FBS has already been generated, then each time FBS is encountered again it should be suffixed with 1, 2, 3, 4, 5 and so on.

Example table

//generate sample table
n: 30;  // Number of rows in the table
syms:("API Crude Oil Stock Change"; "Michigan Consumer Sentiment Final"; "Michigan Consumer Sentiment Prel"; "Inflation Rate YoY"; "FOMC Economic Projections"; "FOMC Minutes"; "Fed Barkin Speech"; "Fed Barr Testimony"; "Fed Beige Book"; "Fed Bostic Speech"; "Fed Bowman Speech"; "Fed Bullard Speech"; "Fed Chair Powell Speech"; "Fed Collins Speech"; "Fed Cook Speech"; "Fed Daly Speech"; "10-Year Note Auction");
tab:([] date: .z.d - n?30; ranks: n?100; sym: n?syms; price: "f"$n?100.5; recv_time: n?10:00:00.000 + n?10000000; is_active: "b"$n?1 0);
select by sym from tab

Desired Output

Answer 1

Score: 3

You can create a dictionary based on your rule:

q)symLookup:exec sym!derived_sym from
    update {x,'@[;0;:;""]string til count x} derived_sym by derived_sym from
    update derived_sym:{first each " " vs x}each sym from
    select distinct sym from tab
q)symLookup
"Fed Cook Speech"                  | "FCS"
"Fed Collins Speech"               | "FCS1"
"Fed Bostic Speech"                | "FBS"
"Fed Bowman Speech"                | "FBS1"
"10-Year Note Auction"             | "1NA"
"Fed Bullard Speech"               | "FBS2"
"Fed Beige Book"                   | "FBB"
"Fed Chair Powell Speech"          | "FCPS"
"Michigan Consumer Sentiment Final"| "MCSF"
"API Crude Oil Stock Change"       | "ACOSC"
"FOMC Minutes"                     | "FM"
"FOMC Economic Projections"        | "FEP"
"Inflation Rate YoY"               | "IRY"
"Fed Barkin Speech"                | "FBS3"

And then create the new column in the table:

q)update derived_sym:symLookup sym from tab
date       ranks sym                                 price     recv_time    is_active derived_sym
-------------------------------------------------------------------------------------------------
2023.05.31 15    "Fed Cook Speech"                   55.25425  11:32:21.596 1         "FCS"
2023.05.19 84    "Fed Collins Speech"                19.68259  10:38:48.058 1         "FCS1"
2023.05.31 82    "Fed Bostic Speech"                 56.43337  12:45:25.000 1         "FBS"
2023.05.14 90    "Fed Collins Speech"                7.079031  11:51:20.492 0         "FCS1"
2023.05.18 66    "Fed Bowman Speech"                 21.34627  12:13:45.275 1         "FBS1"
2023.05.29 2     "Fed Cook Speech"                   78.17714  11:30:50.872 0         "FCS"
2023.06.12 96    "Fed Cook Speech"                   48.68951  12:31:39.330 1         "FCS"
2023.06.01 93    "10-Year Note Auction"              68.62139  10:36:40.212 0         "1NA"
2023.05.26 5     "Fed Bullard Speech"                15.39931  11:33:14.916 1         "FBS2"
2023.05.31 58    "Fed Beige Book"                    53.77677  12:13:45.275 0         "FBB"
2023.05.26 31    "Fed Beige Book"                    45.96147  11:05:42.696 1         "FBB"
2023.06.01 7     "Fed Bowman Speech"                 0.8102834 11:05:42.696 0         "FBS1"
2023.05.18 53    "Fed Chair Powell Speech"           10.4454   12:34:35.078 0         "FCPS"
2023.05.28 38    "Fed Bullard Speech"                10.49734  12:31:11.038 1         "FBS2"
2023.06.01 23    "Michigan Consumer Sentiment Final" 33.96998  12:34:35.078 1         "MCSF"
2023.05.29 27    "API Crude Oil Stock Change"        48.85854  12:13:45.275 0         "ACOSC"
2023.06.06 32    "FOMC Minutes"                      48.83224  10:53:26.221 1         "FM"
2023.06.03 82    "API Crude Oil Stock Change"        98.46267  12:34:35.078 0         "ACOSC"
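
Because the sample table is random, the dictionary step is easier to follow on a small fixed input. The following is only a minimal sketch of the same two steps (abbreviate, then number the collisions); the names s, abbr, t and lookup are hypothetical and not part of the answer above.

```q
// Hypothetical fixed input so the result is reproducible
s:("Fed Bostic Speech";"Fed Bowman Speech";"Fed Beige Book";"Fed Bullard Speech")
// Step 1: first letter of each space-separated word
abbr:{first each " " vs x} each s
// Step 2: within each group of equal abbreviations append "", "1", "2", ...
t:update derived_sym:{x,'@[;0;:;""] string til count x} derived_sym by derived_sym from ([] sym:s; derived_sym:abbr)
lookup:exec sym!derived_sym from t
// Sanity check: 1b when every derived symbol is unique
(count lookup)=count distinct value lookup
```

The numbering follows the order in which the events appear in the input, so a different row order can assign the suffixes differently.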

Answer 2

Score: 2

Judging by how "10-Year Note Auction" becomes "YNA", I'm assuming we first need to delete all characters that are not letters or spaces.

Doing it in two steps, first generating the abbreviations and then adding the number suffixes:

tab2:update derived_sym:first each/:" "vs/:sym inter\:(" ",.Q.A,.Q.a) from tab
update derived_sym:{x,'enlist[""],string 1+til count[x]-1}derived_sym by derived_sym from tab2
date       ranks sym                                 price     recv_time    is_active derived_sym
-------------------------------------------------------------------------------------------------
2023.05.31 15    "Fed Cook Speech"                   55.25425  11:32:21.596 1         "FCS"
2023.05.19 84    "Fed Collins Speech"                19.68259  10:38:48.058 1         "FCS1"
2023.05.31 82    "Fed Bostic Speech"                 56.43337  12:45:25.000 1         "FBS"
2023.05.14 90    "Fed Collins Speech"                7.079031  11:51:20.492 0         "FCS2"
2023.05.18 66    "Fed Bowman Speech"                 21.34627  12:13:45.275 1         "FBS1"
2023.05.29 2     "Fed Cook Speech"                   78.17714  11:30:50.872 0         "FCS3"
2023.06.12 96    "Fed Cook Speech"                   48.68951  12:31:39.330 1         "FCS4"
2023.06.01 93    "10-Year Note Auction"              68.62139  10:36:40.212 0         "YNA"
...
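
Note that in the output above the numbering runs over every row, so repeated rows of the same event receive different values ("Fed Cook Speech" shows up as FCS, FCS3 and FCS4). If one stable symbol per event is wanted instead, one possible sketch is to apply the same cleaning step to the distinct events only and then map each row through a lookup. The names ev, abbrev, u and res are hypothetical, and the input is a small fixed list rather than the random table from the question.

```q
// Hypothetical fixed input, with one repeated event
ev:([] sym:("Fed Cook Speech";"Fed Collins Speech";"Fed Cook Speech";"10-Year Note Auction"))
// Keep only letters and spaces, then take the first letter of each word
abbrev:{first each " " vs x inter " ",.Q.A,.Q.a}
// Abbreviate and number collisions over the distinct events only
u:update derived_sym:abbrev each sym from select distinct sym from ev
u:update derived_sym:{x,'enlist[""],string 1+til count[x]-1} derived_sym by derived_sym from u
// Map every row through the lookup so equal events share one symbol
res:update derived_sym:(exec sym!derived_sym from u) sym from ev
```

With this input both "Fed Cook Speech" rows should end up with the same derived_sym, while "Fed Collins Speech" gets the numbered variant.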
