How to split a string into 6-character chunks in a dataframe column?

Question

In a dataframe column, the values are strings of digits. I would like to split each string into 6-digit chunks and add a comma ',' between them, so that I get a list of HS codes in that column. I tried the following, but it is not valid Python and needs correcting:

    df.loc[df[:, 1] for i in range(0, len(['id'], 6)

Answer 1

Score: 0

Assuming you want to split from the left:

    df['id'] = df['id'].astype(str).str.replace(r'(.{6})(?=.)', r'\1,', regex=True)

Output:

                         id
    0  280530,284442,284690
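The pattern in this answer, `(.{6})(?=.)`, matches six characters only when at least one more character follows, which is why no trailing comma is produced. Since the question asks for a list of codes, the following is a minimal sketch of one alternative that returns actual Python lists through `Series.str.findall`. The column names `codes` and `joined` are made up for illustration, and the sample values are taken from the outputs shown in the answers.

```python
import pandas as pd

# Sample ids copied from the outputs in this thread.
df = pd.DataFrame({"id": ["280530284442284690", "23343433"]})

# Each match takes up to 6 characters, so a shorter final chunk is kept.
# "codes" is a hypothetical column name.
df["codes"] = df["id"].astype(str).str.findall(r".{1,6}")

# Joining each list gives the same comma-separated string as the regex replace.
# "joined" is a hypothetical column name.
df["joined"] = df["codes"].str.join(",")

print(df)
```

Keeping the list column may be convenient if the codes need further processing, for example with `DataFrame.explode`, while the joined column matches the string format requested in the question.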
Answer 2

Score: 0

Code:

    import pandas as pd

    data = {'Column 1': ['a', 'b', 'c'],
            'id': [2468938493843983, 345642232, 23343433]}
    df = pd.DataFrame(data)
    # Convert the integer ids to strings so they can be sliced.
    df['id'] = df['id'].astype(str)
    # Cut each id into 6-character pieces from the left and join them with commas.
    df['fromleft'] = [','.join([df['id'][i][j:j+6] for j in range(0, len(df['id'][i]), 6)]) for i in range(len(df))]
    print(df)

Output:

      Column 1                id            fromleft
    0        a  2468938493843983  246893,849384,3983
    1        b         345642232          345642,232
    2        c          23343433           233434,33
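Both answers split from the left, so any leftover digits end up in the last chunk. If the short chunk should come first instead, the slicing can start at the remainder. The following is only a sketch under that assumption; `split_from_right` is a hypothetical helper, not part of either answer.

```python
def split_from_right(text: str, size: int = 6) -> str:
    """Join `size`-character chunks with commas, aligned to the right end.

    Hypothetical helper: any leftover characters form the first chunk.
    """
    first = len(text) % size  # length of the short leading chunk, 0 if none
    chunks = [text[:first]] if first else []
    chunks += [text[i:i + size] for i in range(first, len(text), size)]
    return ",".join(chunks)


# Ids copied from the example data in this answer.
for value in ["2468938493843983", "345642232", "23343433"]:
    print(value, "->", split_from_right(value))
```

With a dataframe like the one above, this could be applied per row with something like `df['id'].astype(str).map(split_from_right)`.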

huangapple
  • Posted on February 10, 2023 at 12:39:38
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75407018.html