Two Letter Bigram in Pandas Dataframe
Question
I'm having trouble finding a way to get every two-letter combination (character bigram) of each string in a DataFrame column. Everything I have found so far deals with words rather than letters. The expected output is shown in the example table below.
I have tried each of the three lines below.
df['bigram'] = list(zip(df['string'], df['string'][1:]))
This generated the error:
ValueError: Length of values (15570) does not match length of index (15571)
df['bigram'] = list(ngrams(df['string'], n=2))
This generated the error:
ValueError: Length of values (15570) does not match length of index (15571)
df['bigram'] = re.findall(r'[a-zA-z]{2}', df['string'])
This generated the error:
TypeError: expected string or bytes-like object
Example:
string | output |
---|---|
hello | he, el, ll, lo |
world | wo, or, rl, ld |
Answer 1
Score: 0
You need to loop over the strings:
import pandas as pd
from nltk import ngrams

df = pd.DataFrame({'string': ['abc', 'abcdef']})
# Call ngrams on each string so it pairs characters, giving one list per row
df['bigram'] = df['string'].apply(lambda x: list(ngrams(x, n=2)))
Output:
string bigram
0 abc [(a, b), (b, c)]
1 abcdef [(a, b), (b, c), (c, d), (d, e), (e, f)]
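Why the column-level attempts fail can be sketched without pandas. Applied to the whole column, `zip` (and likewise `ngrams`) pairs adjacent rows rather than adjacent letters, so n rows give n-1 pairs, which matches the 15570 versus 15571 in the error message. In the minimal sketch below a plain list stands in for `df['string']`; that stand-in and the names `column`, `row_pairs` and `per_string` are assumptions made for illustration.

```python
# A plain list stands in for the df['string'] column (illustrative assumption).
column = ["hello", "world", "abc"]

# Zipping the column with itself shifted by one pairs adjacent ROWS,
# so 3 rows produce only 2 pairs: the source of the length mismatch.
row_pairs = list(zip(column, column[1:]))

# Zipping each string with itself shifted by one pairs adjacent LETTERS
# and yields exactly one result per row.
per_string = [list(zip(s, s[1:])) for s in column]

print(row_pairs)
print(len(column), len(row_pairs), len(per_string))
```

The per-row version always has the same length as the column, which is what a column assignment requires.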
If you want a string:
df['bigram'] = [', '.join([x[i:i+2] for i in range(len(x)-1)])
                for x in df['string']]
Output:
string bigram
0 abc ab, bc
1 abcdef ab, bc, cd, de, ef
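An index-free variant of the same idea, as a minimal sketch: zipping a string with itself shifted by one character yields every adjacent pair, so no range arithmetic is needed. The helper name `char_bigrams` and its `sep` parameter are hypothetical and not part of pandas or nltk.

```python
def char_bigrams(text: str, sep: str = ", ") -> str:
    """Join every pair of adjacent characters in `text` with `sep`."""
    # zip stops at the shorter input, so a string of length n gives n-1 pairs.
    return sep.join(a + b for a, b in zip(text, text[1:]))


for word in ["hello", "world"]:
    print(word, "->", char_bigrams(word))
# hello -> he, el, ll, lo
# world -> wo, or, rl, ld
```

With a DataFrame this could be applied as `df['bigram'] = df['string'].map(char_bigrams)`, assuming every value in the column is a string; missing values would need to be handled first.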