'Columns must be same length as key' error when trying .Split
Question
The code below runs fine with Python 3.8.10 but fails with Python 3.10. Any idea what the problem could be?
import pandas as pd
import requests
url = "https://coinmarketcap.com/new/"
page = requests.get(url,headers={'User-Agent': 'Mozilla/5.0'}, timeout=1)
pagedata = page.text
usecols = ["Name", "Symbol", "1h", "24h", "MarketCap"]
df = pd.read_html(page.text)[0]
df[["Name", "Symbol"]] = df["Name"].str.split(r"\d+", expand=True)
df = (
    df.rename(columns={"Fully Diluted Market Cap": "MarketCap"})[usecols]
    .sort_values("24h", ascending=False, key=lambda ser: ser.str.replace("%", "").astype(float))
    .replace(r"^$", "", regex=True)
)
numcols = df.columns[~df.columns.isin(['Name'])]
df = df.head(5).to_markdown(index=True)
print(df)
Current Output:
Traceback (most recent call last):
df[["Name", "Symbol"]] = df["Name"].str.split(r"\d+", expand=True)
....
....
ValueError: Columns must be same length as key
Correct Output: (Output in Python 3.8)
| | Name | Symbol | 1h | 24h | MarketCap |
|---:|:------------|:----------|:-------|:---------|:------------|
| 3 | Shrekt |4HREK | 23.82% | 2536.51% | 342,357 |
| 8 | BLAZE |TOKEN9BLZE | 1.07% | 106.71% | 3,828,088 |
| 26 | Goner27 |GONER | 6.32% | 88.09% | 1,094,010 |
| 14 | Party Hat15 |PHAT | 13.34% | 81.64% | 60,136 |
| 29 | PepeChat |30PPC | 48.01% | 78.25% | 431,159 |
Answer 1
Score: 3
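I think it has to do with one of the values found in the Name column: NOOT (BRC-20)4NOOT. The digits in "BRC-20" give \d+ a second match, so this row splits into three pieces instead of two, and the three resulting columns cannot be assigned to the two-column key.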
To handle this, we can try to split on the last number found in each row of this column.
Replace this:
df[["Name", "Symbol"]] = df["Name"].str.split(r"\d+", expand=True)
with this:
df[["Name", "Symbol"]] = df["Name"].str.split(r"\d+(?!.*\d)", expand=True)
Output:
print(df)
| | Name | Symbol | 1h | 24h | MarketCap |
|---:|:------------|:---------|:-------|:---------|:------------|
| 5 | Shrekt | HREK | 54.61% | 1124.57% | 159,013 |
| 10 | BLAZE TOKEN | BLZE | 2.40% | 109.53% | 3,880,242 |
| 8 | CMC DOGE | CMCDOGE | 12.93% | 102.76% | 169,492 |
| 28 | Goner | GONER | 1.37% | 88.66% | 1,050,089 |
| 4 | nomeme | NOMEME | 53.86% | 86.14% | 4,603,393 |
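The difference between the two patterns can be checked without scraping the page. The sketch below uses the standard library `re` module instead of pandas, on the assumption that both apply the same regex semantics here; the sample strings are illustrative, and only the NOOT value is quoted from this answer.

```python
import re

# Illustrative sample values; "NOOT (BRC-20)4NOOT" is the entry named in the answer.
samples = ["Shrekt4HREK", "NOOT (BRC-20)4NOOT"]

every_number = re.compile(r"\d+")          # original pattern: splits on every run of digits
last_number = re.compile(r"\d+(?!.*\d)")   # answer's pattern: splits only on the final run

every_run = [every_number.split(s) for s in samples]
last_run = [last_number.split(s) for s in samples]

for text, a, b in zip(samples, every_run, last_run):
    print(text, "->", a, "vs", b)
# Shrekt4HREK -> ['Shrekt', 'HREK'] vs ['Shrekt', 'HREK']
# NOOT (BRC-20)4NOOT -> ['NOOT (BRC-', ')', 'NOOT'] vs ['NOOT (BRC-20)', 'NOOT']
```

One caveat of splitting on the last number: a symbol that itself contains digits would be split in the wrong place, so the result is worth checking against the actual data.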
Answer 2
Score: 2
Do you have to keep using df as the DataFrame? I think you can create a new one from the split result and use that afterwards. Try just doing this:
newDF = df["Name"].str.split(r"\d+", expand=True)
print(newDF)
Edit, fixed code:
df["Name"] = df["Name"].str.replace("(BRC-20)", "", regex=False)
Add this line to your code before the split; it removes the literal text (BRC-20) from every name that contains it.
So the problem was not caused by your Python version.
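Both suggestions in this answer can be tried on a small hand-made Series. This is a minimal sketch, not the asker's actual data: the sample values are illustrative, and `regex=False` is passed explicitly on the assumption that the installed pandas version might otherwise interpret the parentheses as a regex group.

```python
import pandas as pd

# Illustrative values only; "NOOT (BRC-20)4NOOT" is the entry named in the answers.
names = pd.Series(["Shrekt4HREK", "NOOT (BRC-20)4NOOT"])

# Inspect the split result on its own before assigning it back to df.
raw_parts = names.str.split(r"\d+", expand=True)
print(raw_parts.shape)    # (2, 3): three columns, which cannot fill ["Name", "Symbol"]

# Remove the literal "(BRC-20)" first, then split again.
cleaned = names.str.replace("(BRC-20)", "", regex=False)
clean_parts = cleaned.str.split(r"\d+", expand=True)
print(clean_parts.shape)  # (2, 2): now matches the two target columns
```

The cleaned name keeps a trailing space ("NOOT "), so a `.str.strip()` on the Name column may be wanted afterwards.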
Answer 3
Score: 1
I think you need to split the name on the value of the '#' column.
You can do it like this:
Create a function for splitting:
def splitting(num, strng):
    splitted = strng.split(str(num))
    print(num)
    return [splitted[0], splitted[1]]
Then apply the function and expand the resulting list column into new columns:
df["split"] = df.apply(lambda x: splitting(x['#'], x['Name']), axis=1)
df[['OnlyName','Symbol']] = pd.DataFrame(df["split"].tolist(), index=df.index)
Let me know if this helps.
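A slightly more defensive variant of the same idea is sketched below. The helper name `split_on_rank` is hypothetical, and splitting at the last occurrence of the number is an assumption on my part: it copes with names such as "(BRC-20)" that contain the rank's digits, but it would still misfire if the symbol itself contained the rank.

```python
def split_on_rank(rank, combined):
    """Split a 'Name<rank>Symbol' string at the last occurrence of the rank number."""
    name, sep, symbol = combined.rpartition(str(rank))
    if not sep:
        # The rank does not occur in the string: keep the text, leave the symbol empty.
        return combined, None
    return name, symbol


print(split_on_rank(4, "NOOT (BRC-20)4NOOT"))  # ('NOOT (BRC-20)', 'NOOT')
print(split_on_rank(2, "X (BRC-20)2X"))        # ('X (BRC-20)', 'X')
print(split_on_rank(7, "NoRankHere"))          # ('NoRankHere', None)
```

It can be applied row by row in the same way as `splitting`, for example with `df.apply(lambda x: split_on_rank(x['#'], x['Name']), axis=1)`.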