2023-05-11 17:22:07

'Columns must be same length as key' error when trying .Split

Question

The code below runs fine with Python 3.8.10 but fails with Python 3.10. Any idea what the problem could be?

import pandas as pd
import requests

url = "https://coinmarketcap.com/new/"
page = requests.get(url, headers={'User-Agent': 'Mozilla/5.0'}, timeout=1)
pagedata = page.text
usecols = ["Name", "Symbol", "1h", "24h", "MarketCap"]


df = pd.read_html(page.text)[0]
df[["Name", "Symbol"]] = df["Name"].str.split(r"\d+", expand=True)

df = (df.rename(columns={"Fully Diluted Market Cap": "MarketCap"})[usecols]
          .sort_values("24h", ascending=False, key=lambda ser: ser.str.replace("%", "").astype(float))
          .replace(r"^$", "", regex=True)
     )

numcols = df.columns[~df.columns.isin(['Name'])]
df = df.head(5).to_markdown(index=True)
print(df)

Current Output:

Traceback (most recent call last):
  df[["Name", "Symbol"]] = df["Name"].str.split(r"\d+", expand=True)
  ....
  ....
  ValueError: Columns must be same length as key

Correct Output: (Output in Python 3.8)

|    | Name        | Symbol    | 1h     | 24h      | MarketCap   |
|---:|:------------|:----------|:-------|:---------|:------------|
|  3 | Shrekt      |4HREK      | 23.82% | 2536.51% | 342,357     |
|  8 | BLAZE       |TOKEN9BLZE | 1.07%  | 106.71%  | 3,828,088   |
| 26 | Goner27     |GONER      | 6.32%  | 88.09%   | 1,094,010   |
| 14 | Party Hat15 |PHAT       | 13.34% | 81.64%   | 60,136      |
| 29 | PepeChat    |30PPC      | 48.01% | 78.25%   | 431,159     |

Answer 1

Score: 3

I think it has to do with one of the values found in the Name column: NOOT (BRC-20)4NOOT. That value contains two runs of digits ("20" and "4"), so splitting on \d+ yields three pieces instead of two. With expand=True the result then has three columns, which cannot be assigned to the two keys ["Name", "Symbol"].

To handle this, we can split only on the last run of digits found in each row of this column.

Replace this:

df[["Name", "Symbol"]] = df["Name"].str.split(r"\d+", expand=True)

With this:

df[["Name", "Symbol"]] = df["Name"].str.split(r"\d+(?!.*\d)", expand=True)

> Regex [demo]

Output:

print(df)

|    | Name        | Symbol   | 1h     | 24h      | MarketCap   |
|---:|:------------|:---------|:-------|:---------|:------------|
|  5 | Shrekt      | HREK     | 54.61% | 1124.57% | 159,013     |
| 10 | BLAZE TOKEN | BLZE     | 2.40%  | 109.53%  | 3,880,242   |
|  8 | CMC DOGE    | CMCDOGE  | 12.93% | 102.76%  | 169,492     |
| 28 | Goner       | GONER    | 1.37%  | 88.66%   | 1,050,089   |
|  4 | nomeme      | NOMEME   | 53.86% | 86.14%   | 4,603,393   |
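
The difference between the two patterns can be checked without hitting the website. The following is a minimal sketch, assuming a few hand-written sample values that imitate the scraped Name column (the sample Series is hypothetical, not real scraped data):

```python
import pandas as pd

# Hypothetical sample values imitating the scraped "Name" column
names = pd.Series(["Shrekt4HREK", "NOOT (BRC-20)4NOOT", "Goner27GONER"])

# Splitting on every digit run: the BRC-20 row breaks into three pieces,
# so the expanded result has three columns
naive = names.str.split(r"\d+", expand=True)

# Splitting only on the last digit run: each sample row gives two pieces
last_run = names.str.split(r"\d+(?!.*\d)", expand=True)
last_run.columns = ["Name", "Symbol"]

print(naive.shape)
print(last_run)
```

The negative lookahead (?!.*\d) rejects any digit run that is followed by another digit later in the string, which is why only the final run acts as the separator. A symbol that itself contains digits would still be cut in the wrong place, so this is a fix for the observed data rather than a general parser.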

Answer 2

Score: 2

Do you have to keep using df as the DataFrame name? I think you can assign the split result to a new DataFrame and inspect it first. Try just this:

newDF = df["Name"].str.split(r"\d+", expand=True)
print(newDF)

Edit: fixed code:

df["Name"] = df["Name"].str.replace("(BRC-20)", "", regex=False)

Add this line to your code before the split. It removes the "(BRC-20)" tag from any name that contains it; regex=False makes pandas treat the parentheses as literal characters, so no escaping is needed.
So the problem was not the version of your Python.
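
A minimal sketch of this approach on hand-written sample values (the Series below is hypothetical and only imitates the scraped column): remove the literal tag first, then the original \d+ split yields two columns again.

```python
import pandas as pd

# Hypothetical sample values imitating the scraped "Name" column
names = pd.Series(["Shrekt4HREK", "NOOT (BRC-20)4NOOT"])

# Remove the literal "(BRC-20)" tag; regex=False avoids escaping the parentheses
cleaned = names.str.replace("(BRC-20)", "", regex=False)

# With the extra digits gone, splitting on any digit run gives two pieces per row
parts = cleaned.str.split(r"\d+", expand=True)
parts.columns = ["Name", "Symbol"]

# The removed tag leaves a trailing space behind in the name
parts["Name"] = parts["Name"].str.strip()

print(parts)
```

This only handles the one tag seen in the data; any other name containing digits would bring the error back.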

Answer 3

Score: 1

I think you need to split the name on the value of the '#' column, since that rank number is what ends up fused between the name and the symbol.
You can do it like this:

Create a function for splitting:

def splitting(num, strng):
    # Split the fused string on the rank number and keep the two sides
    splitted = strng.split(str(num))
    return [splitted[0], splitted[1]]

Then apply the function row by row and expand the resulting lists into two new columns:

df["split"] = df.apply(lambda x: splitting(x['#'], x['Name']), axis=1)
df[['OnlyName', 'Symbol']] = pd.DataFrame(df["split"].tolist(), index=df.index)

Let me know if this helps.
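
The rank-based idea can be tried on plain Python data first. This is a minimal sketch, assuming hypothetical (rank, fused name) pairs that imitate the '#' and Name columns; split_on_rank is an illustrative helper, not part of the original answer, and it uses str.partition instead of str.split so that each row always yields exactly two parts:

```python
# Hypothetical (rank, fused name) pairs imitating the "#" and "Name" columns
rows = [(4, "NOOT (BRC-20)4NOOT"), (27, "Goner27GONER")]

def split_on_rank(rank, text):
    # str.partition cuts at the first occurrence of the rank only,
    # so it always returns exactly one name and one symbol
    name, _, symbol = text.partition(str(rank))
    return name, symbol

pairs = [split_on_rank(rank, text) for rank, text in rows]
print(pairs)
```

One caveat: if the rank's digits also occur earlier in the name (for example rank 2 with a name containing "BRC-20"), the cut lands in the wrong place, so this relies on the rank not appearing inside the name itself.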
