How to handle the token limit in ChatGPT 3.5 Turbo when creating tables?

Question

An end user copies a table from a PDF and pastes the text into the OpenAI Playground, for example:

bird_id bird_posts bird_likes
012 2 5
013 0 4
056 57 70
612 0 12

and then prompts GPT with "Create table with the given text". GPT generates a formatted table from it (shown as a screenshot in the original post).

This works as expected.
But when my input text is sizeable (say 1076 tokens), I face the following error:

Token limit error: The input tokens exceeded the maximum allowed by the model. Please reduce the number of input tokens to continue. Refer to the token count in the 'Parameters' panel for more details.

I will use Python for text preprocessing and will get the data from the UI. If my input were plain textual data (like passages), I could use the chunking and summarization approaches suggested by LangChain. But I cannot apply summarization iteratively to tabular text, as I might lose rows/columns.

Any input on how this can be handled?
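
For reference, a minimal sketch of how the input size could be measured before sending it to the model, using the tiktoken library; the model name is assumed from the question, and the 4,096-token figure is the context size mentioned in the answer below:

    # pip install tiktoken
    import tiktoken

    MODEL = "gpt-3.5-turbo"   # assumed from the question
    CONTEXT_LIMIT = 4096      # context window shared by prompt and completion

    # The table text pasted from the PDF (the real one would be much larger).
    table_text = """bird_id bird_posts bird_likes
    012 2 5
    013 0 4
    056 57 70
    612 0 12"""

    # Count how many tokens the pasted table consumes
    # (chat-format overhead of a few tokens per message is ignored here).
    encoding = tiktoken.encoding_for_model(MODEL)
    n_tokens = len(encoding.encode(table_text))
    print(f"{n_tokens} of {CONTEXT_LIMIT} tokens used by the input")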


Answer 1

Score: 1


In general this cannot be solved for arbitrary table sizes: these models simply have a limited context length, and that is a hard limitation.
As far as I know, this is currently a subject of active research, for example https://arxiv.org/abs/2304.11062 (but that is not implemented on OpenAI's side and has its own limitations and difficulties).

You can try the new gpt-3.5-turbo-16k model, which has a context size of 16,384 tokens (compared to 4,096 tokens for gpt-3.5-turbo, which you seem to be using).
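
If even the 16k context is not enough, one common workaround (not part of the answer above; a sketch that assumes rows can be processed independently) is to split the table into chunks of whole rows, repeat the header row in every chunk so no columns are lost, and format each chunk with a separate request. The chunk_table helper and the per-request budget below are illustrative assumptions, and the code uses the openai<1.0-style API that was current in June 2023:

    # pip install openai tiktoken
    import os
    import openai
    import tiktoken

    openai.api_key = os.environ["OPENAI_API_KEY"]

    MODEL = "gpt-3.5-turbo-16k"  # model suggested above
    BUDGET = 2000                # arbitrary per-request input budget, leaving room for the reply

    encoding = tiktoken.encoding_for_model("gpt-3.5-turbo")  # 16k variant uses the same encoding

    def chunk_table(text, budget=BUDGET):
        """Split tabular text into chunks of whole rows, repeating the header
        row in every chunk so the column structure is never lost."""
        header, *rows = text.strip().splitlines()
        header_cost = len(encoding.encode(header))
        chunks, current, used = [], [header], header_cost
        for row in rows:
            cost = len(encoding.encode(row))
            if used + cost > budget and len(current) > 1:
                chunks.append("\n".join(current))
                current, used = [header], header_cost
            current.append(row)
            used += cost
        chunks.append("\n".join(current))
        return chunks

    # The table text pasted from the PDF (the real one would be much larger).
    table_text = """bird_id bird_posts bird_likes
    012 2 5
    013 0 4
    056 57 70
    612 0 12"""

    tables = []
    for chunk in chunk_table(table_text):
        response = openai.ChatCompletion.create(
            model=MODEL,
            messages=[{"role": "user",
                       "content": f"Create table with the given text:\n{chunk}"}],
        )
        tables.append(response["choices"][0]["message"]["content"])

    # The per-chunk tables can then be concatenated, dropping the repeated headers.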
