Twitter Pagination


Question

I found this Twitter API v2 code sample:

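```python
import requests
import os
import json

# To set your environment variables, run the following line in your terminal:
# export 'BEARER_TOKEN'='<your_bearer_token>'
bearer_token = 'XYZ'

# Build the URL for the user's tweets
def create_url():
    # Replace with the user ID you want
    user_id = 14699880
    return "https://api.twitter.com/2/users/{}/tweets".format(user_id)

# Build the query parameters
def get_params():
    # Tweet fields are adjustable.
    # Options include:
    # attachments, author_id, context_annotations,
    # conversation_id, created_at, entities, geo, id,
    # in_reply_to_user_id, lang, non_public_metrics, organic_metrics,
    # possibly_sensitive, promoted_metrics, public_metrics, referenced_tweets,
    # source, text, and withheld
    return {"tweet.fields": "created_at"}

# Bearer token authentication
def bearer_oauth(r):
    """
    Method required by bearer token authentication.
    """
    r.headers["Authorization"] = f"Bearer {bearer_token}"
    r.headers["User-Agent"] = "v2UserTweetsPython"
    return r

# Connect to the endpoint
def connect_to_endpoint(url, params):
    response = requests.request("GET", url, auth=bearer_oauth, params=params)
    print(response.status_code)
    if response.status_code != 200:
        raise Exception(
            "Request returned an error: {} {}".format(
                response.status_code, response.text
            )
        )
    return response.json()

# Main method that calls the functions above
def main():
    url = create_url()
    params = get_params()
    json_response = connect_to_endpoint(url, params)
    print(json.dumps(json_response, indent=4, sort_keys=True))

if __name__ == "__main__":
    main()
```

Problem: I only get 10 results. How can I get more? Does it have to do with pagination?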




# Answer 1
**Score**: 2

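There are many ways to get tweets using Python.

You are using the `Requests` approach.

Requests (Twitter [API v2](https://developer.twitter.com/en/docs/twitter-api)): you can use Python's `requests` library to send HTTP requests directly to the Twitter API. As with Tweepy, this requires Twitter developer API access and keys (or a Bearer Token).

This approach gives you more control over the API calls, but you have to handle requests and responses yourself.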

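### Pagination

Pagination is a feature of Twitter API v2 endpoints whose results exceed what a single response can return.

`next_token` - a string returned in the `meta` object of the response on endpoints that support pagination.

The pagination mechanism lets you fetch a large number of tweets from a user or a search query page by page. Because the API enforces rate limits and caps the number of results per request, getting more than one page's worth of results means making multiple requests.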
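Because paging means many requests in a row, the rate limits mentioned above can come into play. The following is a minimal sketch of how one might compute a wait time before retrying. It assumes the response carries an `x-rate-limit-reset` header holding a Unix timestamp in seconds. The helper name `seconds_until_reset` and the 60-second fallback are illustrative assumptions, not part of the original answer.

```python
import time


def seconds_until_reset(headers, now=None):
    """Return how many seconds to wait before retrying a rate-limited call.

    Assumption: `headers` may contain `x-rate-limit-reset`, a Unix timestamp
    (in seconds) at which the rate-limit window resets. If the header is
    missing, fall back to an arbitrary 60-second wait.
    """
    now = time.time() if now is None else now
    reset = headers.get("x-rate-limit-reset")
    if reset is None:
        return 60.0
    # Never return a negative wait if the reset time has already passed.
    return max(0.0, float(reset) - now)
```

In a paging loop, a response with status 429 could be handled by calling `time.sleep(seconds_until_reset(response.headers))` and then repeating the same request.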

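Twitter API v2 uses cursor-based pagination. When you request tweets, you set the number of tweets per page with the `max_results` parameter (up to 100). If more tweets are available, the response includes a `next_token` in its `meta` field.

To fetch the next page, make another request with the same parameters as before, plus a `pagination_token` parameter set to the `next_token` value from the previous response.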
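The token hand-off described above can be sketched as a small generator that is independent of the HTTP layer. Here `fetch_page` is a hypothetical callable (not part of any library) that takes a params dict and returns the decoded JSON response. `fake_fetch` is a stand-in with made-up data so the sketch runs without network access.

```python
def iter_pages(fetch_page, params):
    """Yield each page, following meta.next_token until it is absent."""
    params = dict(params)  # copy so the caller's dict is not mutated
    while True:
        page = fetch_page(params)
        yield page
        next_token = page.get("meta", {}).get("next_token")
        if not next_token:
            break
        # Same parameters as before, plus the token from the previous page.
        params["pagination_token"] = next_token


# Stand-in for the real API: two pages linked by the token "abc".
def fake_fetch(params):
    if params.get("pagination_token") == "abc":
        return {"data": [{"id": "3"}], "meta": {"result_count": 1}}
    return {
        "data": [{"id": "1"}, {"id": "2"}],
        "meta": {"next_token": "abc", "result_count": 2},
    }


base_params = {"max_results": 100}
tweets = [
    tweet
    for page in iter_pages(fake_fetch, base_params)
    for tweet in page.get("data", [])
]
print([tweet["id"] for tweet in tweets])
```

To use it against the real endpoint, `fetch_page` could wrap a `requests.get` call that returns `response.json()`. Keeping the loop separate from the request makes it easy to stop early, for example after a fixed number of tweets.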

More details are in the [pagination](https://developer.twitter.com/en/docs/twitter-api/pagination) documentation.

### Demo
Here is the demo code.

Replace `<your bearer token>` with your actual Bearer Token.
I set `max_results` to 100 in `get_params()` (the default is 10), which reduces the total number of API calls.

Save it as `get-tweets.py`.

```python
import requests
import json
bearer_token = '<your bearer token>'

def create_url():
    # Replace with the user ID you want
    user_id = 14699880
    return "https://api.twitter.com/2/users/{}/tweets".format(user_id)

def get_params():
    return {
        "max_results": 100,
        "tweet.fields": "created_at"
    }

def bearer_oauth(bearer_token):
    headers = {"Authorization": f"Bearer {bearer_token}"}
    return headers

def connect_to_endpoint(url, params, next_token=None):
    headers = bearer_oauth(bearer_token)

    # Pass the token from the previous response to request the next page;
    # this can be repeated until all (or enough) tweets have been fetched
    if next_token:
        params["pagination_token"] = next_token
    response = requests.request("GET", url, headers=headers, params=params)
    print(response.status_code)
    if response.status_code != 200:
        raise Exception(
            "Request returned an error: {} {}".format(
                response.status_code, response.text
            )
        )
    return response.json()

def main():
    url = create_url()
    params = get_params()
    next_token = None

    # Keep fetching tweets until the response has no next_token
    while True:
        json_response = connect_to_endpoint(url, params, next_token)
        print(json.dumps(json_response, indent=4, sort_keys=True))

        # Take next_token from the response, if present
        if 'meta' in json_response and 'next_token' in json_response['meta']:
            next_token = json_response['meta']['next_token']
        else:
            break

if __name__ == "__main__":
    main()
```

Run it

    python get-tweets.py

Result: the script prints the status code and JSON of each page, and stops when a response no longer contains a `next_token`.


huangapple
  • Published on March 31, 2023, 02:37:27
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75891842.html