Why does my request return "Must provide query string." when scraping a GraphQL endpoint?
Question
Here's my current code:
import requests

DCId = "RGVsaXZlcnlDb25maWc6VGh1cnNkYXksIDA5IEZlYnJ1YXJ5IDIwMjN8SkswMXxTRDI5fGZhbHNl"
slugcat = "vegetables-1-a0d03d59"
url = "https://www.sayurbox.com/graphql/v1?deduplicate=1"
payload={"operationName":"getCartItemCount",
"variables":{"deliveryConfigId":DCId},
"query":"query getCartItemCount($deliveryConfigId: ID!) {\n cart(deliveryConfigId: $deliveryConfigId) {\n id\n count\n __typename\n }\n}"},{"operationName":"getProducts",
"variables":{"deliveryConfigId":DCId,
"sortBy":"related_product",
"isInstantDelivery":False,
"slug":slugcat,
"first":12,
"abTestFeatures":[]},
"query":"query getProducts($deliveryConfigId: ID!, $sortBy: CatalogueSortType!, $slug: String!, $after: String, $first: Int, $isInstantDelivery: Boolean, $abTestFeatures: [String!]) {\n productsByCategoryOrSubcategoryAndDeliveryConfig(\n deliveryConfigId: $deliveryConfigId\n sortBy: $sortBy\n slug: $slug\n after: $after\n first: $first\n isInstantDelivery: $isInstantDelivery\n abTestFeatures: $abTestFeatures\n ) {\n edges {\n node {\n ...ProductInfoFragment\n __typename\n }\n __typename\n }\n pageInfo {\n hasNextPage\n endCursor\n __typename\n }\n productBuilder\n __typename\n }\n}\n\nfragment ProductInfoFragment on Product {\n id\n uuid\n deliveryConfigId\n displayName\n priceRanges\n priceMin\n priceMax\n actualPriceMin\n actualPriceMax\n slug\n label\n isInstant\n isInstantOnly\n nextDayAvailability\n heroImage\n promo\n discount\n isDiscount\n variantType\n imageIds\n isStockAvailable\n defaultVariantSkuCode\n quantitySoldFormatted\n promotion {\n quota\n isShown\n campaignId\n __typename\n }\n productVariants {\n productVariant {\n id\n skuCode\n variantName\n maxQty\n isDiscount\n stockAvailable\n promotion {\n quota\n campaignId\n isShown\n __typename\n }\n __typename\n }\n pageInfo {\n hasPreviousPage\n hasNextPage\n __typename\n }\n __typename\n }\n __typename\n}"}
# headers: the request headers copied from the site (their definition is not shown)
response = requests.get(url, headers=headers, json=payload)
response.json()
The response returns:
[{'errors': [{'message': 'Must provide query string.',
'extensions': {'timestamp': 1675842901472}}]},
{'errors': [{'message': 'Must provide query string.',
'extensions': {'timestamp': 1675842901472}}]}]
I am not sure where I went wrong, as I've copied the payload and headers exactly. Can someone help?
Answer 1
Score: 1
GET requests generally shouldn't carry a payload. I think these are just query parameters you're trying to supply. Try changing the json argument to params. See https://www.w3schools.com/python/ref_requests_get.asp
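To make the difference between the two argument styles concrete, here is a minimal sketch (standard library only, nothing is sent over the network) of where a GraphQL operation ends up in each case. It assumes the common GraphQL-over-HTTP convention in which a GET request carries query, variables (as a JSON-encoded string) and operationName in the URL, while a POST request carries them in a JSON body. Whether this particular server accepts GET at all is not confirmed here, and the helper names and the example.com endpoint are made up for illustration.

```python
import json
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def build_graphql_get_url(endpoint, query, variables=None, operation_name=None):
    """Return `endpoint` with the GraphQL operation encoded as query parameters."""
    parts = urlsplit(endpoint)
    # Keep any parameters already present on the endpoint (e.g. ?deduplicate=1).
    params = parse_qsl(parts.query)
    params.append(("query", query))
    if variables is not None:
        # Variables travel as one JSON-encoded string, not as nested parameters.
        params.append(("variables", json.dumps(variables)))
    if operation_name is not None:
        params.append(("operationName", operation_name))
    return urlunsplit(parts._replace(query=urlencode(params)))


def build_graphql_post_body(query, variables=None, operation_name=None):
    """Return the JSON bytes a POST request would carry in its body."""
    body = {"query": query}
    if variables is not None:
        body["variables"] = variables
    if operation_name is not None:
        body["operationName"] = operation_name
    return json.dumps(body).encode("utf-8")


if __name__ == "__main__":
    q = "query getCartItemCount($deliveryConfigId: ID!) { cart(deliveryConfigId: $deliveryConfigId) { id count } }"
    # Hypothetical endpoint, used only to show the resulting URL shape.
    print(build_graphql_get_url("https://example.com/graphql?deduplicate=1", q,
                                {"deliveryConfigId": "abc"}, "getCartItemCount"))
    print(build_graphql_post_body(q, {"deliveryConfigId": "abc"}, "getCartItemCount"))
```

Note that the payload in the question is a batch of two operations, which has no obvious query-string form under this convention, so the JSON body route is likely the simpler one for that case.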
Answer 2
Score: 1
<details>
<summary>English:</summary>
First, the request should be a POST, not a GET. Second, I think you don't want the "getCartItemCount" operation but rather "getProducts".
import requests

DCId = 'RGVsaXZlcnlDb25maWc6VGh1cnNkYXksIDA5IEZlYnJ1YXJ5IDIwMjN8SkswMXxTRDI5fGZhbHNl'
slugcat = 'vegetables-1-a0d03d59'
url = 'https://www.sayurbox.com/graphql/v1?deduplicate=1'
payload = {
'operationName': 'getProducts',
'variables': {
'deliveryConfigId': DCId,
'sortBy': 'related_product',
'isInstantDelivery': False,
'slug': slugcat,
'first': 12,
'abTestFeatures': ['category-page-subcategory-section-v5#####control']
},
'query': 'query getProducts($deliveryConfigId: ID!, $sortBy: CatalogueSortType!, $slug: String!, $after: String, $first: Int, $isInstantDelivery: Boolean, $abTestFeatures: [String!]) { productsByCategoryOrSubcategoryAndDeliveryConfig( deliveryConfigId: $deliveryConfigId sortBy: $sortBy slug: $slug after: $after first: $first isInstantDelivery: $isInstantDelivery abTestFeatures: $abTestFeatures ) { edges { node { ...ProductInfoFragment __typename } __typename } pageInfo { hasNextPage endCursor __typename } productBuilder __typename }}fragment ProductInfoFragment on Product { id uuid deliveryConfigId displayName priceRanges priceMin priceMax actualPriceMin actualPriceMax slug label isInstant isInstantOnly nextDayAvailability heroImage promo discount isDiscount variantType imageIds isStockAvailable defaultVariantSkuCode quantitySoldFormatted promotion { quota isShown campaignId __typename } productVariants { productVariant { id skuCode variantName maxQty isDiscount stockAvailable promotion { quota campaignId isShown __typename } __typename } pageInfo { hasPreviousPage hasNextPage __typename } __typename } __typename}'}
# headers: the request headers copied from the site (their definition is not shown in the question)
response = requests.post(url, headers=headers, json=payload)
data = response.json()
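Since a failed call returns an errors entry like the one in the question rather than raising an exception, it may help to check the decoded JSON before using it. The following is a minimal sketch under the assumption that the response has the shapes seen in this thread: either a single result object or, for a batched request, a list of them. The function name extract_product_nodes is made up for illustration.

```python
def extract_product_nodes(data, field="productsByCategoryOrSubcategoryAndDeliveryConfig"):
    """Return the product dicts from a decoded GraphQL response, or raise on errors."""
    # A batched request comes back as a list with one result per operation.
    results = data if isinstance(data, list) else [data]
    nodes = []
    for result in results:
        if result.get("errors"):
            messages = "; ".join(err.get("message", "unknown error") for err in result["errors"])
            raise RuntimeError(f"GraphQL error: {messages}")
        connection = (result.get("data") or {}).get(field)
        if connection is None:
            # Results of other operations in a batch carry no product list; skip them.
            continue
        nodes.extend(edge["node"] for edge in connection.get("edges", []))
    return nodes


if __name__ == "__main__":
    sample = {"data": {"productsByCategoryOrSubcategoryAndDeliveryConfig": {
        "edges": [{"node": {"id": "1", "displayName": "example"}}]}}}
    print(extract_product_nodes(sample))
```

The returned list of dicts can then be handed to pd.json_normalize in place of the inline list comprehension.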
Output (with Pandas):
import pandas as pd
df = pd.json_normalize([node['node'] for node in data['data']['productsByCategoryOrSubcategoryAndDeliveryConfig']['edges']])
>>> df
id uuid ... productVariants.pageInfo.__typename productVariants.__typename
0 UHJvZHVjdDpSR1ZzYVhabGNubERiMjVtYVdjNlZHaDFjbk... 479c7805-3b26-4bb9-93b9-5689a2d3bb9d ... PageInfo productVariantsConnection
1 UHJvZHVjdDpSR1ZzYVhabGNubERiMjVtYVdjNlZHaDFjbk... ba7154a1-e784-451d-88e0-10ede13d55b3 ... PageInfo productVariantsConnection
2 UHJvZHVjdDpSR1ZzYVhabGNubERiMjVtYVdjNlZHaDFjbk... 5e023650-50fa-4adc-800d-be14cac7f1eb ... PageInfo productVariantsConnection
3 UHJvZHVjdDpSR1ZzYVhabGNubERiMjVtYVdjNlZHaDFjbk... eec5c6fa-70b9-45d8-a316-6820d1ed68c3 ... PageInfo productVariantsConnection
4 UHJvZHVjdDpSR1ZzYVhabGNubERiMjVtYVdjNlZHaDFjbk... ee1a0910-f021-48e4-a8d0-ab54f4358bde ... PageInfo productVariantsConnection
5 UHJvZHVjdDpSR1ZzYVhabGNubERiMjVtYVdjNlZHaDFjbk... 17dccf7a-0763-4c34-a537-7b746bdba683 ... PageInfo productVariantsConnection
6 UHJvZHVjdDpSR1ZzYVhabGNubERiMjVtYVdjNlZHaDFjbk... 90bbee6d-184e-4d8b-8702-77b660883a00 ... PageInfo productVariantsConnection
7 UHJvZHVjdDpSR1ZzYVhabGNubERiMjVtYVdjNlZHaDFjbk... f7e51319-0dd3-4c21-9bba-bc8e3f71db94 ... PageInfo productVariantsConnection
8 UHJvZHVjdDpSR1ZzYVhabGNubERiMjVtYVdjNlZHaDFjbk... 9f889a62-9302-48db-a972-cff035440ee4 ... PageInfo productVariantsConnection
9 UHJvZHVjdDpSR1ZzYVhabGNubERiMjVtYVdjNlZHaDFjbk... dd58f053-238f-45f6-b937-687c1e1db3b0 ... PageInfo productVariantsConnection
10 UHJvZHVjdDpSR1ZzYVhabGNubERiMjVtYVdjNlZHaDFjbk... 05c37b4e-cf0f-4cf5-a9a8-20ea00029063 ... PageInfo productVariantsConnection
11 UHJvZHVjdDpSR1ZzYVhabGNubERiMjVtYVdjNlZHaDFjbk... e559850a-2344-4bb4-be70-932214aace91 ... PageInfo productVariantsConnection
[12 rows x 30 columns]
>>> df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 12 entries, 0 to 11
Data columns (total 30 columns):
 #   Column                                    Non-Null Count  Dtype
---  ------                                    --------------  -----
 0   id                                        12 non-null     object
 1   uuid                                      12 non-null     object
 2   deliveryConfigId                          12 non-null     object
 3   displayName                               12 non-null     object
 4   priceRanges                               12 non-null     object
 5   priceMin                                  12 non-null     int64
 6   priceMax                                  12 non-null     int64
 7   actualPriceMin                            12 non-null     int64
 8   actualPriceMax                            12 non-null     int64
 9   slug                                      12 non-null     object
10   label                                     0 non-null      object
11   isInstant                                 12 non-null     bool
12   isInstantOnly                             12 non-null     bool
13   nextDayAvailability                       12 non-null     bool
14   heroImage                                 12 non-null     object
15   promo                                     12 non-null     object
16   discount                                  12 non-null     object
17   isDiscount                                12 non-null     bool
18   variantType                               12 non-null     object
19   imageIds                                  12 non-null     object
20   isStockAvailable                          12 non-null     bool
21   defaultVariantSkuCode                     12 non-null     object
22   quantitySoldFormatted                     12 non-null     object
23   promotion                                 0 non-null      object
24   __typename                                12 non-null     object
25   productVariants.productVariant            12 non-null     object
26   productVariants.pageInfo.hasPreviousPage  12 non-null     bool
27   productVariants.pageInfo.hasNextPage      12 non-null     bool
28   productVariants.pageInfo.__typename       12 non-null     object
29   productVariants.__typename                12 non-null     object
dtypes: bool(7), int64(4), object(19)
memory usage: 2.4+ KB
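The request above fetches only the first 12 products. The query also declares an $after variable and asks for pageInfo { hasNextPage endCursor }, which suggests cursor-based pagination. Here is a minimal sketch of following those cursors, assuming the server accepts the previous page's endCursor as the next after value (that behaviour was not verified against this site). fetch_page is a hypothetical callable you would write yourself: it takes a cursor (or None for the first page), posts the getProducts payload with 'after' set to that cursor, and returns the productsByCategoryOrSubcategoryAndDeliveryConfig object from the response.

```python
def fetch_all_nodes(fetch_page, max_pages=50):
    """Collect nodes from every page by following pageInfo.endCursor."""
    nodes = []
    cursor = None  # None requests the first page
    for _ in range(max_pages):  # hard cap so a misbehaving server cannot loop forever
        connection = fetch_page(cursor)
        nodes.extend(edge["node"] for edge in connection["edges"])
        page_info = connection["pageInfo"]
        if not page_info["hasNextPage"]:
            break
        cursor = page_info["endCursor"]
    return nodes


if __name__ == "__main__":
    # Fake two-page server used only to demonstrate the loop without network access.
    fake_pages = {
        None: {"edges": [{"node": {"id": "a"}}],
               "pageInfo": {"hasNextPage": True, "endCursor": "cursor-1"}},
        "cursor-1": {"edges": [{"node": {"id": "b"}}],
                     "pageInfo": {"hasNextPage": False, "endCursor": None}},
    }
    print(fetch_all_nodes(lambda cursor: fake_pages[cursor]))
```

Passing the page fetcher in as a callable keeps the loop independent of the HTTP library, so it can be exercised with canned pages before pointing it at the real endpoint.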
</details>