Web scrape a title after a specific class with Python
Question
I'm trying to scrape some information about the positions, artists and songs from a ranking list online. Here is the ranking list website: https://kma.kkbox.com/charts/weekly/newrelease?terr=my&lang=en
I was trying to use the following code to scrape:
import requests
from bs4 import BeautifulSoup
page = requests.get('https://kma.kkbox.com/charts/weekly/newrelease?terr=my&lang=en')
print(page.status_code)
soup = BeautifulSoup(page.content, 'html.parser')
all_songs = soup.find_all(class_="charts-list-song")
all_artists = soup.find_all(class_="charts-list-artist")
print(all_songs)
print(all_artists)
However, the output only shows:
[<span class="charts-list-desc">
<span class="charts-list-song"></span>
<span class="charts-list-artist"></span>
</span>, <span class="charts-list-desc">
<span class="charts-list-song"></span>
...
and
<span class="charts-list-song"></span>, <span class="charts-list-song"></span>, <span class="charts-list-song"></span>, <span class="charts-list-song"></span>, <span class="charts-list-song"></span>, <span class="charts-list-song"></span>,
My expected output should be:
Pos artist songs
1 張哲瀚 洪荒劇場Primordial Theater
2 張哲瀚 冰川消失那天Lost Glacier
3 告五人 又到天黑
Answer 1
Score: 0
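If you use View Source in Chrome, you can see that the actual chart content sits near the end of the HTML source and is loaded into a JavaScript variable named chart. The charts-list-song and charts-list-artist spans are empty in the raw HTML, presumably because the page fills them in from that variable after it loads, so requests never sees the rendered text.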
code
import json
import re

import requests
from bs4 import BeautifulSoup

page = requests.get('https://kma.kkbox.com/charts/weekly/newrelease?terr=my&lang=en')
print(page.status_code)
soup = BeautifulSoup(page.content, 'html.parser')
# The chart data is in the second-to-last <script> tag.
data = soup.select('script')[-2].string
# Capture the JSON array assigned to `var chart`.
m = re.search(r'var chart = (\[{.*}\])', data)
songs = json.loads(m.group(1))
for song in songs:
    print(song['rankings']['this_period'], song['artist_name'], song['song_name'])
output
1 張哲瀚 洪荒劇場Primordial Theater
2 張哲瀚 冰川消失那天Lost Glacier
3 告五人 又到天黑
4 孫盛希 Shi Shi 眼淚記得你 (Remembered)
5 陳零九 Nine Chen 夢裡的女孩 (The Girl)
6 告五人 一念之間
7 苏有朋 玫瑰急救箱
8 林俊傑 想見你想見你想見你
...
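Picking the script tag by position ([-2]) and using a greedy regex will break if the page layout shifts a little. As a rough sketch of a sturdier variant, the snippet below searches the whole page text for the `var chart =` assignment and lets `json.JSONDecoder.raw_decode` read exactly one JSON value from that point. It assumes the assigned value is valid JSON, as the `json.loads` call above implies. `extract_chart`, `format_rows`, and `SAMPLE_HTML` are names made up for this illustration, and the sample string is a stand-in for `page.text`, not the real page.

```python
import json
import re


def extract_chart(html):
    """Return the list assigned to `var chart` in the page source, or [] if absent."""
    match = re.search(r'var\s+chart\s*=\s*', html)
    if match is None:
        return []
    # raw_decode parses one JSON value starting at the given index
    # and ignores whatever text follows it (the trailing ';', other scripts).
    chart, _end = json.JSONDecoder().raw_decode(html, match.end())
    return chart


def format_rows(chart):
    """Build tab-separated rows in the 'Pos artist songs' layout from the question."""
    rows = ["Pos\tartist\tsongs"]
    for song in chart:
        rows.append(
            f"{song['rankings']['this_period']}\t{song['artist_name']}\t{song['song_name']}"
        )
    return rows


# Hypothetical stand-in for page.text; the real page holds many more entries.
SAMPLE_HTML = '''<html><body>
<span class="charts-list-song"></span>
<script>var chart = [{"rankings": {"this_period": 1}, "artist_name": "告五人", "song_name": "又到天黑"}];</script>
</body></html>'''

for row in format_rows(extract_chart(SAMPLE_HTML)):
    print(row)
```

With the live site you would pass `page.text` from the `requests.get` call instead of `SAMPLE_HTML`. If the variable is missing, `extract_chart` returns an empty list rather than raising on `m.group(1)`.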