Find the attribute value of a tag using Beautiful Soup
Question
You can extract the slug value from the data-gs-ta-val attribute using Beautiful Soup in Python like this:
from ast import literal_eval
from bs4 import BeautifulSoup

html = '''
Your HTML content here
'''
soup = BeautifulSoup(html, 'html.parser')
elements = soup.find_all('li', class_='gs_ta_choice')
slug_values = []
for element in elements:
    # The attribute holds a Python-style dict literal; literal_eval parses it
    # safely, so neither eval nor quote replacement is needed
    data_gs_ta_val = literal_eval(element['data-gs-ta-val'])
    slug_values.append(data_gs_ta_val.get('slug', ''))
print(slug_values)
This code extracts the slug value from the data-gs-ta-val attribute of each li element and stores the values in the slug_values list.
<li class="gs_ta_choice" data-value="Bangalore" data-gs-ta-val="{'text':'Bangalore','value':'Bangalore','CID':'105','id':'105','P':'1','slug':'bangaluru'}" style="line-height: initial;"> Bangalore</li>
<li class="gs_ta_choice" data-value="Bangalore" data-gs-ta-val="{'text':'Chennai','value':'Chennai','CID':'106','id':'106','P':'2','slug':'madras'}" style="line-height: initial;"> Chennai</li>
<li class="gs_ta_choice" data-value="Bangalore" data-gs-ta-val="{'text':'Mumbai','value':'Mumbai','CID':'108','id':'108','P':'3','slug':'bombay'}" style="line-height: initial;"> Mumbai</li>
I want to extract the slug value from the data-gs-ta-val attribute of every element using Beautiful Soup in Python.
Answer 1
Score: 2
You don't say how you're getting the HTML fragment, so I'll assume you have it as a string.
The data-gs-ta-val attribute is interesting because its value looks like the string representation of a Python dictionary, which ast.literal_eval can parse directly.
Therefore:
from bs4 import BeautifulSoup as BS
from ast import literal_eval
html = """
<!DOCTYPE html>
<html>
<body>
<li class="gs_ta_choice" data-value="Bangalore" data-gs-ta-val="{'text':'Bangalore','value':'Bangalore','CID':'105','id':'105','P':'1','slug':'bangaluru'}" style="line-height: initial;"> Bangalore</li>
<li class="gs_ta_choice" data-value="Bangalore" data-gs-ta-val="{'text':'Chennai','value':'Chennai','CID':'106','id':'106','P':'2','slug':'madras'}" style="line-height: initial;"> Chennai</li>
<li class="gs_ta_choice" data-value="Bangalore" data-gs-ta-val="{'text':'Mumbai','value':'Mumbai','CID':'108','id':'108','P':'3','slug':'bombay'}" style="line-height: initial;"> Mumbai</li>
</body>
</html>
"""
soup = BS(html, 'lxml')
for li in soup.find_all('li', class_='gs_ta_choice'):
    d = literal_eval(li['data-gs-ta-val'])
    print(d.get('slug', 'No slug here'))
Output:
bangaluru
madras
bombay
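If some li elements might lack the attribute or carry a malformed value, the loop above would raise an exception. Below is a minimal sketch of a more defensive variant. The helper name extract_slugs and the shortened sample markup are illustrative assumptions, not part of the original question, and it uses the built-in 'html.parser' so that lxml is not required.

```python
from ast import literal_eval
from bs4 import BeautifulSoup

# Shortened, hypothetical sample: one element has no attribute, one is malformed
html = """
<ul>
<li class="gs_ta_choice" data-gs-ta-val="{'text':'Bangalore','slug':'bangaluru'}">Bangalore</li>
<li class="gs_ta_choice">No attribute here</li>
<li class="gs_ta_choice" data-gs-ta-val="not a dict">Broken</li>
<li class="gs_ta_choice" data-gs-ta-val="{'text':'Mumbai','slug':'bombay'}">Mumbai</li>
</ul>
"""

def extract_slugs(markup):
    """Collect slug values, skipping elements that cannot be parsed."""
    soup = BeautifulSoup(markup, 'html.parser')
    slugs = []
    for li in soup.find_all('li', class_='gs_ta_choice'):
        raw = li.get('data-gs-ta-val')  # returns None when the attribute is missing
        if raw is None:
            continue
        try:
            data = literal_eval(raw)
        except (ValueError, SyntaxError):
            continue  # the value is not a valid Python literal
        if isinstance(data, dict) and 'slug' in data:
            slugs.append(data['slug'])
    return slugs

print(extract_slugs(html))
```

Whether to skip bad elements silently, as here, or to log them is a design choice that depends on how much you trust the page.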
Answer 2
Score: 1
from bs4 import BeautifulSoup
import json
html_doc = """
<li class="gs_ta_choice" data-value="Bangalore" data-gs-ta-val="{'text':'Bangalore','value':'Bangalore','CID':'105','id':'105','P':'1','slug':'bangaluru'}" style="line-height: initial;"> Bangalore</li>
<li class="gs_ta_choice" data-value="Bangalore" data-gs-ta-val="{'text':'Chennai','value':'Chennai','CID':'106','id':'106','P':'2','slug':'madras'}" style="line-height: initial;"> Chennai</li>
<li class="gs_ta_choice" data-value="Bangalore" data-gs-ta-val="{'text':'Mumbai','value':'Mumbai','CID':'108','id':'108','P':'3','slug':'bombay'}" style="line-height: initial;"> Mumbai</li>
"""
soup = BeautifulSoup(html_doc, 'html.parser')
for li in soup.find_all('li'):
    data = li.attrs['data-gs-ta-val'].replace("'", '"')
    data = json.loads(data)
    #print(data)
    print(data['slug'])
This prints the slug value of each li element, which is what you want.
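One caveat worth noting: swapping single quotes for double quotes works for the sample data, but it can break if a value itself contains an apostrophe. The sketch below compares the two parsing approaches on plain strings. The function names and the "O'Fallon" value are hypothetical, made up only to illustrate the failure case.

```python
import json
from ast import literal_eval

def slug_via_json(raw):
    """Quote-swapping approach: only safe when no value contains a quote."""
    return json.loads(raw.replace("'", '"'))['slug']

def slug_via_literal_eval(raw):
    """Parse the Python-literal syntax directly, with no quote rewriting."""
    return literal_eval(raw)['slug']

plain = "{'text':'Mumbai','slug':'bombay'}"
tricky = """{'text':"O'Fallon",'slug':'ofallon'}"""  # hypothetical value with an apostrophe

print(slug_via_json(plain))
print(slug_via_literal_eval(plain))
print(slug_via_literal_eval(tricky))

try:
    slug_via_json(tricky)
except json.JSONDecodeError as exc:
    # The apostrophe became a stray double quote, so the JSON is invalid
    print('json approach failed:', exc.msg)
```

If the attribute is known to hold real JSON, json.loads alone is the better fit; for Python-style literals like the ones in the question, literal_eval avoids the rewriting step.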