Get the value of login_token/login_id from jQuery using Python requests from an asp.net URL
Question
How can I get the values of login_token and login_id before submitting the form, using Python requests?
<script>
jQuery(document).ready(
    function($) {
        $("#ibox_form").css("display", "block");
        $(".noscript").css("display", "none");
        try {
            var fpPromise = import("/r/build/js/tii/1dc0524e24cc01f176e3cec8bd0af1e1cb_gb_fp.js").then(FingerprintJS => FingerprintJS.load());
            fpPromise.then(fp => fp.get()).then(result => {
                $("form[name='FormName']").append('<input name="browser_fp" type="hidden" value="'+result.visitorId+'" />');
            });
        } catch (e) {
            console.error(e);
        }
        $("form[name='FormName'] input[name='email']").focus();
        $("form[name='FormName']").submit(function(event) {
            if ($("input[name='login_id']").length !== 1 && $("input[name='login_token']").length !== 1) {
                $("form[name='FormName']").append('<input name="login_id" type="hidden" value="150EA3F6-FA3F-11ED-A6EB-E4CE65535679" />');
                $("form[name='FormName']").append('<input name="login_token" type="hidden" value="6a83d1773c3256d1a5d26dc597c68975e27ea46e" />');
            }
            var recaptcha = document.getElementById("g-recaptcha-response");
            if (recaptcha && recaptcha.value == "") {
                var formIsValid = IP.control.AutoValidator.getFormValidator(document.FormName).isValid();
                if (formIsValid) {
                    alert("You must check the box that proves you're not a robot.");
                    event.preventDefault();
                    return false;
                }
            } else if (localStorage) {
                localStorage.setItem("login.start", new Date().getTime().toString());
            }
        });
    }
);
</script>
The URL is an asp.net page: "https://www.turnitin.com/login_page.asp?lang=en_us"
This is my Python code:
import requests
from bs4 import BeautifulSoup

# username and password are defined elsewhere
with requests.Session() as s:
    data = {
        'email': username,
        'user_password': password,
    }
    s.headers['User-Agent'] = 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/113.0.0.0 Safari/537.36'
    url = "https://www.turnitin.com/login_page.asp?lang=en_us"
    res = s.post(url, data=data)
    html_text = res.text
    soup = BeautifulSoup(html_text, "lxml")
    # Prints None: the input only exists inside the <script> text, not as a real element
    print(soup.find('input', {'name': 'login_token'}))
I am currently using Beautiful Soup for this, but I can't get the values of login_id and login_token. By the way, the URL is an asp.net page.
Answer 1
Score: 1
Although you can target and extract the <script> tag with bs4, you might not be able to get the values out of it, because they are embedded in jQuery code rather than in the HTML itself.
However, if all you need are those values, you can use re.
For example, based on your sample_html, try this:
import re
login_id = re.compile(r"(?i)[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}")
login_token = re.compile(r"[0-9a-f]{40}")
print(login_id.search(sample_html).group())
print(login_token.search(sample_html).group())
This should print:
150EA3F6-FA3F-11ED-A6EB-E4CE65535679
6a83d1773c3256d1a5d26dc597c68975e27ea46e
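The two patterns above match any UUID-shaped string and any run of 40 lowercase hex characters, so they could pick up an unrelated value if the page happens to contain one. A slightly stricter sketch anchors each pattern on the field name. It assumes the script keeps the same name="..." type="hidden" value="..." attribute order shown in the question. The helper name extract_hidden_value is hypothetical, not part of any library.

```python
import re


def extract_hidden_value(html, field_name):
    """Return the value of the hidden input named field_name, or None.

    Assumes the markup looks like the script in the question:
    <input name="FIELD" type="hidden" value="VALUE" />
    """
    pattern = re.compile(
        r'name="' + re.escape(field_name) + r'"\s+type="hidden"\s+value="([^"]*)"'
    )
    match = pattern.search(html)
    return match.group(1) if match else None


# Fragment copied from the script in the question.
sample_html = r"""
$("form[name='FormName']").append('<input name="login_id" type="hidden" value="150EA3F6-FA3F-11ED-A6EB-E4CE65535679" />');
$("form[name='FormName']").append('<input name="login_token" type="hidden" value="6a83d1773c3256d1a5d26dc597c68975e27ea46e" />');
"""

print(extract_hidden_value(sample_html, "login_id"))
print(extract_hidden_value(sample_html, "login_token"))
```

Returning None instead of raising lets the caller decide what to do when the page layout changes and a field is no longer found.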
Putting it all together:
import re
import requests

# username and password are defined elsewhere
with requests.Session() as s:
    data = {
        'email': username,
        'user_password': password,
    }
    s.headers['User-Agent'] = (
        'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) '
        'AppleWebKit/537.36 (KHTML, like Gecko) '
        'Chrome/113.0.0.0 Safari/537.36'
    )
    url = "https://www.turnitin.com/login_page.asp?lang=en_us"
    res = s.post(url, data=data)
    # Search the raw response text, since the values live inside the <script> block
    login_id = re.compile(r"(?i)[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}")
    login_token = re.compile(r"[0-9a-f]{40}")
    print(login_id.search(res.text).group())
    print(login_token.search(res.text).group())
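Since the question asks for the values before submitting, one possible variation is to GET the login page first, extract the two values, and only then build the POST data. The sketch below is untested against the real site. It assumes that the server embeds login_id and login_token in the GET response and accepts them as ordinary form fields. The function name build_login_payload is hypothetical. It also does not handle the browser_fp field or the reCAPTCHA check that the script in the question mentions.

```python
import re

LOGIN_URL = "https://www.turnitin.com/login_page.asp?lang=en_us"

# Same patterns as in the answer above.
LOGIN_ID_RE = re.compile(
    r"(?i)[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}"
)
LOGIN_TOKEN_RE = re.compile(r"[0-9a-f]{40}")


def build_login_payload(session, url, username, password):
    """GET the login page, then build the form data for a later POST.

    `session` is expected to behave like requests.Session.
    Assumption: the GET response contains both values in its script block.
    """
    page = session.get(url)
    page.raise_for_status()
    login_id = LOGIN_ID_RE.search(page.text)
    login_token = LOGIN_TOKEN_RE.search(page.text)
    if login_id is None or login_token is None:
        raise ValueError("login_id/login_token not found in the page")
    return {
        "email": username,
        "user_password": password,
        "login_id": login_id.group(),
        "login_token": login_token.group(),
    }


# Hypothetical usage (not run here; needs the requests package and network access):
# import requests
# with requests.Session() as s:
#     payload = build_login_payload(s, LOGIN_URL, username, password)
#     res = s.post(LOGIN_URL, data=payload)
```

Using the same session for the GET and the POST keeps any cookies set by the first request, which a server may tie to the token.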