May 25, 2023 00:07:33

Get the value of login_token/login_id from jQuery using Python requests from an ASP.NET URL

Question

How can I get the values of login_token and login_id before submitting the form, using Python requests?

<script>
    jQuery(document).ready(
        function($) {
            $("#ibox_form").css("display", "block");
            $(".noscript").css("display", "none");

            try {
                var fpPromise = import("/r/build/js/tii/1dc0524e24cc01f176e3cec8bd0af1e1cb_gb_fp.js").then(FingerprintJS => FingerprintJS.load());
                fpPromise.then(fp => fp.get()).then(result => {
                    $("form[name='FormName']").append('<input name="browser_fp" type="hidden" value="'+result.visitorId+'" />');
                });
            } catch (e) {
                console.error(e);
            }

            $("form[name='FormName'] input[name='email']").focus();
            $("form[name='FormName']").submit(function(event) {

                if ($("input[name='login_id']").length !== 1 && $("input[name='login_token']").length !== 1) {
                    $("form[name='FormName']").append('<input name="login_id" type="hidden" value="150EA3F6-FA3F-11ED-A6EB-E4CE65535679" />');
                    $("form[name='FormName']").append('<input name="login_token" type="hidden" value="6a83d1773c3256d1a5d26dc597c68975e27ea46e" />');
                }

                var recaptcha = document.getElementById("g-recaptcha-response");
                if (recaptcha && recaptcha.value == "") {
                    var formIsValid = IP.control.AutoValidator.getFormValidator(document.FormName).isValid();
                    if (formIsValid) {
                        alert("You must check the box that proves you're not a robot.");
                        event.preventDefault();
                        return false;
                    }
                } else if (localStorage) {
                    localStorage.setItem("login.start", new Date().getTime().toString());
                }
            });
        }
    );
</script>

The URL is an ASP.NET page: "https://www.turnitin.com/login_page.asp?lang=en_us"

This is my Python code:

import requests
from bs4 import BeautifulSoup

# username and password are defined elsewhere
with requests.Session() as s:
    data = {
        'email': username,
        'user_password': password,
    }
    s.headers['User-Agent'] = 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/113.0.0.0 Safari/537.36'
    url = "https://www.turnitin.com/login_page.asp?lang=en_us"
    res = s.post(url, data=data)
    html_text = res.text
    soup = BeautifulSoup(html_text, "lxml")
    # Prints None: the input is not present in the returned markup
    print(soup.find('input', {'name': 'login_token'}))

I am currently using Beautiful Soup for this, but I can't get the values of login_id and login_token. By the way, the URL is served by ASP.NET.

Answer 1

Score: 1

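Although you can locate and extract the <script> tag with bs4, you won't be able to read the values from it as elements: the two inputs exist only as strings inside the jQuery code, which appends them to the form when it is submitted.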
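To illustrate the point, here is a minimal sketch that uses the standard library's html.parser as a stand-in for bs4 (so it runs without extra packages). SAMPLE_HTML is a trimmed, hypothetical version of the page, and InputCollector is a helper name made up for this example. The parser reports the real email input but never sees login_token as an element, because an HTML parser treats everything inside <script> as plain text.

```python
from html.parser import HTMLParser

# Trimmed, hypothetical stand-in for the login page: the hidden input exists
# only as a string literal inside the <script> element.
SAMPLE_HTML = """
<form name="FormName"><input name="email" type="text" /></form>
<script>
    $("form[name='FormName']").append('<input name="login_token" type="hidden" value="6a83d1773c3256d1a5d26dc597c68975e27ea46e" />');
</script>
"""


class InputCollector(HTMLParser):
    """Record the names of real <input> elements and the raw script text."""

    def __init__(self):
        super().__init__()
        self.input_names = []
        self.script_chunks = []
        self._in_script = False

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self._in_script = True
        elif tag == "input":
            self.input_names.append(dict(attrs).get("name"))

    def handle_endtag(self, tag):
        if tag == "script":
            self._in_script = False

    def handle_data(self, data):
        # Script content arrives here as text, never as start tags
        if self._in_script:
            self.script_chunks.append(data)


collector = InputCollector()
collector.feed(SAMPLE_HTML)
collector.close()
script_text = "".join(collector.script_chunks)

print(collector.input_names)         # inputs that exist in the markup itself
print("login_token" in script_text)  # whether the token appears in the script text
```

The same reasoning applies to soup.find('input', {'name': 'login_token'}) in the question: there is no such element until a browser runs the script.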

However, if all you need are those two values, you can pull them out of the page source with the re module.

For example, based on your sample_html, try this:

import re

login_id = re.compile(r"(?i)[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}")
login_token = re.compile(r"[0-9a-f]{40}")

print(login_id.search(sample_html).group())
print(login_token.search(sample_html).group())

This should print:

150EA3F6-FA3F-11ED-A6EB-E4CE65535679
6a83d1773c3256d1a5d26dc597c68975e27ea46e
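The two patterns above match any UUID-shaped or 40-character hex string in the page, so they could pick up an unrelated value if the markup changes. A slightly stricter variant, sketched below, anchors the match on the input's name attribute. The helper name extract_hidden_value is made up for this example, and it assumes the attributes keep the order and double quotes shown in the question (name before value).

```python
import re

# Two lines copied from the script in the question.
sample_html = """
$("form[name='FormName']").append('<input name="login_id" type="hidden" value="150EA3F6-FA3F-11ED-A6EB-E4CE65535679" />');
$("form[name='FormName']").append('<input name="login_token" type="hidden" value="6a83d1773c3256d1a5d26dc597c68975e27ea46e" />');
"""


def extract_hidden_value(html, field_name):
    """Return the value that follows name="<field_name>", or None if absent."""
    pattern = re.compile(
        r'name="' + re.escape(field_name) + r'"[^>]*?value="([^"]*)"'
    )
    match = pattern.search(html)
    return match.group(1) if match else None


print(extract_hidden_value(sample_html, "login_id"))
print(extract_hidden_value(sample_html, "login_token"))
```

Returning None instead of calling .group() on a failed search also avoids an AttributeError when the field is missing from the page.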

Putting it all together:

import re

import requests

with requests.Session() as s:
    data = {
        'email': username,
        'user_password': password,
    }
    s.headers['User-Agent'] = 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) ' \
                              'AppleWebKit/537.36 (KHTML, like Gecko) ' \
                              'Chrome/113.0.0.0 Safari/537.36'
    url = "https://www.turnitin.com/login_page.asp?lang=en_us"
    res = s.post(url, data=data)

login_id = re.compile(r"(?i)[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}")
login_token = re.compile(r"[0-9a-f]{40}")

print(login_id.search(res.text).group())
print(login_token.search(res.text).group())
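Since the question asks for the values before submitting, one possible flow is to GET the login page first, extract the two values, and only then POST them together with the credentials. The sketch below is an assumption-laden outline, not a confirmed working login: build_login_payload and log_in are hypothetical helper names, and it assumes the form posts back to the same URL, as the code above does. The session argument is expected to be a requests.Session.

```python
import re

LOGIN_URL = "https://www.turnitin.com/login_page.asp?lang=en_us"
LOGIN_ID_RE = re.compile(
    r"(?i)[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}"
)
LOGIN_TOKEN_RE = re.compile(r"[0-9a-f]{40}")


def build_login_payload(page_html, username, password):
    """Combine the credentials with the two values found in the page source."""
    id_match = LOGIN_ID_RE.search(page_html)
    token_match = LOGIN_TOKEN_RE.search(page_html)
    if id_match is None or token_match is None:
        raise ValueError("login_id or login_token not found in the page source")
    return {
        "email": username,
        "user_password": password,
        "login_id": id_match.group(),
        "login_token": token_match.group(),
    }


def log_in(session, username, password):
    """GET the login page, then POST the form with the extracted values."""
    page = session.get(LOGIN_URL)
    payload = build_login_payload(page.text, username, password)
    return session.post(LOGIN_URL, data=payload)
```

It could be called as log_in(s, username, password) inside the with requests.Session() as s: block. Note that the script in the question also appends a browser_fp field and checks a reCAPTCHA response, neither of which this sketch handles, so the server may still reject the request.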
