How can I use Python to check whether a site is built with WordPress?

Question

I need to check several sites with Python to see whether they are built with WordPress. How can I do this? I use requests and BeautifulSoup4.

I tried checking whether the page contains "wp-content" or "wordpress", but nothing works.

Answer 1

Score: 1

A more reliable approach is to parse the site's HTML and look for markers that WordPress installations commonly emit. Typical indicators are the generator meta tag, stylesheet links served from /wp-content/, and scripts served from /wp-includes/, all of which WordPress core, themes, and plugins tend to add.

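import re

import requests
from bs4 import BeautifulSoup

def check_wordpress(url):
    # A timeout keeps one slow site from stalling the whole check
    response = requests.get(url, timeout=10)
    soup = BeautifulSoup(response.text, 'html.parser')

    # The generator tag usually carries a version (e.g. "WordPress 6.2"),
    # so match a substring instead of the exact string 'WordPress'
    if soup.find('meta', attrs={'name': 'generator', 'content': re.compile('WordPress', re.I)}) is not None:
        return True

    # Theme and plugin stylesheets have URLs that contain /wp-content/
    if soup.find('link', attrs={'rel': 'stylesheet', 'href': re.compile('/wp-content/')}) is not None:
        return True

    # Core scripts have URLs that contain /wp-includes/
    if soup.find('script', attrs={'src': re.compile('/wp-includes/')}) is not None:
        return True

    # If none of the above conditions match, it is likely not a WordPress site
    return False

# Example usage
url = 'https://example.com'
is_wordpress = check_wordpress(url)
print(f"The website at {url} is built with WordPress: {is_wordpress}")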
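
To see how these markers match without touching the network, the sketch below runs the same three checks on a saved HTML string. It uses only the standard library's `html.parser` instead of BeautifulSoup, and `WordPressMarkerParser`, `find_wordpress_markers`, and `SAMPLE_HTML` are hypothetical names made up for illustration, not part of the answer above. It reports which markers were found rather than a single boolean, which may help when a check unexpectedly fails.

```python
from html.parser import HTMLParser


class WordPressMarkerParser(HTMLParser):
    """Collects WordPress hints from start tags (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.markers = []

    def handle_starttag(self, tag, attrs):
        # attrs arrives as a list of (name, value) pairs; value can be None
        attrs = dict(attrs)
        if tag == "meta" and (attrs.get("name") or "").lower() == "generator":
            # Substring match, because the content is usually "WordPress <version>"
            if "wordpress" in (attrs.get("content") or "").lower():
                self.markers.append("generator")
        elif tag == "link" and "/wp-content/" in (attrs.get("href") or ""):
            self.markers.append("wp-content link")
        elif tag == "script" and "/wp-includes/" in (attrs.get("src") or ""):
            self.markers.append("wp-includes script")


def find_wordpress_markers(html):
    """Return the list of WordPress markers seen in an HTML string."""
    parser = WordPressMarkerParser()
    parser.feed(html)
    return parser.markers


# Made-up sample page used only for this illustration
SAMPLE_HTML = """
<html><head>
<meta name="generator" content="WordPress 6.2">
<link rel="stylesheet" href="https://example.com/wp-content/themes/demo/style.css">
</head><body></body></html>
"""

print(find_wordpress_markers(SAMPLE_HTML))
```

An empty list means none of the three markers appeared, which is only weak evidence: a site can hide the generator tag or serve its assets from a different path.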

Answer 2

Score: 1

You can examine the site's HTML and look for patterns that commonly appear in WordPress sites. First, install the requests package for Python:

pip install requests

Then use the following code to check whether the HTML contains keywords that are typical of WordPress sites:

import re

import requests

def is_wordpress_site(url):
    # A timeout keeps one unresponsive site from blocking the script
    response = requests.get(url, timeout=10)
    html_code = response.text

    # Check for common WordPress patterns in the HTML code
    patterns = [
        r'wp-content',
        r'wp-admin',
        r'wp-includes',
        r'wordpress/',
    ]

    # Any single match is treated as evidence of WordPress
    for pattern in patterns:
        if re.search(pattern, html_code, re.IGNORECASE):
            return True

    return False

def is_wordpress_site_wrapper(url):
    # Print a human-readable verdict for one site
    if is_wordpress_site(url):
        print(f'The website {url} is built using WordPress.')
    else:
        print(f'The website {url} is NOT built using WordPress.')


if __name__ == '__main__':
    web1_url = 'https://time.com'   # Built using WordPress
    web2_url = 'https://stackoverflow.com' # Not built using WordPress
    is_wordpress_site_wrapper(web1_url)
    is_wordpress_site_wrapper(web2_url)
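
Because the question asks about several sites, it may help to separate fetching from detection and to survive a site that is down. The following is a minimal sketch under assumptions: `check_sites`, `fetch_html`, and the `fetch` parameter are hypothetical names, the `User-Agent` value is an arbitrary placeholder, and the standard library's `urllib.request` is swapped in for `requests` only so the example has no third-party dependency. A site that cannot be fetched is reported as `None` rather than as "not WordPress".

```python
import re
import urllib.request

# Same keywords as the answer above, combined into one case-insensitive pattern
WORDPRESS_PATTERN = re.compile(r"wp-content|wp-admin|wp-includes|wordpress/", re.IGNORECASE)


def fetch_html(url, timeout=10):
    """Download a page with the standard library (stand-in for requests.get)."""
    # Some servers reject clients without a browser-like User-Agent (assumed value)
    request = urllib.request.Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


def check_sites(urls, fetch=fetch_html):
    """Map each URL to True, False, or None when the page could not be fetched."""
    results = {}
    for url in urls:
        try:
            html = fetch(url)
        except (OSError, ValueError):
            # Most network failures surface as OSError subclasses;
            # ValueError covers malformed URLs
            results[url] = None
            continue
        results[url] = bool(WORDPRESS_PATTERN.search(html))
    return results
```

Calling `check_sites(['https://time.com', 'https://stackoverflow.com'])` returns a dictionary keyed by URL. Passing your own `fetch` function, for example one that returns `requests.get(url, timeout=10).text`, leaves the detection logic unchanged.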

huangapple
  • Posted on June 11, 2023 at 20:44:41
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76450530.html