How can I use Python to check if a site is built with WordPress?

Question

I need to check several sites with Python to see whether they are built with WordPress. How can I do this? I am using requests and BeautifulSoup4.

I tried checking whether the page contains "wp-content" or "wordpress", but nothing works.

Answer 1

Score: 1

A more reliable approach is to parse the site's HTML and look for markers that WordPress installations commonly emit. Typical indicators are the generator meta tag and stylesheet or script URLs that point into the /wp-content/ and /wp-includes/ directories used by WordPress themes and plugins.

    import re

    import requests
    from bs4 import BeautifulSoup


    def check_wordpress(url):
        # A timeout keeps one unresponsive site from hanging the whole run.
        response = requests.get(url, timeout=10)
        soup = BeautifulSoup(response.text, 'html.parser')

        # The generator content includes a version ("WordPress 6.x"), so match
        # the prefix with a regex; a plain string would require an exact match.
        if soup.find('meta', attrs={'name': 'generator', 'content': re.compile(r'^WordPress', re.I)}) is not None:
            return True
        # Any stylesheet served from the /wp-content/ directory.
        if soup.find('link', attrs={'rel': 'stylesheet', 'href': re.compile(r'/wp-content/')}) is not None:
            return True
        # Any script served from the /wp-includes/ directory.
        if soup.find('script', attrs={'src': re.compile(r'/wp-includes/')}) is not None:
            return True

        # If none of the above conditions match, it is likely not a WordPress site
        return False


    # Example usage
    url = 'https://example.com'
    is_wordpress = check_wordpress(url)
    print(f"The website at {url} is built with WordPress: {is_wordpress}")
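
The generator check in this answer can also report what the tag actually says. The sketch below is not part of the original answer: it uses only the standard library's `html.parser` instead of BeautifulSoup so that it runs without extra packages, and the names `GeneratorParser` and `wordpress_generator` are hypothetical. It assumes the site still emits a `<meta name="generator">` tag; many sites strip it, so a `None` result does not prove the site is not WordPress.

```python
from html.parser import HTMLParser


class GeneratorParser(HTMLParser):
    """Collects the content of every <meta name="generator"> tag."""

    def __init__(self):
        super().__init__()
        self.generators = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        # Attribute values can be None for valueless attributes.
        name = attributes.get("name") or ""
        if name.lower() == "generator":
            self.generators.append(attributes.get("content") or "")


def wordpress_generator(html):
    """Return the generator string if it names WordPress, else None."""
    parser = GeneratorParser()
    parser.feed(html)
    for value in parser.generators:
        if value.lower().startswith("wordpress"):
            return value
    return None


# Hand-written sample HTML (an assumption, not taken from a real site).
sample = '<html><head><meta name="generator" content="WordPress 6.2.2"></head></html>'
print(wordpress_generator(sample))  # prints: WordPress 6.2.2
```

To use it on a live page, pass `response.text` from `requests.get(url, timeout=10)` as the `html` argument.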

Answer 2

Score: 1

You can examine the website's HTML and look for patterns that are commonly present in WordPress sites. First, install the requests package for Python:

    pip install requests

Then use the following code to check whether the site's HTML contains keywords typical of WordPress sites:

    import re

    import requests


    def is_wordpress_site(url):
        # A timeout keeps one unresponsive site from hanging the script.
        response = requests.get(url, timeout=10)
        html_code = response.text

        # Check for common WordPress patterns in the HTML code
        patterns = [
            r'wp-content',
            r'wp-admin',
            r'wp-includes',
            r'wordpress/',
        ]
        for pattern in patterns:
            if re.search(pattern, html_code, re.IGNORECASE):
                return True
        return False


    def is_wordpress_site_wrapper(url):
        if is_wordpress_site(url):
            print(f'The website {url} is built using WordPress.')
        else:
            print(f'The website {url} is NOT built using WordPress.')


    if __name__ == '__main__':
        web1_url = 'https://time.com'  # Built using WordPress
        web2_url = 'https://stackoverflow.com'  # Not built using WordPress
        is_wordpress_site_wrapper(web1_url)
        is_wordpress_site_wrapper(web2_url)
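
Since the question is about checking several sites, it can help to see which patterns matched instead of a bare True/False. The following is a minimal sketch, not the answerer's code: `matched_markers` and `classify_pages` are hypothetical names, and the pages are passed in as already-downloaded HTML strings (for example `response.text` from `requests.get`) so the matching logic can be tried without network access. Keep in mind that a plain substring match can give false positives, for instance on a page that merely links to a WordPress site.

```python
import re

# Same path fragments as in the answer above, compiled once.
MARKERS = {
    "wp-content": re.compile(r"wp-content", re.IGNORECASE),
    "wp-admin": re.compile(r"wp-admin", re.IGNORECASE),
    "wp-includes": re.compile(r"wp-includes", re.IGNORECASE),
    "wordpress/": re.compile(r"wordpress/", re.IGNORECASE),
}


def matched_markers(html):
    """Return the names of all markers found in the HTML, in dict order."""
    return [name for name, pattern in MARKERS.items() if pattern.search(html)]


def classify_pages(pages):
    """Map each URL to the list of markers found in its HTML."""
    return {url: matched_markers(html) for url, html in pages.items()}


# Hypothetical, hand-written HTML snippets standing in for downloaded pages.
pages = {
    "https://blog.example": (
        '<link rel="stylesheet" href="/wp-content/themes/x/style.css">'
        '<script src="/wp-includes/js/jquery.js"></script>'
    ),
    "https://static.example": "<html><body>Hello</body></html>",
}

for url, markers in classify_pages(pages).items():
    verdict = "likely WordPress" if markers else "no WordPress markers found"
    print(f"{url}: {verdict} {markers}")
```

Returning the matched names makes it easier to judge borderline cases by hand, for example a page where only `wordpress/` matched.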

  • Posted by huangapple on June 11, 2023, 20:44:41
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76450530.html