Where did I fail in the Wordle cracker I wrote in Python?

Question

I wrote a program that helps me crack Wordle puzzles. However, I suspect it is biased toward alphabetical order because of the site I scraped the word list from. When "audio" has no matching letters, my program fails and doesn't suggest any word (even though "jewel" is in its word list). How can I fix it?

I used this piece of code to get all the words Wordle uses and to store each word together with a list of its letters.

import requests
from bs4 import BeautifulSoup

webpage = 'https://www.wordunscrambler.net/word-list/wordle-word-list'
webpage_response = requests.get(webpage)
content = webpage_response.content
soup = BeautifulSoup(content, 'html.parser')

words = {}
for a in soup.find(class_='content').find_all('a'):
    text = a.get_text()
    if len(text) == 5:
        words[text] = [letter for letter in text]

with open('wordle_words.py', 'w') as file:
    file.write('words = ' + str(words))

I won't share the output, since the list is big. I also wrote an algorithm for determining which word is the best opener, but I won't share it because it's not relevant here.

This piece of code helps the user play Wordle.

from wordle_words import words

starting_word = 'audio'

print('Start by entering the word "audio".')

wrong_letters = set()
matching_letters = set()

for i in range(5):
    
    attempted_word = input('What word did you try? ')
    feedback = input('What were the matching letters? Uppercase for green letters, lowercase for yellow letters and underline for gray ones. "A_Do_": ')

    wrong_letters.update(set([letter for letter in attempted_word.lower() if letter not in feedback.lower()]))
    matching_letters.update(set([letter.lower() for letter in feedback if letter != '_']))
    matching_words = {word: letters for word, letters in words.items()
                      if matching_letters.issubset(set(letters)) and wrong_letters.isdisjoint(set(letters))}

    best_fit = ''
    best_match = 0

    for word, letters in matching_words.items():
        matches = 0
        
        for index in range(5):
            if feedback[index].isupper():
                if letters[index] == feedback[index].lower():
                    matches += 1
                    
            if feedback[index].islower():
                if letters[index] != feedback[index]:
                    matches += 1
                    
        if matches > best_match:
            best_fit = word
            best_match = matches

    print(best_fit + '\n')

I expected it to output "jewel" after I told it that "audio" had no matching letters.

Answer 1

Score: 3

There is a flaw in this part:

for index in range(5):
    if feedback[index].isupper():
        if letters[index] == feedback[index].lower():
            matches += 1
            
    if feedback[index].islower():
        if letters[index] != feedback[index]:
            matches += 1

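The problem is that an underscore is neither lowercase nor uppercase, so neither branch runs for a gray position. When every position is gray, matches stays at 0 for every word, the test matches > best_match is never true, and best_fit stays empty. You need an additional branch for missed letters that always does matches += 1 for that position.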

for index in range(5):
    if feedback[index].isupper():
        if letters[index] == feedback[index].lower():
            matches += 1
            
    if feedback[index].islower():
        if letters[index] != feedback[index]:
            matches += 1

    if feedback[index] == "_":
        matches += 1
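
To try the corrected loop in isolation, it can be wrapped in a small function and run on a hand-made word list. This is only a sketch: score_word, pick_best and the three-word candidate list are hypothetical names and data introduced for illustration, not part of the original program.

```python
def score_word(letters, feedback):
    """Count the positions of a candidate that agree with the feedback string."""
    matches = 0
    for index in range(5):
        mark = feedback[index]
        if mark == "_":
            # Gray position: always counts, so an all-gray row still scores.
            matches += 1
        elif mark.isupper():
            # Green: the candidate must have this letter at this position.
            if letters[index] == mark.lower():
                matches += 1
        elif mark.islower():
            # Yellow: the candidate must NOT have this letter at this position.
            if letters[index] != mark:
                matches += 1
    return matches


def pick_best(candidates, feedback):
    """Return the first candidate with the highest score, or '' if there is none."""
    best_fit, best_match = "", 0
    for word in candidates:
        matches = score_word(list(word), feedback)
        if matches > best_match:
            best_fit, best_match = word, matches
    return best_fit


# Hypothetical candidates that contain none of the letters a, u, d, i, o.
candidates = ["jewel", "spent", "lynch"]
print(pick_best(candidates, "_____"))  # prints "jewel"
```

With an all-gray row every candidate scores 5, and because the comparison is a strict >, the tie goes to the first word in the list. The suggestion therefore depends on the order of the word list.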

There is actually another flaw that your question doesn't mention. You keep the candidate with the highest matches value, but that candidate can still contradict the feedback.

For example, if you guessed "audio" and got "A_Do_", it might suggest "today". That would only be a problem if the actual word to guess is not in the word list (otherwise, that word would have the better score), so you may never see it happen if you are scraping the list correctly. Anyway, it's worth mentioning.
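
One way to rule out such suggestions, sketched here under assumptions rather than taken from the original program, is to compute the feedback each candidate would have produced for the guess and keep only the candidates that reproduce the row you actually got. The names feedback_for and consistent_words and the three-word list are hypothetical. The example uses the row "A_D_o", which is a possible result for the guess "audio".

```python
from collections import Counter


def feedback_for(guess, answer):
    """Build the row Wordle would show for `guess` if `answer` were the solution.

    Notation from the question: uppercase = green, lowercase = yellow, "_" = gray.
    """
    guess = guess.lower()
    marks = ["_"] * 5
    # Letters of the answer that were not consumed by a green match.
    remaining = Counter()
    for index in range(5):
        if guess[index] == answer[index]:
            marks[index] = guess[index].upper()
        else:
            remaining[answer[index]] += 1
    # A non-green letter turns yellow only while unmatched copies are left.
    for index in range(5):
        letter = guess[index]
        if marks[index] == "_" and remaining[letter] > 0:
            marks[index] = letter
            remaining[letter] -= 1
    return "".join(marks)


def consistent_words(candidates, guess, feedback):
    """Keep the candidates that would have produced exactly this feedback."""
    return [word for word in candidates if feedback_for(guess, word) == feedback]


# Hypothetical word list; "today" has the right letters in the wrong places.
candidates = ["today", "ardor", "jewel"]
print(consistent_words(candidates, "audio", "A_D_o"))  # prints ['ardor']
```

Filtering this way before scoring means a word such as "today" can no longer be suggested, because it would have produced "a_D_o" rather than "A_D_o".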

huangapple
  • Posted on July 17, 2023 at 22:31:55
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76705510.html