June 19, 2023 04:11:51

What would be a good way to write a function that returns a random chess game played by two masters as a PGN file?

Question

I am trying to create a function that returns a random master chess game as a .pgn file.

My approach was to download the Caissabase chess database, which is quite large and contains millions of chess games. My original plan was to simply read a random game from this file, like so:

import random

import chess.pgn

def extract_random_game(pgn_file, output_file):
    random_game = None
    num_games = 0

    with open(pgn_file) as pgn:
        while True:
            game = chess.pgn.read_game(pgn)
            if game is None:
                break
            num_games += 1
            # Keep the current game with probability 1/num_games.
            if random.randint(1, num_games) == 1:
                random_game = game

    if num_games == 0:
        print("No games found in the PGN file.")
        return

    with open(output_file, 'w') as new_pgn:
        new_pgn.write(str(random_game))

    print("Random game extracted and saved to:", output_file)

However, this seems to take a long time to run, and then it crashes, which makes it hard to debug.

I've also tried extracting a game by its index, which seems to work well for low index numbers, but if I try to extract something like game #1000000, it takes a while to run as well:

def extract_game_by_index(pgn_file, game_index, output_file):
    game_counter = 0
    target_game = None

    with open(pgn_file) as pgn:
        while True:
            game = chess.pgn.read_game(pgn)
            if game is None:
                break

            game_counter += 1
            if game_counter == game_index:
                target_game = game
                break

    if target_game is None:
        print(f"Game not found at index {game_index} in the PGN file.")
        return

    with open(output_file, 'w') as new_pgn:
        new_pgn.write(str(target_game))

    print(f"Game at index {game_index} extracted and saved to:", output_file)

Any ideas on ways to modify this code or a different approach that I could take?

Answer 1

Score: 0

Instead of reading the whole game, you can read just the game headers. According to the [documentation](https://python-chess.readthedocs.io/en/latest/pgn.html) (see `chess.pgn.read_headers`), this reduces the processing time for big files.

We can thus build a game picker that uses only the headers. As the documentation notes, you have to record the file offset before reading the headers so that you can seek back and read the full game later. This function returns the game at a given index (the first game has index 1):

```python
import random

import chess.pgn

def pick_game(pgn, index):
    offset = 0
    for _ in range(index):
        # Remember where this game starts before skipping over it.
        offset = pgn.tell()
        chess.pgn.read_headers(pgn)
    # Jump back to the start of the requested game and parse it in full.
    pgn.seek(offset)
    return chess.pgn.read_game(pgn)
```

Using the same idea, you can get the total number of games with:

```python
def num_games(pgn):
    num = 0
    # read_headers returns None once the end of the file is reached.
    while chess.pgn.read_headers(pgn) is not None:
        num += 1
    return num
```

Finally, pick the random game index once, up front, and then read only that game:

```python
with open(filepath_to_pgn) as pgn:
    total_num_games = num_games(pgn)
    random_game_index = random.randint(1, total_num_games)
    pgn.seek(0)  # rewind before the second pass
    game = pick_game(pgn, random_game_index)
```
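
If you need more than one random game, the two passes above are repeated on every call. A possible variation (a sketch, not something from the python-chess documentation) is to scan the file once, record the byte offset where each game starts, and then seek straight to a randomly chosen offset. The version below uses only the standard library and copies the raw PGN text without parsing it. It assumes every game begins with an `[Event "..."]` tag at the start of a line and that no comment inside the movetext has a line starting with `[Event `. The names `build_offset_index`, `read_raw_game` and `GAME_START` are made up for this example, and `extract_random_game` here takes a precomputed offset list, unlike the function in the question.

```python
import random
from pathlib import Path

# Assumption: each game starts with an [Event "..."] tag at the start of a line.
GAME_START = b"[Event "


def build_offset_index(pgn_path):
    """Scan the file once and return the byte offset of every game."""
    offsets = []
    position = 0
    with open(pgn_path, "rb") as pgn:
        for line in pgn:
            if line.startswith(GAME_START):
                offsets.append(position)
            # In binary mode len(line) is the exact number of bytes consumed.
            position += len(line)
    return offsets


def read_raw_game(pgn_path, offsets, game_number):
    """Return the raw bytes of one game (the first game is number 1)."""
    start = offsets[game_number - 1]
    with open(pgn_path, "rb") as pgn:
        pgn.seek(start)
        if game_number < len(offsets):
            # Read up to the start of the next game.
            return pgn.read(offsets[game_number] - start)
        # The last game runs to the end of the file.
        return pgn.read()


def extract_random_game(pgn_path, output_path, offsets, rng=random):
    """Copy one uniformly chosen game to output_path and return its number."""
    if not offsets:
        raise ValueError("no games found in the PGN file")
    game_number = rng.randint(1, len(offsets))
    raw_game = read_raw_game(pgn_path, offsets, game_number)
    Path(output_path).write_bytes(raw_game)
    return game_number
```

Call `build_offset_index` once (the resulting list could be cached on disk), then call `extract_random_game` as often as needed; each call does a single seek and a short read. If you want a `chess.pgn.Game` object rather than raw text, the bytes can be decoded with the file's encoding and passed to `chess.pgn.read_game` through an `io.StringIO`.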
