What would be a good way to write a function that returns a random chess game played by two masters as a PGN file?

Question

I am trying to create a function that returns a random master chess game as a .pgn file.

My approach was to download the caissabase chess database, which is quite large and contains millions of chess games. My original plan was simply to read a random game from this file, like so:

def extract_random_game(pgn_file, output_file):
    random_game = None
    num_games = 0

    with open(pgn_file) as pgn:
        while True:
            game = chess.pgn.read_game(pgn)
            if game is None:
                break
            num_games += 1
            if random.randint(1, num_games) == 1:
                random_game = game

    if num_games == 0:
        print("No games found in the PGN file.")
        return

    with open(output_file, 'w') as new_pgn:
        new_pgn.write(str(random_game))

    print("Random game extracted and saved to:", output_file)

However, this takes a long time to run and then crashes, which makes it hard to debug.

I've also tried extracting a game by its index, which seems to work well for low index numbers, but if I try to extract something like game #1000000, it takes a while to run as well:

def extract_game_by_index(pgn_file, game_index, output_file):
    game_counter = 0
    target_game = None

    with open(pgn_file) as pgn:
        while True:
            game = chess.pgn.read_game(pgn)
            if game is None:
                break

            game_counter += 1
            if game_counter == game_index:
                target_game = game
                break

    if target_game is None:
        print(f"Game not found at index {game_index} in the PGN file.")
        return

    with open(output_file, 'w') as new_pgn:
        new_pgn.write(str(target_game))

    print(f"Game at index {game_index} extracted and saved to:", output_file)

Any ideas on ways to modify this code or a different approach that I could take?

Answer 1

Score: 0

Instead of parsing every game in full, you can read only the game headers. According to the [documentation](https://python-chess.readthedocs.io/en/latest/pgn.html) (see `chess.pgn.read_headers`), this avoids the relatively expensive step of parsing the moves, which reduces the processing time for big files.

We can thus build a game picker that uses only headers. As the documentation shows, you have to record the file offset before reading each header so that you can seek back and read the full game later. This solution returns the game at a given index (the first game has index 1):

```python
import random

import chess.pgn


def pick_game(pgn, index):
    """Return game number `index` (1-based) from an open PGN file."""
    offset = 0
    for _ in range(index):
        offset = pgn.tell()          # remember where this game starts
        chess.pgn.read_headers(pgn)  # read headers only; the moves are not parsed
    pgn.seek(offset)                 # jump back to the start of the chosen game
    return chess.pgn.read_game(pgn)
```

Using the same idea, you can count the total number of games:

```python
def num_games(pgn):
    """Count the games in an open PGN file by reading headers only."""
    num = 0
    # read_headers returns None once the end of the file is reached
    while chess.pgn.read_headers(pgn) is not None:
        num += 1
    return num
```

Finally, pick the random game index once, up front, and fetch only that game:

```python
with open(filepath_to_pgn) as pgn:
    total_num_games = num_games(pgn)
random_game_index = random.randint(1, total_num_games)
with open(filepath_to_pgn) as pgn:
    game = pick_game(pgn, random_game_index)
```

To save the result, write `str(game)` to the output file as in the question.
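
If the goal is only to copy one game into a new .pgn file, a variant of the same offset idea can avoid parsing altogether. The sketch below is not part of python-chess and uses only the standard library. It assumes every game in the file begins with a line starting with `[Event `, as in PGN export format, and that no comment line inside a game starts that way. The names `index_game_offsets` and `extract_random_game` are made up for illustration.

```python
import random

UTF8_BOM = b"\xef\xbb\xbf"


def index_game_offsets(pgn_path):
    """Return the byte offset of every line that starts a game.

    Assumption: each game begins with an '[Event ' tag line.
    """
    offsets = []
    position = 0
    with open(pgn_path, "rb") as pgn:
        for line in pgn:
            # Tolerate a UTF-8 byte order mark at the very start of the file.
            if line.lstrip(UTF8_BOM).startswith(b"[Event "):
                offsets.append(position)
            position += len(line)  # binary mode, so lengths are byte counts
    return offsets


def extract_random_game(pgn_path, output_path, offsets=None, rng=random):
    """Copy one uniformly chosen game, as raw PGN text, to output_path.

    Returns the 1-based index of the chosen game.
    """
    if offsets is None:
        offsets = index_game_offsets(pgn_path)
    if not offsets:
        raise ValueError("no games found in " + pgn_path)

    i = rng.randrange(len(offsets))
    start = offsets[i]
    with open(pgn_path, "rb") as pgn:
        pgn.seek(start)
        if i + 1 < len(offsets):
            raw = pgn.read(offsets[i + 1] - start)  # up to the next game
        else:
            raw = pgn.read()                        # last game: read to EOF

    with open(output_path, "wb") as out:
        out.write(raw.lstrip(UTF8_BOM).strip() + b"\n")
    return i + 1
```

Building the index still takes one full pass over the file, but it only compares line prefixes. If you keep the returned list (or save it to disk), every later call such as `extract_random_game(path, "random.pgn", offsets)` needs just one seek and one read. The output file can still be loaded with `chess.pgn.read_game` if you need a parsed game.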

huangapple
  • Posted on June 19, 2023, 04:11:51
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76502366.html