如何在Python中生成10位唯一标识符?

huangapple go评论61阅读模式
英文:

How to generate 10 digit unique-id in python?

问题

我想在Python中生成一个10位的唯一标识符。我尝试了以下方法,但都没有成功

  • get_random_string(10) -> 它生成具有碰撞概率的随机字符串
  • str(uuid.uuid4())[:10] -> 由于我只取前缀,它也具有碰撞概率
  • 顺序 -> 我可以通过添加0来生成唯一的顺序ID,但它将是顺序的且更容易猜测。所以我想避免顺序ID

我们有没有适当的系统来生成10位唯一标识符?

英文:

I want to generate 10 digit unique-id in python. I have tried below methods but no-one worked

  • get_random_string(10) -> It generate random string which has probability of collision
  • str(uuid.uuid4())[:10] -> Since I am taking a prefix only, it also has probability of collision
  • sequential -> I can generate unique sequential ID by prepending 0s but it will be sequential and easier to guess. So I want to avoid sequential ids

Do we have any proper system to generate 10 digit unique-id?

答案1

得分: 1

取决于您需要多少个唯一标识符,您创建它们的速度,它们是否只需在您的计算机上唯一,还是需要在多台计算机上唯一,所需字符(仅限数字、字母数字混合等),您想要允许哪些字符... 生成唯一标识符并不是一项容易的任务。

一些想法:

  • 最简单的方法(当然不随机)是从 0000000000 开始递增,一直到 zzzzzzzzzz

  • 最安全的方法(但可能最慢)是存储生成的标识符,当您生成新的标识符时,检查它是否已经存在...

  • 基于当前时间戳 + 一些内部增量(对于在相同时间戳生成的标识符,取决于时间戳的分辨率) + 可选的机器配置(如果您需要它在多台机器之间是唯一的) + 一些随机性 + 一些打乱。

但是,您在标识符中引入的随机性越多,您对输出的控制就越小,因此您需要检查冲突。特别是如果生成的标识符数量大于可能的标识符数量时。

因此,最终有两种确保唯一性的方式(在特定范围内):

  1. 检查新创建的标识符以前是否已经发出。
  2. 由一个无冲突函数计算,并且您可以确保此函数的输入不会重复使用(例如,在生成每个标识符后递增输入)。
英文:

Depends on how many of those you need, how fast you are creating them, if they have to be unique only on your machine or for multiple machines, what characters (digits only, alphanumeric, ...), you want to allow, ... Generating unique IDs is not an easy task.

A few ideas:

  • The easiest (but of course not random at all) way is starting at 0000000000 and going up all the way up to zzzzzzzzzz by just incrementing.

  • The safest way (but probably the slowest) is storing the generated IDs, and when you generate a new one, check if it already exists ...

  • Base your ID on the current timestamp + some internal incrementor (for ids generated at the same timestamp, depending on the resolution of your timestamp) + optionally machine configuation if you need it to be unique among multiple machines + some random + some shuffling.

But the more randomness you put into your ID, the less control you have on the output, thus you need to check for collisions. Especially if the number of generated IDs is large, with respect to the number of possible IDs.

So in the end of the day, there are only two ways to guarantee uniqueness (within a certain scope)

  1. check if the newly created id was already issued before
  2. calculated by a collision-free function, and you are in control that the input to this function cannot be used twice (for instance by incrementing the input after each id you generate)

答案2

得分: -1

尝试使用 [tag:hashlib] 模块,如下所示:

import hashlib
import random

def generate_id():
    # 生成一个随机数
    random_num = str(random.randint(0, 99999999)).encode()

    # 生成随机数的 SHA-256 哈希值。
    hash_obj = hashlib.sha256(random_num)
    hex_digit = hash_obj.hexdigest()

    return hex_digit[:10]
英文:

Try to use [tag:hashlib] module so:

import hashlib
import random

def generate_id():
    # Generate a random number
    random_num = str(random.randint(0, 99999999)).encode()

    # Generate a SHA-256 hash of the random number.
    hash_obj = hashlib.sha256(random_num)
    hex_digit = hash_obj.hexdigest()

    return hex_digit[:10]

huangapple
  • 本文由 发表于 2023年2月19日 16:33:06
  • 转载请务必保留本文链接:https://go.coder-hub.com/75498889.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定