How to generate a 5-character unique alphanumeric value in Java?

Question

I work on a banking project that requires a unique transaction reference (UTR) for each transaction. The UTR format is:

<BankCode><YYDDD><5 digit SequenceId>

This 5-digit sequence ID can also be alphanumeric. The transaction count can reach 100K-200K per day.

If I use an Oracle sequence, a purely numeric 5-digit ID gives me at most 100K values (00000 to 99999), which is fewer than the daily peak.

I tried SecureRandom and generated 200K strings of length 5, but about 30 of them were duplicates.

Below is the code snippet I used:

import java.security.SecureRandom;
import java.util.Random;

int leftLimit = 48;   // '0'
int rightLimit = 122; // 'z'
int i1 = 0;
Random random = new SecureRandom();
while (i1 < 200000) {
    // Draw code points in ['0', 'z'] and keep only 0-9, A-Z and a-z.
    String generatedString = random.ints(leftLimit, rightLimit + 1)
                                   .filter(i -> (i <= 57 || i >= 65) && (i <= 90 || i >= 97))
                                   .limit(5)
                                   .collect(StringBuilder::new,
                                            StringBuilder::appendCodePoint,
                                            StringBuilder::append)
                                   .toString();
    System.out.println(generatedString);
    i1++;
}

Answer 1

Score: 0

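If you want a pseudo-random sequence, I suggest a custom Feistel implementation. A Feistel network is reversible by construction, and the variant below is its own inverse: applying it twice returns the original value, so i == feistel(feistel(i)). It is therefore a permutation of its input range: if you feed it every number from 0 to X (here X = 2^24 - 1), you get every number in that range exactly once, and any set of distinct inputs maps to distinct outputs. So no collisions.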

Basically, you have 36 characters at your disposal (0-9 and a-z), so five characters give 36^5 = 60,466,176 possible values, of which you need only 200,000. How many you need doesn't actually matter, though, because the Feistel mapping ensures that there are no collisions.
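For context on why independent random draws collide at this scale, the expected number of colliding pairs among n uniform draws from a space of N values is approximately n(n-1)/(2N) (the birthday approximation). The sketch below evaluates that for the question's 62-character alphabet and for the 36-character alphabet used here; the class and method names are made up for illustration.

```java
class CollisionEstimate {
    // Integer power, used to size the ID space (alphabetSize ^ length).
    static long power(long base, int exponent) {
        long result = 1;
        for (int i = 0; i < exponent; i++) {
            result *= base;
        }
        return result;
    }

    // Birthday approximation: expected colliding pairs among n uniform draws from `space` values.
    static double expectedCollidingPairs(long n, long space) {
        return (double) n * (n - 1) / (2.0 * space);
    }

    public static void main(String[] args) {
        long draws = 200_000;
        System.out.println("62^5 space: " + expectedCollidingPairs(draws, power(62, 5)));
        System.out.println("36^5 space: " + expectedCollidingPairs(draws, power(36, 5)));
    }
}
```

With 200,000 draws this comes to roughly 22 pairs for 62^5 and roughly 330 for 36^5, so the ~30 duplicates reported in the question are of the expected order, and a collision-free mapping (or an explicit uniqueness check) is needed rather than a better random source.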

You'll notice that 60,466,176 in binary is 0b11100110101010010000000000, a 26-bit number. Since 36^5 is not a power of two, a 26-bit mapper could produce values too large for five base-36 characters, and 26 isn't very friendly for the code anyway, so we'll restrict our custom Feistel mapper to 24 bits instead. A Feistel network splits the number into two equal halves, so each half is 12 bits. That explains the values in the code below: 12 where other implementations use 16, and 0xFFF as the 12-bit mask.

Now the algorithm itself:

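  static int feistel24bits(int value) {
    // Split the 24-bit input into two 12-bit halves.
    int l1 = (value >> 12) & 0xFFF;
    int r1 = value & 0xFFF;
    for (int i = 0; i < 3; i++) {
      // Round function: scramble the right half into a 12-bit key.
      int key = (int)((((1366 * r1 + 150889) % 714025) / 714025d) * 0xFFF);
      int l2 = r1;
      int r2 = l1 ^ key;
      l1 = l2;
      r1 = r2;
    }
    // Swap the halves on output, which makes the mapping its own inverse.
    return (r1 << 12) | l1;
  }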

So if you give this algorithm any number between 0 and 16777215 (= 2^24 - 1), you get a unique pseudo-random number in the same range, which fits in a 5-character string when written in base 36.
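The uniqueness claim can be checked by brute force, since the domain has only 2^24 inputs. The following is a minimal sketch that copies the round structure from the answer and verifies two things: every input maps to a distinct output in the same range, and applying the function twice returns the input. The class name and the two check methods are illustrative additions, not part of the original answer.

```java
import java.util.BitSet;

class FeistelPermutationCheck {
    static final int DOMAIN = 1 << 24; // inputs 0 .. 2^24 - 1

    // Same round structure as the answer's feistel24bits.
    static int feistel24bits(int value) {
        int l1 = (value >> 12) & 0xFFF;
        int r1 = value & 0xFFF;
        for (int i = 0; i < 3; i++) {
            int key = (int) ((((1366 * r1 + 150889) % 714025) / 714025d) * 0xFFF);
            int l2 = r1;
            int r2 = l1 ^ key;
            l1 = l2;
            r1 = r2;
        }
        return (r1 << 12) | l1;
    }

    // True if every input in [0, 2^24) maps to a distinct output in the same range.
    static boolean isPermutation() {
        BitSet seen = new BitSet(DOMAIN);
        for (int i = 0; i < DOMAIN; i++) {
            int out = feistel24bits(i);
            if (out < 0 || out >= DOMAIN || seen.get(out)) {
                return false;
            }
            seen.set(out);
        }
        return true;
    }

    // True if applying the function twice returns the original input for every value.
    static boolean isInvolution() {
        for (int i = 0; i < DOMAIN; i++) {
            if (feistel24bits(feistel24bits(i)) != i) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println("permutation: " + isPermutation());
        System.out.println("involution:  " + isInvolution());
    }
}
```

A BitSet keeps the check at about 2 MB of memory, which is far cheaper than collecting 16 million boxed integers in a HashSet.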

So how do you get it all working? Well, it's very simple:

// retrieveSequence() and storeSequence() stand for your own persistent counter.
String nextId() {
  int sequence = (retrieveSequence() + 1) & 0xFFFFFF; // bound to 24 bits
  int nextId = feistel24bits(sequence);
  storeSequence(sequence);
  return intIdToString(nextId);
}
// Base-36 encoding, left-padded with zeros to 5 characters.
static String intIdToString(int id) {
  String str = Integer.toString(id, 36);
  while (str.length() < 5) { str = "0" + str; }
  return str;
}
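To see the pieces working together, here is a self-contained sketch under a few assumptions: an in-memory AtomicInteger stands in for retrieveSequence() and storeSequence() (a real system would need a persistent, transactional counter), and the nextUtr helper, the bank code "3210" and the "yyDDD" date pattern are illustrative choices for the question's <BankCode><YYDDD><SequenceId> layout.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.atomic.AtomicInteger;

class UtrSketch {
    // Assumption: in-memory counter replacing retrieveSequence()/storeSequence().
    private final AtomicInteger sequence = new AtomicInteger();
    // Two-digit year followed by three-digit day of year.
    private static final DateTimeFormatter YYDDD = DateTimeFormatter.ofPattern("yyDDD");

    static int feistel24bits(int value) {
        int l1 = (value >> 12) & 0xFFF;
        int r1 = value & 0xFFF;
        for (int i = 0; i < 3; i++) {
            int key = (int) ((((1366 * r1 + 150889) % 714025) / 714025d) * 0xFFF);
            int l2 = r1;
            int r2 = l1 ^ key;
            l1 = l2;
            r1 = r2;
        }
        return (r1 << 12) | l1;
    }

    static String intIdToString(int id) {
        StringBuilder sb = new StringBuilder(Integer.toString(id, 36));
        while (sb.length() < 5) {
            sb.insert(0, '0');
        }
        return sb.toString();
    }

    String nextId() {
        int next = sequence.incrementAndGet() & 0xFFFFFF; // bound to 24 bits
        return intIdToString(feistel24bits(next));
    }

    // Hypothetical helper: assembles <BankCode><YYDDD><SequenceId>.
    String nextUtr(String bankCode, LocalDate date) {
        return bankCode + date.format(YYDDD) + nextId();
    }

    public static void main(String[] args) {
        UtrSketch generator = new UtrSketch();
        Set<String> ids = new HashSet<>();
        for (int i = 0; i < 200_000; i++) {
            ids.add(generator.nextId());
        }
        System.out.println("distinct IDs: " + ids.size());
        System.out.println(generator.nextUtr("3210", LocalDate.of(2020, 9, 26)));
    }
}
```

Because the counter never repeats within 2^24 calls, the 200,000 IDs generated in main are all distinct. If the counter were reset every day, uniqueness would still hold across days as long as the YYDDD part is included in the reference.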

Here's the full code that I used.

Answer 2

Score: -1

There seem to be two approaches:

  1. Store unique values in a Set of the required size and remove each element when it is used.
import java.security.SecureRandom;
import java.util.HashSet;
import java.util.Random;
import java.util.Set;

class UniqueIdGenerator {
    private static final int CODE_LENGTH = 5;
    private static final int RANGE = (int) Math.pow(36, CODE_LENGTH); // 36^5 = 60,466,176
    private final Random random = new SecureRandom();
    private final int initSize;
    private final Set<String> memo = new HashSet<>();

    public UniqueIdGenerator(int size) {
        this.initSize = size;
        generate();
    }

    // Pre-generate initSize distinct codes; the Set rejects duplicates.
    private void generate() {
        int dups = 0;
        while (memo.size() < initSize) {
            String code = Formatter.padZeros(Integer.toString(random.nextInt(RANGE), 36), CODE_LENGTH);

            if (memo.contains(code)) {
                dups++;
            } else {
                memo.add(code);
            }
        }
        System.out.println("Duplicates occurred: " + dups);
    }

    // Hand out one pre-generated code and remove it so it is never reused.
    public String getNext() {
        String code = memo.iterator().next();
        memo.remove(code);
        return code;
    }
}
  2. Use a sequence with a random start and random increments.
class RandomSequencer {
    private static final int CODE_LENGTH = 5;
    private Random random = new SecureRandom();
    private int start = random.nextInt(100_000);
    
    public String getNext() {
        String code = Formatter.padZeros(Integer.toString(start, 36), CODE_LENGTH);
        start += random.nextInt(300) + 1;
        
        return code;
    }
}
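One thing worth checking for the second approach is that the running value stays below 36^5, since a larger value would no longer fit in five base-36 characters. The sketch below works out the worst case from the constants in RandomSequencer (start below 100,000, step at most 300) for 200,000 calls; the class and method names are hypothetical.

```java
class RandomSequencerBound {
    static final long SPACE = 60_466_176L; // 36^5, number of 5-character base-36 codes

    // Largest value the last of `calls` calls could return:
    // the maximum start plus (calls - 1) maximum-size steps.
    static long worstCaseLastValue(int maxStart, int maxStep, int calls) {
        return (long) maxStart + (long) maxStep * (calls - 1);
    }

    public static void main(String[] args) {
        long worst = worstCaseLastValue(99_999, 300, 200_000);
        System.out.println("worst case last value: " + worst);
        System.out.println("fits in 5 characters:  " + (worst < SPACE));
    }
}
```

The worst case is 60,099,699, which is below 60,466,176 but leaves little headroom, so a higher daily volume or a larger maximum step would need a re-check.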

Update
Padding of zeros may be implemented in a variety of ways:

class Formatter {
    private static String[] pads = {"", "0", "00", "000", "0000"};
    public static String padZeros(String str, int maxLength) {
        if (str.length() >= maxLength) {
            return str;
        }
        return pads[maxLength - str.length()] + str;
    }

    private static final String ZEROS = "0000";
    public static String padZeros2(String str, int maxLength) {
        if (str.length() >= maxLength) {
            return str;
        }
        return ZEROS.substring(0, maxLength - str.length()) + str;
    }

    public static String padZeros3(String str, int maxLength) {
        if (str.length() >= maxLength) {
            return str;
        }
        return String.format("%1$" + maxLength + "s", str).replace(" ", "0");
    }
}

Answer 3

Score: -1

Since you mentioned Oracle in your question, would you consider a PL/SQL solution?

  1. Create a database table to hold your sequence IDs.
create table UTR (
  BANK_CODE     number(4)
 ,TXN_DATE_STR  char(5)
 ,SEQUENCE_ID   char(5)
 ,USED_FLAG     char(1)
 ,constraint USED_FLAG_VALID check (USED_FLAG in ('N', 'Y'))
 ,constraint UTR_PK primary key (BANK_CODE, TXN_DATE_STR, SEQUENCE_ID)
)
  2. Create a PL/SQL procedure to populate the table.
create or replace procedure POPULATE_UTR
is
  L_COUNT     number(6);
  L_BANK      number(4);
  L_DATE_STR  char(5);
  L_SEQUENCE  varchar2(5);
begin
  L_BANK := 3210;
  select to_char(sysdate, 'YYDDD')
    into L_DATE_STR
    from DUAL;
  L_COUNT := 0;
  while L_COUNT < 200000 loop
    L_SEQUENCE := '';
    for K in 1..5 loop
      L_SEQUENCE := L_SEQUENCE || substr('abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789',
                                         mod(abs(dbms_random.random), 62) + 1,
                                         1);
    end loop;
    begin
      insert into UTR values (L_BANK, L_DATE_STR, L_SEQUENCE, 'N');
      L_COUNT := L_COUNT + 1;
    exception
      when dup_val_on_index then
        null; -- ignore.
    end;
  end loop;
end;

Note that the code for generating the sequence ID came from the SO question titled "Generate Upper and Lowercase Alphanumeric Random String in Oracle".

Also, the code for obtaining the date string came from the SO question titled "Oracle Julian day of year".

Now, since each row in database table UTR contains a unique sequence ID, you can select the first row where USED_FLAG equals N:

select SEQUENCE_ID
  from UTR
 where BANK_CODE = 1234 -- i.e. whatever the relevant bank code is
   and TXN_DATE_STR = 'whatever is relevant'
   and USED_FLAG = 'N'
   and rownum < 2

For your information, if you want to select a random row from table UTR instead, refer to the SO question titled "How to get records randomly from the oracle database?".

After you use that sequence ID, update the table and set USED_FLAG to Y, e.g.:

update UTR
   set USED_FLAG = 'Y'
 where BANK_CODE = 1234 -- what you used in the select
   and TXN_DATE_STR = 'what you used in the select'
   and SEQUENCE_ID = 'what was returned by the select'

huangapple
  • Posted on September 26, 2020 at 14:41:44
  • When reposting, please keep the link to this article: https://go.coder-hub.com/64074677.html