How can I process a database with unreliable passwords

Question

I found a website, haveibeenpwned, that makes it possible to check whether a password has been compromised. A database of passwords is also provided. The database is a text file that looks like this:

000000005AD76BD555C1D6D771DE417A4B87E4B4:10
00000000A8DAE4228F821FB418F59826079BF368:4
00000000DD7F2A1C68A35673713783CA390C9E93:873

It contains several hundred million values hashed with SHA-1; as far as I can tell, the values were converted to uppercase before hashing.

I found this:

    import java.io.UnsupportedEncodingException;
    import java.security.MessageDigest;
    import java.security.NoSuchAlgorithmException;

    // The wrapper class name is arbitrary; the snippet as found had no enclosing class.
    public class Sha1Hasher {

        public static String generateSHA1(String message) throws RuntimeException {
            return hashString(message, "SHA-1");
        }

        private static String hashString(String message, String algorithm)
                throws RuntimeException {
            try {
                MessageDigest digest = MessageDigest.getInstance(algorithm);
                byte[] hashedBytes = digest.digest(message.getBytes("UTF-8"));

                return convertByteArrayToHexString(hashedBytes);
            } catch (NoSuchAlgorithmException | UnsupportedEncodingException ex) {
                throw new RuntimeException(
                        "Could not generate hash from String", ex);
            }
        }

        // Emits two lowercase hex digits per byte; the file above uses uppercase hex.
        private static String convertByteArrayToHexString(byte[] arrayBytes) {
            StringBuilder stringBuffer = new StringBuilder();
            for (byte arrayByte : arrayBytes) {
                stringBuffer.append(Integer.toString((arrayByte & 0xff) + 0x100, 16)
                        .substring(1));
            }
            return stringBuffer.toString();
        }
    }

After hashing my password (the one to be compared against the table of hashes), should I convert the hash to a hex string?

Is this enough to be able to perform a comparison?

However, I don't understand how to check against such a volume of data:

  • What does the number after the ':' character indicate?
  • To verify a password, do I have to convert it to uppercase and hash it using MessageDigest with SHA-1?
  • However, if I get a hash, then I will have to perform a hash comparison search across the entire database.
  • Is it possible to take a prefix of the hash string, select the matching range of values, and search only within that range using Java? That would reduce the load on the database and the response time of the application.

If anyone has any ideas on how to do this, and on what this hash in the file is, please share.

Answer 1

Score: 4

According to the blog post that explains all this, the number after the colon is a count: :5 means the password showed up alongside separate account information 5 separate times. The entry with 873 is a fairly commonly used password, and in that sense 'worse'.
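To make the line format concrete, here is a minimal sketch that splits one line of the file into its hash and its count. The class name `PwnedLine` and its fields are made up for illustration; only the `HASH:COUNT` layout comes from the file excerpt in the question.

```java
// Hypothetical helper (not part of HIBP): one "HASH:COUNT" line of the file.
final class PwnedLine {
    final String sha1Hex; // the 40 hex characters before the colon
    final long count;     // the number after the colon

    private PwnedLine(String sha1Hex, long count) {
        this.sha1Hex = sha1Hex;
        this.count = count;
    }

    // Parses a line such as "00000000DD7F2A1C68A35673713783CA390C9E93:873".
    static PwnedLine parse(String line) {
        int colon = line.indexOf(':');
        if (colon != 40) {
            throw new IllegalArgumentException(
                    "Expected 40 hex characters followed by ':' in: " + line);
        }
        String hash = line.substring(0, colon);
        long count = Long.parseLong(line.substring(colon + 1).trim());
        return new PwnedLine(hash, count);
    }
}
```

Calling `PwnedLine.parse` on the third sample line from the question yields a count of 873.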

I don't think the passwords are case insensitive. The API that HIBP provides does case-insensitive hash searches, i.e. searching for 000000005ad76BD555C1D6D771DE417A4B87E4B4 finds a hit just as well as 000000005AD76BD555C1D6D771DE417A4B87E4B4, but that concerns only the hex spelling of the hash. For the password itself, 'case insensitive' and 'hash it' are not compatible: changing the case of the input gives a completely different hash. For a very quick test, check whether SHA1(iloveyou) and SHA1(ILOVEYOU) are both in the database. If both are, you have your confirmation: don't mess with case. If only one is, uppercase (or lowercase) the password before searching.
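The quick test can be sketched in Java roughly as follows. It hashes both spellings and prints them so they can be looked up in the file. `CaseCheck` and `sha1Hex` are illustrative names, and uppercase hex is used on the assumption that the result is compared against the file as shown in the question.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

final class CaseCheck {

    // SHA-1 of the UTF-8 bytes of the text, as 40 uppercase hex characters.
    static String sha1Hex(String text) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-1");
            byte[] hash = digest.digest(text.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(40);
            for (byte b : hash) {
                hex.append(String.format("%02X", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException ex) {
            throw new IllegalStateException("SHA-1 is not available", ex);
        }
    }

    public static void main(String[] args) {
        String lower = sha1Hex("iloveyou");
        String upper = sha1Hex("ILOVEYOU");
        // Look both of these up in the file (the part before the colon).
        System.out.println(lower);
        System.out.println(upper);
        // The case of the password changes the hash ...
        System.out.println("same hash: " + lower.equals(upper));
        // ... while the case of the hex digits is only a matter of spelling.
        System.out.println("same hex value: " + lower.equalsIgnoreCase(lower.toLowerCase()));
    }
}
```

The two printed hashes differ, which is why the case of the password matters even though the case of the hex digits does not.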

Verifying a password's suitability is done as follows:

Take the user's password. Don't mess with its case. SHA1 it. Check whether that SHA1 is in this file (the part before the colon). If it is, tell the user to pick something else and link them to the usual advice. A high number after the colon is worse, but appearing in this list at all is grounds to reject the password. That :1234 stuff is mostly for other kinds of research.

> However, if I get a hash, then I will have to perform a hash comparison search across the entire database.

Yes; how else did you think this was going to work? Instead of using the 6GB (or whatever size it is) file, you can use their API, even cross-site (each IP is rate limited to one request per 1.5 seconds). So yes, your change-password or create-account form can query HIBP directly with no involvement from your server, if you want. The HIBP site's API docs explain all this and how to do it. Then they [1] will do the search.
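As an illustration of the API route, and of the prefix idea from the question: to my knowledge, the Pwned Passwords range endpoint takes the first five hex characters of the SHA-1 and returns the matching 35-character suffixes with their counts, one `SUFFIX:COUNT` per line. The sketch below covers only the local part (splitting the hash and scanning a response body). The HTTP call is left out, the URL in the comment is an assumption to verify against the current API docs, and `RangeLookup` and its methods are hypothetical names.

```java
import java.util.Locale;

final class RangeLookup {

    // Assumed endpoint shape (check the HIBP API docs before relying on it):
    //   GET https://api.pwnedpasswords.com/range/{first 5 hex characters}

    // The part of the hash that would be sent to the service.
    static String prefixOf(String sha1Hex) {
        return sha1Hex.substring(0, 5).toUpperCase(Locale.ROOT);
    }

    // The part of the hash that never leaves the client.
    static String suffixOf(String sha1Hex) {
        return sha1Hex.substring(5).toUpperCase(Locale.ROOT);
    }

    // Scans a response body of "SUFFIX:COUNT" lines for our suffix.
    // Returns the count, or 0 if the suffix is not listed.
    static long countInRangeResponse(String responseBody, String sha1Hex) {
        String wanted = suffixOf(sha1Hex);
        for (String line : responseBody.split("\\r?\\n")) {
            int colon = line.indexOf(':');
            if (colon < 0) {
                continue; // skip blank or malformed lines
            }
            if (line.substring(0, colon).equalsIgnoreCase(wanted)) {
                return Long.parseLong(line.substring(colon + 1).trim());
            }
        }
        return 0;
    }
}
```

With this split, only the five-character prefix is sent over the network, and the comparison against the full hash happens locally.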

For what it's worth, searching all that isn't too difficult as long as you do it correctly, that is, indexed. Take binary search, for example. I'm pretty sure the file is sorted on SHA1, so look up a hash the way you look up a phone number in a phone book by somebody's name: don't open the book and start reading from Mr. Aabarth. Go to the exact middle and check whether the name you find there sorts 'below' or 'above' the one you are looking for. If it is 'below', tear off the first half of the phone book and toss it in the garbage; otherwise toss the second half. You now have half a phone book to search.

Keep applying that algorithm: each time, you toss half of what is left. That's a log2(n) algorithm, so it finishes in something like 32 steps total: 'get rid of half the phone book' applied 32 times gets you down to a single entry even for a phone book with 4 billion entries. log2(n) is pretty neat.
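The halving argument can be checked numerically. The sketch below, with made-up names, counts how many halvings it takes to narrow n entries down to one and runs a plain binary search over a small sorted array of uppercase hex strings (the three sample hashes from the question).

```java
final class HalvingDemo {

    // How many times n entries must be halved (rounding up) until one is left.
    static int halvings(long n) {
        int steps = 0;
        while (n > 1) {
            n = (n + 1) / 2;
            steps++;
        }
        return steps;
    }

    // Classic binary search over a sorted array; returns the index, or -1 if absent.
    static int binarySearch(String[] sorted, String target) {
        int lo = 0;
        int hi = sorted.length - 1;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            int cmp = sorted[mid].compareTo(target);
            if (cmp == 0) {
                return mid;
            } else if (cmp < 0) {
                lo = mid + 1; // target sorts after mid: drop the first half
            } else {
                hi = mid - 1; // target sorts before mid: drop the second half
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println(halvings(4_000_000_000L)); // prints 32
        String[] hashes = {
            "000000005AD76BD555C1D6D771DE417A4B87E4B4",
            "00000000A8DAE4228F821FB418F59826079BF368",
            "00000000DD7F2A1C68A35673713783CA390C9E93"
        };
        System.out.println(binarySearch(hashes, "00000000A8DAE4228F821FB418F59826079BF368")); // prints 1
    }
}
```

Plain string comparison works here because uppercase hex digits sort in the same order as their numeric values.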

Databases do this automatically (or do even better), as long as you add an index to the column.

If you write this yourself, it is smart to build an index: say, 100k hashes and their exact positions in the giant file. To look up a hash, find the index entry that sorts 'below' your hash but is as close to it as possible (in Java you can use a TreeSet for this, for example). Open the file, seek to that position, and read forward until you either [A] find the hash (and reject the password as unsuitable) or [B] find a hash that sorts 'higher' than yours, in which case the password is presumably all right to use. Given that there are something like 4 billion passwords in there, you have reduced the search to scanning at most 4 billion / 100k = 40,000 hashes, which is near instant. It is also faster than a full binary search on the entire file (hopping around disk sectors is usually pricier, even with SSDs), and it avoids the difficulty of first seeking to a newline so that you don't start reading in the middle of a hash.
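A minimal sketch of such a sparse index follows, under several assumptions: the file is sorted, uses uppercase hex, and has one `HASH:COUNT` entry per line. It uses a `TreeMap` rather than the `TreeSet` mentioned above, because the file position has to be stored next to each hash. `SparseHashIndex` and `stride` are hypothetical names, and `RandomAccessFile.readLine` is unbuffered, so building the index over the real multi-gigabyte file would call for a buffered reader that tracks byte offsets instead.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.util.Map;
import java.util.TreeMap;

final class SparseHashIndex {
    private final TreeMap<String, Long> index = new TreeMap<>();
    private final String path;

    // Remembers every stride-th hash together with the byte offset of its line.
    SparseHashIndex(String path, int stride) throws IOException {
        this.path = path;
        try (RandomAccessFile file = new RandomAccessFile(path, "r")) {
            long lineNo = 0;
            long offset = file.getFilePointer();
            String line;
            while ((line = file.readLine()) != null) {
                if (lineNo % stride == 0 && line.length() >= 40) {
                    index.put(line.substring(0, 40), offset);
                }
                lineNo++;
                offset = file.getFilePointer();
            }
        }
    }

    // Returns the count stored for the hash, or 0 if the hash is not in the file.
    long lookup(String sha1HexUpper) throws IOException {
        // Closest indexed hash that sorts at or before the target.
        Map.Entry<String, Long> start = index.floorEntry(sha1HexUpper);
        if (start == null) {
            return 0; // target sorts before the first hash in the file
        }
        try (RandomAccessFile file = new RandomAccessFile(path, "r")) {
            file.seek(start.getValue());
            String line;
            while ((line = file.readLine()) != null) {
                if (line.length() < 41) {
                    continue; // skip blank or malformed lines
                }
                int cmp = line.substring(0, 40).compareTo(sha1HexUpper);
                if (cmp == 0) {
                    return Long.parseLong(line.substring(41).trim()); // case [A]
                }
                if (cmp > 0) {
                    return 0; // case [B]: we are past where it would have been
                }
            }
        }
        return 0;
    }
}
```

With a stride of 40,000 over roughly 4 billion lines, the map would hold about 100k entries, matching the numbers in the paragraph above, and each lookup reads at most one stride of lines.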

[1] 'They' refers to Troy Hunt, who basically runs HIBP as a charity. Consider donating to the cause if you use HIBP!

Answer 2

Score: 0

Here is a set of methods that helped me convert a plain-text password into a hash using the SHA-1 algorithm. After that, I compared the passwords with an offline database of compromised passwords.

import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class Main {

    public static void main(String[] args) {
        String password = "qwerty";

        // Prints the SHA-1 of the password as 40 uppercase hex characters.
        String pwd = generatePasswordIntoSHA1Algorithm(password);

        System.out.println(pwd);
    }

    public static String generatePasswordIntoSHA1Algorithm(String password) throws RuntimeException {
        return convertPasswordIntoHashString(password, "SHA-1");
    }

    private static String convertPasswordIntoHashString(String message, String algorithm)
            throws RuntimeException {
        try {
            MessageDigest digest = MessageDigest.getInstance(algorithm);

            // Hash the UTF-8 bytes of the password, without changing its case.
            byte[] messageInBytes = message.getBytes(StandardCharsets.UTF_8);
            byte[] messageInBytesHashed = digest.digest(messageInBytes);

            return convertByteArrayToHexString(messageInBytesHashed);
        } catch (NoSuchAlgorithmException ex) {
            throw new RuntimeException(
                    "Could not generate hash from String", ex);
        }
    }

    private static String convertByteArrayToHexString(byte[] arrayBytes) {
        StringBuilder stringBuffer = new StringBuilder();
        for (byte arrayByte : arrayBytes) {
            stringBuffer.append(convertByteToHexFormat(arrayByte));
        }
        return stringBuffer.toString();
    }

    // "%02X" gives two uppercase hex digits per byte, matching the file's format.
    private static String convertByteToHexFormat(byte arrayByte) {
        return String.format("%02X", arrayByte);
    }
}

huangapple
  • Posted on June 1, 2023, 01:08:56
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76375868.html