How do I send a password without using String from java to postgresql?
Question
I know that char[] is preferable to String when it comes to passwords, because String is immutable. But I have not been able to figure out how to pass one to a prepared statement. Consider this code:
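Connection con = MyDBManager.connect();
PreparedStatement stm = con.prepareStatement(
    "INSERT INTO my_table(username, password) VALUES(?, ?)");
String username = "Jonny";
stm.setString(1, username); // JDBC parameter indexes start at 1
char[] password = new char[] {'a', 'b', 'c', 'd'};
// Note: toString() on a char[] returns an identity string of the form "[C@<hash>", not "abcd"
stm.setString(2, password.toString());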
It works, but I suspect that calling password.toString() defeats the whole purpose of using a char[], so what should I do instead?
And yes, I know that a password should not be stored in clear text. The password is hashed, but I still want the benefits of a char[].
The following seemed to work after I also changed the field in the database from text to bytea, and (I think) it should in theory give the same benefits that a char[] is supposed to, but I have a strong feeling that I'm not doing it right:
byte[] password = new byte[] {'a', 'b', 'c', 'd'};
stm.setBytes(2, password); // second placeholder; the column is now bytea
Maybe I'm completely wrong in my approach, but my requirement is that the password hash should never exist in an immutable object. I am open to other ways of sending it to the database than PreparedStatement.
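For illustration only, here is a sketch of how a char[] could be turned into a byte[] without ever creating a String, assuming UTF-8 is the desired encoding. The class and method names (CharsToBytes, toUtf8) are hypothetical and not part of the original question:

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class CharsToBytes {
    /** Encodes chars as UTF-8 without going through java.lang.String. */
    static byte[] toUtf8(char[] chars) {
        ByteBuffer encoded = StandardCharsets.UTF_8.encode(CharBuffer.wrap(chars));
        // Copy only the bytes actually produced; the backing array may be larger.
        byte[] out = new byte[encoded.remaining()];
        encoded.get(out);
        // Wipe the encoder's temporary buffer so no second copy lingers there.
        if (encoded.hasArray()) {
            Arrays.fill(encoded.array(), (byte) 0);
        }
        return out;
    }
}
```

The result could then be passed to stm.setBytes(2, bytes) and cleared afterwards with Arrays.fill(bytes, (byte) 0). Whether the JDBC driver keeps its own internal copy of the bytes is outside the caller's control.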
Answer to comments:
I have learned a lot from @rzwitserloot's answer, but unfortunately it does not help. Never having the password hash in an immutable object is a non-negotiable requirement.
Answer 1
Score: 6
Why are passwords in char[] form?
> I know that char[] is preferable over String when it comes to passwords, due to the fact that String is immutable.
If you mean that String's immutability stops you from changing passwords, that's incorrect.
Perhaps you are referring to the fact that you can't 'wipe them clean', but just in case you aren't, let me explain that:
There is only one reason that char[] is preferable over String, and it is highly dubious: With a char[], you can 'wipe' the array out. That is, once your process (your java code) no longer needs the password, you can explicitly run Arrays.fill(thePassword, (char) 0); and now RAM no longer contains the password. You can't do this with a String: you can null out your reference, but that's just 'wiping out the treasure map'. It's not digging up the treasure and smashing the treasure to bits. Someone willing to dig up the entire beach is still going to find it.
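As a minimal sketch of the wiping described above (the class and method names are made up for illustration), the array can be zeroed in a finally block so that it is cleared even when the code using it throws:

```java
import java.util.Arrays;

final class WipeDemo {
    /** Overwrites every element so this array no longer holds the secret. */
    static void wipe(char[] secret) {
        Arrays.fill(secret, '\0');
    }

    public static void main(String[] args) {
        char[] password = {'a', 'b', 'c', 'd'};
        try {
            // ... hash or verify the password here ...
        } finally {
            wipe(password); // runs even if the code above throws
        }
    }
}
```

This only clears the one array the code holds a reference to. Any copies made elsewhere (by libraries, buffers, or the runtime) are untouched.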
This sounds great, and is the only explanation for why a whole bunch of password-based APIs deal in char[] and not String. HOWEVER, this is extremely dubious as a principle and you should absolutely not rely on it:
- If some other process that you cannot trust has access to your process's memory you are quite hosed already. You should fix that at the source, and not try to mitigate this problem by reducing your exposure. That's akin to having a giant gaping hole in your artery and fixing it by hooking up a constant supply of blood bags (fighting symptoms) instead of bandaging the wound (closing the actual hole). If somehow the hole cannot be closed, I guess the symptom fighting is better than nothing, but it's a poor alternative.
- You have no actual guarantee, between OS and CPU caches, that wiping out your char array leaves the password nowhere in any part of the hardware that a hacker could feasibly get at.
For what it is worth, I would not judge any code that stores passwords in strings as insecure. In fact, I dread code that stores them in char[] form. I fear that the authors will misunderstand the protection that this gives, or that they straight up forget to zero it out: it leads readers of the code to make false presumptions (namely, that the password is wiped out, which may not be true, and that this means that the password is not recoverable if anything manages to get a dump of the memory contents, which also probably isn't true).
NB: With security it's a good idea to get into the habit of writing James Bond film scripts. Here's a film script for you:
You run your server in a cloud hosting environment, on a virtualized PC. The operator of this cloud messed up, and upon server deletion (which is really just the termination of a virtual PC running on a host that is running hundreds of PCs), they do not wipe out the memory of the process. Somebody decides to try to abuse this: They ask the cloud hoster to give them a PC with Linux on it, they then install a simple app that scans the virtual PC's own memory for anything that looks like a password and reports back, then shuts the machine down and terminates it, and asks for another one.
By using char[] AND wiping out the password, you're safer against this, but not if the server just crashes or gets hard-killed, which is exactly how e.g. Amazon EC2 is supposed to be used (see Netflix's Chaos Monkey documentation, which the community at large generally agrees is the right way to do things). In that case there's a risk some password just so happened to be in its char[] phase. Not to mention all the other places in memory these can end up in.
See how Bond-level scripts help clarify matters? It shows there is SOME point to wiping out char arrays, but it's not foolproof.
How do I store a password in postgres?
Hashing it is also no good. At least, depending on what you mean by the word 'hash'.
The proper approach involves 2 things:
- A 'salt'. The problem is that a LOT of people have iloveyou as a password. It doesn't matter what hashing algorithm you use: hashing algorithms by their nature hash the same input to the same output, so all I need to do if I get a complete dump of your DB is run SELECT passhash, COUNT(*) AS ct FROM accounts GROUP BY passhash ORDER BY ct DESC LIMIT 1, and voila: that hash is iloveyou. I can then run SELECT username FROM accounts WHERE passhash = ?, and I can log in as everybody in that list with the password iloveyou. Simple as that. A salt solves this problem: you generate a random number for every account (upon account creation) and store this random number in the database. The hash you store is the result of hashing the value CONCATENATE(salt, password). Now every user that is an idiot and uses iloveyou as password ends up with a different hash. To check a password, you take the password as entered, retrieve the salt, recreate CONCAT(salt, pass), hash that, and confirm it matches the DB entry.
- You want a hashing algorithm that is slow and weird. SHA-256, for example, is used by bitcoin, so there are a ton of dedicated machines out there that can generate billions of hashes a millisecond for SHA-256, and there's specialized hardware out there (hardware you don't have) that is far faster at it than any computer you own. That's bad: you don't want the hacker to have an advantage. So, you want a hash algo that is really slow, and is not (easily) optimized for custom hardware. BCrypt, SCrypt, and PBKDF2 are the commonly used variants for this. They're all fine (BCrypt is older and in that sense 'worse', but also simpler and proven. Pick whichever one you want).
Note that most libraries out there will take care of the whole salt business already, so there is no need to explicitly generate and store these. The APIs of these libraries are simply: "Give me a thing to put in a DB for this password" and "The user entered this password, and this is the thing you told me to put in the DB earlier. Is it a match?". 'The thing to put in the DB' contains both the salt and the hash result combined into a single string.
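The salt-plus-slow-hash idea can be sketched with the PBKDF2 implementation that ships with the JDK. This is only an illustration: the class name, the iteration count and the 16-byte salt size are assumptions chosen for the example, and a dedicated BCrypt or SCrypt library would hide these details as described above.

```java
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.security.SecureRandom;
import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.PBEKeySpec;

final class SaltedHashSketch {
    static final int ITERATIONS = 100_000; // assumed work factor; tune for your hardware
    static final int KEY_BITS = 256;

    /** A fresh random salt, generated once per account and stored next to the hash. */
    static byte[] newSalt() {
        byte[] salt = new byte[16];
        new SecureRandom().nextBytes(salt);
        return salt;
    }

    /** Slow, salted hash of the password using PBKDF2 with HMAC-SHA256. */
    static byte[] hash(char[] password, byte[] salt) {
        PBEKeySpec spec = new PBEKeySpec(password, salt, ITERATIONS, KEY_BITS);
        try {
            return SecretKeyFactory.getInstance("PBKDF2WithHmacSHA256")
                    .generateSecret(spec)
                    .getEncoded();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("PBKDF2 not available", e);
        } finally {
            spec.clearPassword(); // wipes the spec's internal copy of the password
        }
    }

    /** Re-hashes the candidate with the stored salt and compares in constant time. */
    static boolean matches(char[] candidate, byte[] salt, byte[] storedHash) {
        return MessageDigest.isEqual(hash(candidate, salt), storedHash);
    }
}
```

Because each account gets its own salt, two users who both pick iloveyou end up with different stored hashes, so the GROUP BY query described above no longer reveals shared passwords.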
Okay, so how do I get this 'thing to put in DB' that the bcrypt lib gave me to postgres?
It's going over a wire or at least a local socket, into the postgres WAL log, which then goes everywhere. The thing is also literally right there in the DB.
Eliminating this hash value from your system hardware is completely impossible.
So just make a java.lang.String. It's fine.