Encoding String values with base64

Question

Sample Output:

User Input:

Please enter the first String:
Hello
Please enter the second string:
World

Encoded Output (Your Code):

SGVsbG9Xb3JsZA==

Encoded Output (Expected Output):

SGVsbG8gV29ybGQ=

Note: The input values "Hello" and "World" were used for demonstration. The discrepancy between your output and the expected output may be due to differences in handling whitespace or other characters. Make sure to check the input and encoding process for accuracy.


So I was working on a homework assignment for my CSC420 class. The professor wanted us to use Java code to encrypt two string values that the user would enter. I was able to do that, no real issue there; the main problem is that the sample output he gave us, so that we would know if we got the "right answer", is somehow different from mine. I have attached my code below, my output, and his output; if someone could tell me what I am doing wrong, that would be greatly appreciated.

package Homework;

import static java.nio.charset.StandardCharsets.UTF_8;
import java.util.Base64;
import java.util.Base64.Encoder;
import java.util.Scanner;

public class HW4 {
    public static String b64enc(String string) throws Exception {
        Encoder encoder = Base64.getEncoder();
        byte[] data = string.getBytes(UTF_8);
        String encodedString = encoder.encodeToString(data);
        return encodedString;
    }

    public static void main(String[] args) throws Exception {
        Scanner scan1 = new Scanner(System.in);
        System.out.println("Please enter the first String: ");
        String string1 = scan1.nextLine();
        System.out.println("Please enter the second string: ");
        String string2 = scan1.nextLine();
        scan1.close();
        String encodedString = b64enc(string1 + string2);
        System.out.println(encodedString);
    }
}


Answer 1

Score: 2

The fact that your prof's program has the same control value for 'hihello' and 'hellohi' is special; obviously, if you just Base64-encode a string (which doesn't delete information; you can get back to the original from it; that is the point), it is impossible for 2 different inputs to generate the same output.

I conclude that you must not have read the instructions correctly. You're looking for an algorithm that explains why entering the strings in a different order nevertheless produces the same 'encoded value'. 'Concatenate them and then Base64 the result' wouldn't.
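To make the reversibility argument concrete, here is a minimal sketch (the class name Base64RoundTrip and its helper methods are made up for this illustration, they are not from the assignment). It encodes the two concatenations from the professor's sample and decodes one of them again:

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper class, only for illustrating that Base64 is reversible.
class Base64RoundTrip {
    static String encode(String text) {
        return Base64.getEncoder().encodeToString(text.getBytes(StandardCharsets.UTF_8));
    }

    static String decode(String encoded) {
        return new String(Base64.getDecoder().decode(encoded), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String first = encode("hihello");
        String second = encode("hellohi");
        // Different inputs give different encodings...
        System.out.println(first + " vs " + second + " -> equal? " + first.equals(second));
        // ...and decoding brings back exactly what went in.
        System.out.println(decode(first));
    }
}
```

Since decoding always restores the input, two different concatenations can never share one Base64 value, so the sample output has to come from some other way of combining the strings.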

Answer 2

Score: 2

I think I do understand the assignment now.

The code is supposed to encode one message string using another string as key. Base64 is used only for encoding the result since it is binary data and can (will) contain codes that are not printable - so the result is represented as text and, for example, can be mailed to the teacher.

First we note that the order of the strings does not matter, so there is no real distinction between which is the key and which is the message.

Next we can decode the example results, for example using the Linux command base64 (I used Git Bash, but there are also online services available for this). I also piped the result to od (a dump utility, used here to see the hexadecimal values):

$ echo "AAxMTE8=" | base64 -d | od -t x1 -c

which returns

> 0000000  00  0c  4c  4c  4f
>          \0  \f   L   L   O

Note that it is 5 bytes long, the same as the longer of the input strings - so we can assume that the strings are not being concatenated, which would change the length, but that the bytes of each string are being combined in some way. Also note that each character is using one byte, so the encoding is probably UTF-8 or even ASCII.
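If no shell is at hand, the same check can be done from Java. This is only a sketch (the class DecodeSample and the method toHex are names invented here); it decodes the sample value and prints the length and the hex bytes:

```java
import java.util.Base64;

// Hypothetical helper class: decodes the sample value and dumps it as hex.
class DecodeSample {
    static String toHex(byte[] bytes) {
        StringBuilder hex = new StringBuilder();
        for (byte b : bytes) {
            if (hex.length() > 0) {
                hex.append(' ');
            }
            // Mask with 0xff so negative bytes print as two hex digits.
            hex.append(String.format("%02x", b & 0xff));
        }
        return hex.toString();
    }

    public static void main(String[] args) {
        byte[] decoded = Base64.getDecoder().decode("AAxMTE8=");
        System.out.println(decoded.length + " bytes: " + toHex(decoded));
        // prints: 5 bytes: 00 0c 4c 4c 4f
    }
}
```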

Further we see that the result ends with "LLO", the uppercase version of the end of the longest input "hello" - it looks like the position of the bytes was not changed, just the values combined by some operation. Let's consider some operations that can be used to combine the bytes:

  • Subtraction or division: won't work, since the order of the inputs does not matter;
  • Addition or multiplication: not good because of possible overflow/underflow;
  • Bitwise AND, NAND, OR or NOR: won't work, information is lost (e.g. x AND 0 is always 0);
  • Bitwise XOR (exclusive OR): (almost) perfect, easy to encrypt, easy to decrypt, order does not matter (but not very strong).
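The properties used in the list above can be checked with a few lines of Java. This is just an illustration (the class CombineOps and its method names are invented here), using the second bytes of "hello" and "hi" as sample values:

```java
// Hypothetical helper class comparing candidate operations for combining two bytes.
class CombineOps {
    // XOR gives the same result regardless of argument order.
    static boolean xorIgnoresOrder(int a, int b) {
        return (a ^ b) == (b ^ a);
    }

    // Applying XOR with the same key a second time restores the original value.
    static int xorTwice(int value, int key) {
        return (value ^ key) ^ key;
    }

    // Subtraction depends on the order unless both values are equal.
    static boolean subtractionIgnoresOrder(int a, int b) {
        return (a - b) == (b - a);
    }

    public static void main(String[] args) {
        int e = 0x65; // 'e' from "hello"
        int i = 0x69; // 'i' from "hi"
        System.out.println(xorIgnoresOrder(e, i));          // true
        System.out.println(xorTwice(e, i) == e);            // true
        System.out.println(subtractionIgnoresOrder(e, i));  // false
        System.out.println((e & 0) == (i & 0));             // true: AND with 0 erases the difference
    }
}
```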

Let's check what happens with XOR:

input1: "hello" == [ 0x68, 0x65, 0x6c, 0x6c, 0x6f ] // using ASCII/UTF-8
input2: "hi"    == [ 0x68, 0x69 ]                   // "
result: "∅∅LLO" == [ 0x00, 0x0c, 0x4c, 0x4c, 0x4f ] // ∅ not printable

0x68 ^ 0x68 == 0x00  // correct!
0x65 ^ 0x69 == 0x0c  // "
0x6c ^  X   == 0x4c  // what is X?
0x6f ^  X   == 0x4f  // "

Now we just need to see what should happen with the last 3 bytes, where one input is too short, that is, what X is. It is not hard to find out that 0x6c ^ 0x20 == 0x4c and 0x6f ^ 0x20 == 0x4f; in fact A ^ X == B implies that A ^ B == X. So we conclude that the shorter string must be filled up with 0x20, the space character ' '.
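The rule "A ^ X == B implies A ^ B == X" can be applied mechanically to the three trailing bytes. A small sketch (PaddingByte and recoverKey are names made up for this example):

```java
// Hypothetical helper class: recovers the unknown byte X from an input byte and a result byte.
class PaddingByte {
    static int recoverKey(int inputByte, int resultByte) {
        return inputByte ^ resultByte;
    }

    public static void main(String[] args) {
        int[] input = {0x6c, 0x6c, 0x6f};   // "llo" from "hello"
        int[] result = {0x4c, 0x4c, 0x4f};  // "LLO" from the decoded sample
        for (int i = 0; i < input.length; i++) {
            // Each line prints 0x20, the space character.
            System.out.println(String.format("0x%02x", recoverKey(input[i], result[i])));
        }
    }
}
```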


The algorithm must be something like: make both input strings the same size by appending spaces (' ') to the shorter string. Convert both inputs to byte arrays. Combine the bytes of the two arrays using exclusive OR. Encode the result using Base64.
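A possible implementation of these steps could look like the sketch below. It is reconstructed only from the one sample value, so treat it as a guess at what the professor intended: the class name XorBase64, the method names, the choice of UTF-8, and padding at the byte level are all assumptions.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Base64;

// Hypothetical reconstruction: pad with spaces, XOR byte by byte, then Base64-encode.
class XorBase64 {
    static String encode(String first, String second) {
        byte[] a = first.getBytes(StandardCharsets.UTF_8);
        byte[] b = second.getBytes(StandardCharsets.UTF_8);
        int length = Math.max(a.length, b.length);

        byte[] paddedA = pad(a, length);
        byte[] paddedB = pad(b, length);

        byte[] combined = new byte[length];
        for (int i = 0; i < length; i++) {
            combined[i] = (byte) (paddedA[i] ^ paddedB[i]);
        }
        return Base64.getEncoder().encodeToString(combined);
    }

    // Copies source into a new array of the given length, filling the rest with spaces (0x20).
    static byte[] pad(byte[] source, int length) {
        byte[] padded = new byte[length];
        Arrays.fill(padded, (byte) ' ');
        System.arraycopy(source, 0, padded, 0, source.length);
        return padded;
    }

    public static void main(String[] args) {
        System.out.println(encode("hello", "hi")); // AAxMTE8=
        System.out.println(encode("hi", "hello")); // AAxMTE8=
    }
}
```

Both calls give the value from the sample, and swapping the arguments does not change it, which matches the observation that the input order is irrelevant.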

huangapple
  • Posted on 2020-03-04 05:43:08
  • When reposting, please keep the link to this article: https://go.coder-hub.com/60515957.html