Golang encode string UTF16 little endian and hash with MD5

Question

I am a Go beginner and am stuck on a problem.
I want to encode a string as UTF-16 little endian and then hash it with MD5 (as a hexadecimal digest). I found a piece of Python code that does exactly what I want, but I have not been able to translate it to Go.

md5 = hashlib.md5()
md5.update(challenge.encode('utf-16le'))
response = md5.hexdigest()

Here, challenge is a variable containing a string.

Answer 1

Score: 9

You can do it with less work (or at least more readably, IMO) by using golang.org/x/text/encoding/unicode and golang.org/x/text/transform to create a Writer chain that does the encoding and hashing without so much manual byte slice handling. The equivalent function:

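// Note: unicode here is golang.org/x/text/encoding/unicode, not the standard library package.
func utf16leMd5(s string) []byte {
    // Encoder that turns UTF-8 input into UTF-16LE without a byte order mark.
    enc := unicode.UTF16(unicode.LittleEndian, unicode.IgnoreBOM).NewEncoder()
    hasher := md5.New()
    // Everything written to t is encoded and then passed on to the hasher.
    t := transform.NewWriter(hasher, enc)
    t.Write([]byte(s))
    t.Close() // flush any input the transformer is still buffering
    return hasher.Sum(nil)
}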
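
If adding golang.org/x/text as a dependency is not an option, a similar streaming effect can be sketched with the standard library alone. The following is only an illustration, not part of the answer above: the name streamUTF16LEMD5 and the writeUnit closure are hypothetical, and the function returns a hex string (like Python's hexdigest) rather than raw bytes. It feeds each UTF-16 code unit to the hasher as it is produced, so the full UTF-16 byte slice is never built.

```go
package main

import (
	"crypto/md5"
	"encoding/hex"
	"fmt"
	"unicode/utf16"
)

// streamUTF16LEMD5 (hypothetical name) hashes the UTF-16LE form of s
// one code unit at a time and returns the digest as a hex string.
func streamUTF16LEMD5(s string) string {
	h := md5.New()
	// writeUnit sends one 16-bit code unit to the hasher, low byte first.
	writeUnit := func(u uint16) {
		h.Write([]byte{byte(u), byte(u >> 8)})
	}
	for _, r := range s {
		if r >= 0x10000 {
			// Code points outside the BMP need a surrogate pair.
			hi, lo := utf16.EncodeRune(r)
			writeUnit(uint16(hi))
			writeUnit(uint16(lo))
			continue
		}
		writeUnit(uint16(r))
	}
	return hex.EncodeToString(h.Sum(nil))
}

func main() {
	// The thread reports 8f4a54c6ac7b88936e990256cc9d335b for this input.
	fmt.Println(streamUTF16LEMD5("Hello, playground"))
}
```

Ranging over a string yields U+FFFD for invalid UTF-8, which matches what converting the string to []rune would produce, so the result should agree with the slice-based approaches for the same input.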

Answer 2

Score: 5

You can use the unicode/utf16 package for UTF-16 encoding. utf16.Encode() returns the UTF-16 encoding of the Unicode code point sequence (slice of runes: []rune). You can simply convert a string to a slice of runes, e.g. []rune("some string"), and you can easily produce the byte sequence of the little-endian encoding by ranging over the uint16 codes and sending/appending first the low byte then the high byte to the output (this is what Little Endian means).
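
To make the code-unit and byte-order steps concrete, here is a small sketch. The helper name utf16leBytes is hypothetical and only serves the illustration. It prints the UTF-16 code units for a string containing one BMP character and one character outside the BMP, then the little-endian bytes built from them.

```go
package main

import (
	"fmt"
	"unicode/utf16"
)

// utf16leBytes (hypothetical helper) returns the UTF-16 little-endian
// bytes of s, without a byte order mark.
func utf16leBytes(s string) []byte {
	codes := utf16.Encode([]rune(s))
	b := make([]byte, 0, len(codes)*2)
	for _, c := range codes {
		b = append(b, byte(c), byte(c>>8)) // low byte first, then high byte
	}
	return b
}

func main() {
	// 'A' is U+0041 (one code unit); U+1D11E needs a surrogate pair.
	s := "A\U0001D11E"
	fmt.Printf("%04x\n", utf16.Encode([]rune(s))) // [0041 d834 dd1e]
	fmt.Printf("% x\n", utf16leBytes(s))          // 41 00 34 d8 1e dd
}
```

Note that a character outside the BMP takes two code units, so the byte length is twice the number of code units, not twice the number of runes.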

For Little Endian encoding, alternatively you can use the encoding/binary package: it has an exported LittleEndian variable and it has a PutUint16() method.

As for the MD5 checksum, the crypto/md5 package has what you want: md5.Sum() simply returns the MD5 checksum of the byte slice passed to it, as a [16]byte array.

Here's a little function that captures what you want to do:

func utf16leMd5(s string) [16]byte {
	codes := utf16.Encode([]rune(s))
	b := make([]byte, len(codes)*2)
	for i, r := range codes {
		b[i*2] = byte(r)
		b[i*2+1] = byte(r >> 8)
	}
	return md5.Sum(b)
}

Using it:

s := "Hello, playground"
fmt.Printf("%x\n", utf16leMd5(s))

s = "エヌガミ"
fmt.Printf("%x\n", utf16leMd5(s))

Output:

8f4a54c6ac7b88936e990256cc9d335b
5f0db9e9859fd27f750eb1a212ad6212

Try it on the Go Playground.

The variant that uses encoding/binary would look like this:

for i, r := range codes {
	binary.LittleEndian.PutUint16(b[i*2:], r)
}

(Although this may be slightly slower, as each iteration creates a new slice header.)
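
Another possible variant, assuming Go 1.19 or later, uses binary.LittleEndian.AppendUint16, which appends the two bytes to a slice instead of writing into a sub-slice. This is a sketch rather than part of the original answer, and the function name utf16leMD5Append is hypothetical.

```go
package main

import (
	"crypto/md5"
	"encoding/binary"
	"fmt"
	"unicode/utf16"
)

// utf16leMD5Append (hypothetical name) builds the UTF-16LE bytes with
// AppendUint16 and returns their MD5 checksum.
func utf16leMD5Append(s string) [16]byte {
	codes := utf16.Encode([]rune(s))
	// Reserve capacity up front so the appends do not reallocate.
	b := make([]byte, 0, len(codes)*2)
	for _, c := range codes {
		b = binary.LittleEndian.AppendUint16(b, c)
	}
	return md5.Sum(b)
}

func main() {
	// The thread reports 8f4a54c6ac7b88936e990256cc9d335b for this input.
	fmt.Printf("%x\n", utf16leMD5Append("Hello, playground"))
}
```

Whether this is faster or slower than the manual byte shifts is not something this sketch measures; a benchmark would be needed to say.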

Answer 3

Score: 2

So, for reference, I used this complete Python program:

import hashlib
import codecs

md5 = hashlib.md5()
md5.update(codecs.encode('Hello, playground', 'utf-16le'))
response = md5.hexdigest()
print(response)

It prints 8f4a54c6ac7b88936e990256cc9d335b

Here is the Go equivalent: https://play.golang.org/p/Nbzz1dCSGI

package main

import (
	"crypto/md5"
	"encoding/binary"
	"encoding/hex"
	"fmt"
	"unicode/utf16"
)

func main() {
	s := "Hello, playground"

	fmt.Println(md5Utf16le(s))
}

func md5Utf16le(s string) string {
	encoded := utf16.Encode([]rune(s))

	b := convertUTF16ToLittleEndianBytes(encoded)

	return md5Hexadecimal(b)
}

func md5Hexadecimal(b []byte) string {
	h := md5.New()
	h.Write(b)
	return hex.EncodeToString(h.Sum(nil))
}

func convertUTF16ToLittleEndianBytes(u []uint16) []byte {
	b := make([]byte, 2*len(u))
	for index, value := range u {
		binary.LittleEndian.PutUint16(b[index*2:], value)
	}
	return b
}
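
The answers in this thread compute the digest in two ways: the one-shot md5.Sum, which returns a [16]byte array, and the streaming md5.New, whose Sum method returns a []byte. The sketch below shows that both give the same hex string for the same input. The helper names oneShotHex and streamingHex are hypothetical.

```go
package main

import (
	"crypto/md5"
	"encoding/hex"
	"fmt"
)

// oneShotHex (hypothetical helper) hashes b in a single call.
func oneShotHex(b []byte) string {
	sum := md5.Sum(b) // a [16]byte array, not a slice
	// hex.EncodeToString takes a slice, so the array is sliced with sum[:].
	return hex.EncodeToString(sum[:])
}

// streamingHex (hypothetical helper) hashes b through the hash.Hash interface.
func streamingHex(b []byte) string {
	h := md5.New()
	h.Write(b)
	return hex.EncodeToString(h.Sum(nil))
}

func main() {
	data := []byte{'H', 0, 'i', 0} // "Hi" already encoded as UTF-16LE
	fmt.Println(oneShotHex(data) == streamingHex(data)) // true
}
```

The one-shot form is convenient when all bytes are already in memory; the streaming form fits better when the input arrives in pieces.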

huangapple
  • Posted on November 15, 2015, 00:33:31
  • When reposting, please keep the link to this article: https://go.coder-hub.com/33710672.html