Type-safe vs. loosely typed: Go vs. C
Question
In C, the following is sniffer code:

    void Handle_IP(char *buf)
    {
        struct iphdr *ip_hdr; // declaring pointer of type ip header
        struct in_addr in;    // declaring a structure which holds ip address
        FILE *fp;
        int ctl, len;
        /* In the following statement we're adjusting the offset so that
           the ip pointer points to the correct location */
        ip_hdr = (struct iphdr *)(buf + 14);
        ....
    }

Here the line `ip_hdr = (struct iphdr *)(buf + 14)` casts a `char *` to a `struct iphdr *`.

Go, being a type-safe language, does not allow such casts. Go has no `void *`; the closest thing is `interface`, which follows structural typing.

How should such code be approached in Go?
Answer 1
Score: 6
Let's look at this from a different angle: what does typecasting do in C?
Well, in some sense, it doesn't do anything! It just takes some bit pattern at some point in memory, and blindly assumes that this particular bit pattern actually represents a value of some type. But there's no validation of this, no sort of parsing. And that's the problem.
But I already used the word that is the solution to our problem above: parsing. Instead of blindly assuming that some random bit pattern in memory is the same bit pattern that the C compiler would generate for a value of some type, we actually look at the bit pattern, check whether it conforms to the rules for how bit patterns of that datatype are constructed, and deconstruct the bit pattern into its constituent parts according to those rules.
In other words: we write a parser.
Note that this isn't specific to Go. It is a good idea even in C. In fact, the code you posted is broken in at least three, maybe four different ways:
- platforms where a `char` is not 8 bits wide
- alignment
- padding
- endianness
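To illustrate the endianness point, here is a minimal sketch (the helper name `decodeBoth` is made up for this example). It decodes the same two wire bytes, the EtherType value for IPv4, in both byte orders using the standard `encoding/binary` package. A struct cast silently picks whichever order the host CPU uses.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// decodeBoth (hypothetical helper) reads the same two bytes as a
// big-endian and as a little-endian 16-bit integer.
func decodeBoth(b []byte) (be, le uint16) {
	return binary.BigEndian.Uint16(b), binary.LittleEndian.Uint16(b)
}

func main() {
	etherType := []byte{0x08, 0x00} // EtherType for IPv4, as it appears on the wire
	be, le := decodeBoth(etherType)
	// Network protocols use big-endian, so only `be` is the intended value.
	fmt.Printf("big-endian: 0x%04x, little-endian: 0x%04x\n", be, le)
	// prints: big-endian: 0x0800, little-endian: 0x0008
}
```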
AFAIK, there are already IP, UDP, and TCP parsers written in Go that you can either use or learn from. Also, the current system from Alan Kay's Viewpoints Research Institute has a really cool TCP/IP stack written in just 120 lines or so, which is very easy to read. (In fact, the source code for the parser consists of little more than ASCII diagrams cut and pasted from the RFCs.)
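To make "write a parser" concrete, here is a minimal sketch of what the C snippet's cast could become in Go. The `IPv4Header` type and `parseIPv4` function are names invented for this example, not an existing library API. The sketch assumes untagged Ethernet II framing (hence the 14-byte offset from the question) and decodes only a few of the fixed IPv4 header fields described in RFC 791.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
	"net"
)

// IPv4Header (hypothetical type) holds a subset of the IPv4 header fields.
type IPv4Header struct {
	Version  uint8
	IHL      uint8 // header length in 32-bit words
	TotalLen uint16
	TTL      uint8
	Protocol uint8
	Src, Dst net.IP
}

const ethHeaderLen = 14 // assumption: untagged Ethernet II frame

// parseIPv4 validates and decodes the IPv4 header that follows the
// Ethernet header in frame, instead of casting the bytes to a struct.
func parseIPv4(frame []byte) (IPv4Header, error) {
	if len(frame) < ethHeaderLen+20 {
		return IPv4Header{}, errors.New("frame too short for an IPv4 header")
	}
	b := frame[ethHeaderLen:] // the safe equivalent of buf + 14
	h := IPv4Header{
		Version:  b[0] >> 4,
		IHL:      b[0] & 0x0f,
		TotalLen: binary.BigEndian.Uint16(b[2:4]), // explicit byte order
		TTL:      b[8],
		Protocol: b[9],
		Src:      net.IP(append([]byte(nil), b[12:16]...)), // copy, do not alias frame
		Dst:      net.IP(append([]byte(nil), b[16:20]...)),
	}
	if h.Version != 4 {
		return IPv4Header{}, fmt.Errorf("not IPv4: version %d", h.Version)
	}
	if h.IHL < 5 || int(h.IHL)*4 > len(b) {
		return IPv4Header{}, fmt.Errorf("invalid header length: %d words", h.IHL)
	}
	return h, nil
}

func main() {
	// Build a synthetic frame: 14 zero bytes of Ethernet header + 20-byte IPv4 header.
	frame := make([]byte, ethHeaderLen+20)
	ip := frame[ethHeaderLen:]
	ip[0] = 0x45 // version 4, IHL 5
	binary.BigEndian.PutUint16(ip[2:4], 20)
	ip[8] = 64 // TTL
	ip[9] = 6  // protocol number for TCP
	copy(ip[12:16], []byte{192, 168, 0, 1})
	copy(ip[16:20], []byte{10, 0, 0, 2})

	h, err := parseIPv4(frame)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(h.Src, "->", h.Dst, "ttl", h.TTL, "proto", h.Protocol)
}
```

Because every field is read by explicit offset and byte order, alignment, padding, and host endianness no longer matter, and a short or malformed packet yields an error rather than an out-of-bounds read.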
Answer 2
Score: 3
**Foreword:** Even though what you want is possible, try to avoid using package `unsafe` as much as possible. This is not a "go to first" solution; think of it as a last resort (or something that comes after that).

Go does give you support for this in package `unsafe`.

Even the spec has a dedicated section for it, Package `unsafe`:

> The built-in package `unsafe`, known to the compiler, provides facilities for low-level programming including operations that violate the type system. A package using `unsafe` must be vetted manually for type safety and may not be portable.

The package was intentionally named `unsafe`, giving you fair warning: if something goes wrong, don't blame the compiler or the language.
What you need is the `unsafe.Pointer` type. `Pointer` is a special pointer type that may not be dereferenced, but any pointer type can be converted to `Pointer`, and `Pointer` can be converted to any (other) pointer type. So this is your "gateway" between different types.
For example, if you have a value of type `float64` (which is 8 bytes in memory), you can interpret those 8 bytes as an `int64` (which is also 8 bytes in memory) like this:

    var f float64 = 1
    var i int64
    ip := (*int64)(unsafe.Pointer(&f))
    i = *ip
    fmt.Println(i)

Output:

    4607182418800017408
The key is this line:

    (*int64)(unsafe.Pointer(&f))

It means: take the address of `f` (which is of type `*float64`), convert it to `unsafe.Pointer` (any pointer type can be converted to `Pointer`), then convert this `unsafe.Pointer` value to another pointer type, `*int64`. If we dereference this pointer, we get a value of type `int64`.
In your example you want to "place" a variable at an address with an offset applied. Go does not have pointer arithmetic. You can get around this in two ways:

- Use `uintptr`, which can hold an address but which you can treat as an `int` and add values to.
- Or use a pointer to a "buffer" type, e.g. `*[1<<31]byte`; you can convert the address to this pointer type, apply the offset by indexing the buffer at the given offset, and take the address of that element, e.g. `&buf[14]`.
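Method #1 could look like this. Two details matter: the conversion to `uintptr`, the addition, and the conversion back to `unsafe.Pointer` must happen in a single expression, and because only two bytes remain after offset 14, the result is read as a `[2]byte` (reading a full `X` there would run past the end of the array):

    type X [16]byte
    var x = X{0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15}
    // Convert to uintptr, add the offset, and convert back in one expression.
    p := unsafe.Pointer(uintptr(unsafe.Pointer(&x[0])) + 14) // APPLY OFFSET
    // Only bytes 14 and 15 are left, so view them as a [2]byte.
    x2 := *(*[2]byte)(p)
    fmt.Println(x2[0], x2[1])

Output:

    14 15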
Method #2 could look like this (again reading only the two bytes that remain after offset 14):

    type X [16]byte
    var x = X{0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15}
    xx := (*X)(unsafe.Pointer(&x[0]))
    p := unsafe.Pointer(&xx[14]) // APPLY OFFSET by indexing the buffer
    x2 := *(*[2]byte)(p)
    fmt.Println(x2[0], x2[1])

The output is the same.
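Since Go 1.17, package `unsafe` also provides `unsafe.Add`, which applies an offset to an `unsafe.Pointer` without the round trip through `uintptr`. The following is a minimal sketch under that assumption (Go 1.17 or newer); `twoBytesAt` is a hypothetical helper, and it checks bounds explicitly because `unsafe.Add` does not.

```go
package main

import (
	"fmt"
	"unsafe"
)

// twoBytesAt (hypothetical helper) returns the two bytes of buf starting
// at offset off, read through a pointer produced by unsafe.Add.
func twoBytesAt(buf *[16]byte, off int) [2]byte {
	// unsafe.Add performs no bounds checking, so validate the offset here.
	if off < 0 || off > len(buf)-2 {
		panic("offset out of range")
	}
	p := unsafe.Add(unsafe.Pointer(buf), off) // APPLY OFFSET
	return *(*[2]byte)(p)
}

func main() {
	x := [16]byte{0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15}
	fmt.Println(twoBytesAt(&x, 14)) // prints: [14 15]

	// When only bytes are needed, plain slicing gives the same view safely.
	fmt.Println(x[14:16]) // prints: [14 15]
}
```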