How is VALUE in Ruby sometimes a pointer, and sometimes not?
Question
I mean they basically assume that there can't be a pointer with any of the 3 least significant bits (or more) set.
Here's a function that determines the type of a value:

    static inline enum ruby_value_type
    rb_type(VALUE obj)
    {
        if (! RB_SPECIAL_CONST_P(obj)) {
            return RB_BUILTIN_TYPE(obj);
        }
        else if (obj == RUBY_Qfalse) {
            return RUBY_T_FALSE;
        }
        else if (obj == RUBY_Qnil) {
            return RUBY_T_NIL;
        }
        else if (obj == RUBY_Qtrue) {
            return RUBY_T_TRUE;
        }
        else if (obj == RUBY_Qundef) {
            return RUBY_T_UNDEF;
        }
        ...

RUBY_Q* are constants:

    RUBY_Qfalse         = 0x00, /* ...0000 0000 */
    RUBY_Qnil           = 0x04, /* ...0000 0100 */
    RUBY_Qtrue          = 0x14, /* ...0001 0100 */
    RUBY_Qundef         = 0x24, /* ...0010 0100 */
    RUBY_IMMEDIATE_MASK = 0x07, /* ...0000 0111 */
    RUBY_FIXNUM_FLAG    = 0x01, /* ...xxxx xxx1 */
    RUBY_FLONUM_MASK    = 0x03, /* ...0000 0011 */
    RUBY_FLONUM_FLAG    = 0x02, /* ...xxxx xx10 */
    RUBY_SYMBOL_FLAG    = 0x0c, /* ...xxxx 1100 */
    ...
    RUBY_SPECIAL_SHIFT  = 8 /**< Least significant 8 bits are reserved. */

    static inline bool
    RB_SPECIAL_CONST_P(VALUE obj)
    {
        return RB_IMMEDIATE_P(obj) || obj == RUBY_Qfalse;
    }

    static inline bool
    RB_IMMEDIATE_P(VALUE obj)
    {
        return obj & RUBY_IMMEDIATE_MASK;
    }

    static inline enum ruby_value_type
    RB_BUILTIN_TYPE(VALUE obj)
    {
        ...
        VALUE ret = RBASIC(obj)->flags & RUBY_T_MASK;
        return RBIMPL_CAST((enum ruby_value_type)ret);
    }

    #define RBASIC(obj) RBIMPL_CAST((struct RBasic *)(obj))

So, unless one of the 3 least significant bits is set, they use VALUE as a pointer. What makes them sure they won't one day receive a pointer with any of those bits set?
Answer 1
Score: 3
All pointers to objects from Ruby will come from malloc et al.
From the man page for malloc:
> The malloc() and calloc() functions return a pointer to the allocated memory, which is suitably aligned for any built-in type.
Thus, malloc must allow (e.g.):

    double *dptr = malloc(sizeof(*dptr));

The alignment for double is 8 bytes.
This is true for all POSIX-compliant systems.
Actually, it's usually the CPU architecture that dictates the alignment.
- Some systems require such alignment (or an access will cause the H/W to generate an alignment exception).
- Even if the H/W will tolerate a non-aligned fetch/store, the S/W will generate the natural alignment anyway [sometimes for performance reasons--the aligned access can be faster].
So, at a minimum, the least significant 3 bits of a pointer must be 0.
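To make that concrete, here is a minimal sketch that inspects the low 3 bits of addresses returned by malloc. The helper names (low3_bits, malloc_low_bits_clear) are made up for illustration and are not part of Ruby. The static_assert encodes the assumption that the platform's strictest fundamental alignment is at least 8, which holds on mainstream platforms but is still an assumption.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumption: the strictest fundamental alignment is at least 8 bytes. */
static_assert(alignof(max_align_t) >= 8, "expected at least 8-byte alignment");

/* Hypothetical helper: the 3 least significant bits of an address. */
static unsigned low3_bits(const void *p)
{
    return (unsigned)((uintptr_t)p & 0x07u);
}

/* Allocates `count` blocks (at most 64) of `size` bytes and returns
 * 1 if every returned address has its low 3 bits clear, 0 if some
 * address does not, and -1 if an allocation fails. */
static int malloc_low_bits_clear(size_t size, int count)
{
    void *blocks[64];
    int n = 0, result = 1;

    if (count > 64) count = 64;
    for (; n < count; n++) {
        blocks[n] = malloc(size);
        if (blocks[n] == NULL) { result = -1; break; }
        if (low3_bits(blocks[n]) != 0) result = 0;
    }
    while (n > 0) free(blocks[--n]);   /* release everything we got */
    return result;
}
```

Calling malloc_low_bits_clear(sizeof(double), 32) from a main function should return 1 on a typical desktop or server platform. The blocks are kept alive until the end so that the allocator hands out distinct addresses.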
For some arches, the compiler supports (e.g.) __int128, so the required alignment would be 16.
Also, some systems (e.g. x86) support SIMD types which are 16 bytes. So, again, an alignment of 16 bytes.
For 16 byte alignment, the least significant 4 bits would be 0.
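The relationship between alignment and zero bits is just a base-2 logarithm: an address that is a multiple of 2^k has its k lowest bits clear. A small sketch of that arithmetic follows; guaranteed_zero_bits is a hypothetical helper written for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: number of low address bits that are always 0
 * for an address that is a multiple of `alignment`.
 * `alignment` must be a power of two. */
static unsigned guaranteed_zero_bits(size_t alignment)
{
    unsigned bits = 0;

    /* Precondition: non-zero power of two. */
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);

    while (alignment > 1) {
        alignment >>= 1;   /* each halving accounts for one zero bit */
        bits++;
    }
    return bits;
}
```

With this helper, an alignment of 8 gives 3 zero bits and an alignment of 16 gives 4, matching the numbers above.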
Although you only rely on 8 byte alignment [and the 3 least significant bits being 0], frequently [but not always] the least significant 4 bits are 0.
So, you can always rely on the least significant 3 bits being 0.
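As a rough illustration of why those spare bits are useful, the sketch below packs either a small integer or a heap pointer into one machine word, in the spirit of the constants quoted in the question. It is not Ruby's implementation: value_t, int_to_value, value_to_int, is_special_const and is_heap_pointer are hypothetical names, and only the mask and flag values are taken from the question.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

typedef uintptr_t value_t;      /* stand-in for Ruby's VALUE (assumption) */

enum {
    Q_FALSE        = 0x00,      /* same value as RUBY_Qfalse */
    IMMEDIATE_MASK = 0x07,      /* same value as RUBY_IMMEDIATE_MASK */
    FIXNUM_FLAG    = 0x01       /* same value as RUBY_FIXNUM_FLAG */
};

/* Encode an integer: shift left by one and set the lowest bit. */
static value_t int_to_value(intptr_t n)
{
    return ((value_t)n << 1) | FIXNUM_FLAG;
}

/* Decode an integer. Assumes an arithmetic right shift for negative
 * numbers, which mainstream compilers provide but ISO C does not require. */
static intptr_t value_to_int(value_t v)
{
    assert(v & FIXNUM_FLAG);    /* only valid for tagged integers */
    return (intptr_t)v >> 1;
}

/* Mirrors the logic of RB_SPECIAL_CONST_P from the question. */
static bool is_special_const(value_t v)
{
    return (v & IMMEDIATE_MASK) != 0 || v == Q_FALSE;
}

/* A word with all 3 low bits clear (and non-zero) is treated as a pointer. */
static bool is_heap_pointer(value_t v)
{
    return !is_special_const(v);
}
```

Because malloc returns addresses with the low 3 bits clear, is_heap_pointer((value_t)malloc(32)) is true for a successful allocation, while is_heap_pointer(int_to_value(21)) is false. The two kinds of value can never collide, which is the property the question asks about.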