Ruby VALUEs and object_ids

by Caleb Tennis

In the last entry, I talked about Ruby VALUEs and what they mean. A few readers brought up a good point: object_ids and VALUEs look very similar.

If we go inside Ruby for a second, here's how the object_id is calculated:

VALUE
rb_obj_id(VALUE obj)
{
    if (SPECIAL_CONST_P(obj)) {
        return LONG2NUM((long)obj);
    }
    return (VALUE)((long)obj|FIXNUM_FLAG);
}

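SPECIAL_CONST_P is built from a few other macros, which are built from a few more, and so on. Followed to the end, it asks whether the VALUE is an immediate (a Fixnum, a Symbol, true) or one of nil and false, rather than a pointer to an object on the heap.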
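The branch in rb_obj_id can be modelled in plain Ruby integer arithmetic. This is only a sketch: the constant values are the ones I believe Ruby 1.8's ruby.h uses, and special_const? and model_object_id are made-up names for illustration, not part of Ruby. It works on raw integers standing in for VALUEs, so it does not depend on what object_id returns in whatever Ruby you run it on.

```ruby
# Constants as I believe they appear in Ruby 1.8's ruby.h (assumed here).
FIXNUM_FLAG    = 0x01
IMMEDIATE_MASK = 0x03
QFALSE = 0
QTRUE  = 2
QNIL   = 4

# Models SPECIAL_CONST_P: immediates have one of the low two bits set,
# and false/nil are the only VALUEs with no bits set outside Qnil's bit.
def special_const?(value)
  (value & IMMEDIATE_MASK) != 0 || (value & ~QNIL) == 0
end

# Models rb_obj_id, returning the integer that Ruby code would see.
def model_object_id(value)
  if special_const?(value)
    value                        # LONG2NUM: the raw bits as a number
  else
    (value | FIXNUM_FLAG) >> 1   # tagged as a Fixnum, then read back
  end
end

puts model_object_id(QNIL)     # => 4
puts model_object_id(QFALSE)   # => 0
puts model_object_id(QTRUE)    # => 2
puts model_object_id(0b1011)   # => 11 (the VALUE for the Fixnum 5)
puts model_object_id(0x1000)   # => 2048 (a made-up aligned pointer)
```

Under this model, a special constant's object_id is its VALUE read as a number, while a pointer with the Fixnum flag set reads back as the pointer shifted right by one bit.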

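What ends up happening, in general, is that the object_id comes straight from the VALUE. For special constants it is the VALUE itself, read as an integer. For everything else, the pointer gets the Fixnum flag set, and since a Fixnum keeps its number in the upper bits, that reads back as the pointer shifted right by one.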

irb(main):001:0> "some_string".object_id
=> 1136556
irb(main):002:0> nil.object_id
=> 4
irb(main):003:0> false.object_id
=> 0
irb(main):004:0> true.object_id
=> 2
irb(main):005:0> 5.object_id
=> 11

Why does 5 have an object_id of 11? Well, don't forget that Fixnums are stored in the upper 31 bits, which means a shift left of one. We also use the lowest bit to mark that it is a Fixnum.

Thus 0b0101 (5) becomes 0b1011 (11).
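The same shift-and-flag arithmetic can be tried out directly. A minimal sketch, with fixnum_to_value and value_to_fixnum as hypothetical helper names (inside Ruby this is done by C macros):

```ruby
# Encode an integer the way a Fixnum VALUE is laid out:
# shift left by one, then set the lowest bit as the Fixnum marker.
def fixnum_to_value(n)
  (n << 1) | 1
end

# Decode: drop the marker bit by shifting right by one.
def value_to_fixnum(value)
  value >> 1
end

value = fixnum_to_value(5)
puts value                   # => 11
puts value.to_s(2)           # => 1011
puts value_to_fixnum(value)  # => 5
```

Because the right shift is arithmetic, the round trip also holds for negative numbers in this sketch.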

If we look back at our string object from above, its object_id (1136556) is 0b100010101011110101100 in binary. Note that the lower two bits are 0, which fits a VALUE that is a memory address aligned to at least 4 bytes (see previous entry for why).
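The low bits of the string's number are easy to check by hand. A small sketch, where low_bits is a hypothetical helper and 1136556 is simply the number irb printed above:

```ruby
# Return the lowest `count` bits of n as an integer.
def low_bits(n, count)
  n & ((1 << count) - 1)
end

string_id = 1136556             # the number irb reported for "some_string"

puts string_id.to_s(2)          # => 100010101011110101100
puts low_bits(string_id, 2)     # => 0
puts low_bits(0b1011, 1)        # => 1 (a Fixnum VALUE has its lowest bit set)
```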

What about symbols?

irb(main):002:0> :foo.object_id
=> 3895566
irb(main):004:0> :foo.to_i
=> 15217

So :foo's object_id is 0b1110110111000100001110 and its to_i gives 0b11101101110001. First notice that the object_id has a 1 in the 2nd lowest bit. This again denotes a special type. Also note that there's something similar between the object_id and the to_i value: if you chop off the rightmost 8 bits from the object_id, you get the same number as to_i.

This may come into play later.
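The relationship between the two symbol numbers can be checked with a couple of shifts. This is a sketch using the values from the irb session above. The 0x0e marker is what the low byte of that object_id works out to, and I believe it matches the symbol flag in Ruby 1.8's ruby.h, but treat the constant and the helper names as assumptions.

```ruby
SYMBOL_MARKER = 0x0e   # low byte of :foo's object_id above (assumed symbol flag)
SYMBOL_SHIFT  = 8      # number of low bits taken up by the marker

# Chop off the marker byte to recover the number to_i reported.
def symbol_number(object_id)
  object_id >> SYMBOL_SHIFT
end

# Rebuild the object_id from the to_i number.
def symbol_object_id(number)
  (number << SYMBOL_SHIFT) | SYMBOL_MARKER
end

foo_object_id = 3895566

puts symbol_number(foo_object_id)      # => 15217
puts foo_object_id & 0xff              # => 14 (0x0e)
puts symbol_object_id(15217)           # => 3895566
```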


2006-02-01 18:32:49
And deeper down the rabbit hole we go! This is a very interesting and insightful series. Can't wait for the next installment.
2006-02-02 09:02:41
What a clever scheme! I am very impressed. Do you know if this is something Matz came up with himself, or is this a well-known technique?

Anyways, thanks for the cool post

2006-02-03 04:20:38
It's a very common technique.

Lisp implementations have been tagging integers/pointers "forever".