The Ruby VALUE
by Caleb Tennis
The first point of interest is the VALUE - Ruby's internal representation of its objects. In the general sense, a VALUE is just a C pointer to a Ruby object data type. We use VALUEs in the C code like we would use objects in the Ruby code.
One would expect that the VALUE is just a typedef to a C pointer and there's a lookup table as to which object it represents, and this would be partially correct. However, there's also some trickery involved.
Instead of implementing the VALUE as a pointer, Ruby implements it as an unsigned long. It just so happens that sizeof(void *) == sizeof(long) - at least on the platforms I'm familiar with. After all, what is a pointer? It's just an n-byte integer that represents a memory address.
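If you want to check that size assumption on your own machine without writing any C, one rough way is to ask Array#pack how many bytes it uses for a native long and for a pointer-sized integer. This is only a sketch: the helper name native_sizes is mine, and the "J" directive assumes Ruby 2.3 or later.

```ruby
# Byte sizes of a native unsigned long ("L!") and of a pointer-sized
# unsigned integer ("J", i.e. uintptr_t; assumes Ruby 2.3 or later).
# native_sizes is a hypothetical helper name, not part of Ruby.
def native_sizes
  {
    long: [0].pack("L!").bytesize,
    pointer: [0].pack("J").bytesize
  }
end

sizes = native_sizes
puts "sizeof(long)   = #{sizes[:long]}"
puts "sizeof(void *) = #{sizes[:pointer]}"
puts "same size?       #{sizes[:long] == sizes[:pointer]}"
```

The two sizes match on typical Linux and macOS builds. They can differ on platforms where long stays at 4 bytes while pointers grow to 8, such as 64-bit Windows.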
But because of this, there are some tricks Ruby can perform.
First, for performance purposes, Ruby doesn't use the VALUE as a pointer in every instance. For Fixnums, Ruby stores the number value directly in the VALUE itself. That keeps us from having to keep a lookup table of every possible Fixnum in the system.
The trick lies in the fact that object pointers are aligned on 4-byte boundaries (8 bytes on 64-bit systems). For example, if one object were stored at 0x0000F000, the next possible address would be 0x0000F004. This jump from 00 to 04 in the lowest byte is important. Written out as bits, the two bytes are 00000000 and 00000100. This means that if we use the VALUE as a pointer, the lowest two bits will always be 0s.
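To see the alignment argument in numbers, here is a small sketch that masks off the two lowest bits of a few addresses spaced 4 bytes apart. The addresses are made up for illustration (they continue the 0x0000F000 example), and low_two_bits is a hypothetical helper name.

```ruby
# Keep only the two lowest bits of an address.
def low_two_bits(address)
  address & 0b11
end

# Made-up addresses, each 4 bytes after the previous one.
addresses = [0x0000F000, 0x0000F004, 0x0000F008, 0x0000F00C]

addresses.each do |address|
  printf("0x%08X  low byte: %08b  low two bits: %02b\n",
         address, address & 0xFF, low_two_bits(address))
end
```

Any address that is a multiple of 4 leaves those two bits clear, which is what frees them up for other uses.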
Ruby uses this to its advantage. It will tuck a 1 in the lowest bit, and then use the rest of the space (31 bits on a 32-bit system) to store a Fixnum. One of those bits is used for the sign, so a Ruby Fixnum can be up to 30 bits in length.
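The tagging itself can be modelled with plain integer arithmetic. The following is a sketch of the idea, not Ruby's actual C code: the three method names are hypothetical, and Ruby integers stand in for the machine word.

```ruby
# Pack an integer into a tagged word: shift left one place, set the low bit.
def fixnum_to_value(n)
  (n << 1) | 1
end

# Unpack it again: an arithmetic right shift drops the tag, keeps the sign.
def value_to_fixnum(value)
  value >> 1
end

# A word holds a Fixnum when its lowest bit is 1.
def fixnum_value?(value)
  (value & 1) == 1
end

[0, 1, 42, -3].each do |n|
  value = fixnum_to_value(n)
  puts "#{n} -> #{value} -> #{value_to_fixnum(value)} (tagged: #{fixnum_value?(value)})"
end
```

Because a real pointer always has a 0 in the lowest bit, a single bit test is enough to tell the two cases apart.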
irb(main):021:0> (2 ** 30).class
=> Bignum
irb(main):022:0> (2 ** 30 - 1).class
=> Fixnum
irb(main):024:0> (-(2 ** 30)).class
=> Fixnum
irb(main):025:0> (-(2 ** 30)-1).class
=> Bignum
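The boundaries probed in that session follow directly from the bit budget: one bit for the tag and one for the sign. Here is a sketch that computes the range for a given word size; fixnum_range is a hypothetical helper, and the 32-bit width is the assumption this post works under.

```ruby
# Range of integers that fit in a tagged word of the given width,
# assuming one bit is spent on the tag and one on the sign.
def fixnum_range(word_bits)
  magnitude_bits = word_bits - 2
  (-(2**magnitude_bits))..(2**magnitude_bits - 1)
end

range32 = fixnum_range(32)
[2**30, 2**30 - 1, -(2**30), -(2**30) - 1].each do |n|
  puts "#{n}: #{range32.include?(n) ? 'fits in a Fixnum' : 'needs a Bignum'}"
end
```

Calling fixnum_range(64) gives the wider range you would expect from a 64-bit build. Note that newer Ruby versions report Integer for both cases, so the .class check only shows the split on older interpreters.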
Ruby uses the other low bit (the second of the two that alignment leaves free) to help distinguish other common types, like false, true, and nil. Symbols and their IDs are also stored with this bit on, so Ruby recognizes such a VALUE as a special instance and interprets it accordingly.
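Putting the pieces together, a decoder for a 32-bit VALUE might look roughly like the sketch below. The specific numbers (0 for false, 2 for true, 4 for nil, and a low byte of 0x0e for Symbols) are what I recall from the ruby.h of the Ruby 1.8 era, so treat them as an assumption; later versions use different values. classify_value is a hypothetical name, and the checks are not exhaustive.

```ruby
# Rough classifier for a raw 32-bit VALUE.
# The constants are assumed from Ruby 1.8's ruby.h and differ in newer Rubies.
def classify_value(value)
  qfalse = 0          # assumed encoding of false
  qtrue = 2           # assumed encoding of true
  qnil = 4            # assumed encoding of nil
  symbol_flag = 0x0e  # assumed low byte of a Symbol VALUE

  return "Fixnum" if (value & 1) == 1
  return "false"  if value == qfalse
  return "true"   if value == qtrue
  return "nil"    if value == qnil
  return "Symbol" if (value & 0xff) == symbol_flag
  "pointer to an object"
end

[7, 0, 2, 4, (5 << 8) | 0x0e, 0x0000F004].each do |value|
  printf("0x%08X -> %s\n", value, classify_value(value))
end
```

Notice that true and the Symbol pattern both have the second-lowest bit set, which no aligned pointer can have.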
The rest of the time a VALUE is a good old-fashioned memory address, which points to an object structure in memory.
So there you have it. I hope this little snippet was of some VALUE to you.
Is this why the object_id of nil is 4? Does that mean object_id is just an address? What would the object_id of a Fixnum, true and false be in that case?
It's posts like these that make me wish for a way of "tipping" the author so that in addition to a comment about how useful the post was, the author could derive some economic benefit from it and feel an incentive to write another along the same lines. Lacking that ability, I'll have to settle for offering praise at a great topic that there's not enough content on the web about. On the other hand, you appear to live in the Chicago area, so perhaps I can bribe you with beer...
This helps clear up some issues with understanding symbol usage and the benefits. Thanks for the insight!
Justin: you're on to something, and I think I'll probably write about it in my next post.
Thanks for the praise and I appreciate the virtual tip. I plan to keep writing this kind of stuff as long as I have the material and people find it interesting. In the meantime, you can always tell the O'Reilly folks what you think about the entries - the Contact Us link at the bottom of this page has some e-mails of people who I'm sure would like to hear your feedback.
jperkins: Click on a banner ad - it's easy enough and does provide a little tip (though in this case probably to O'Reilly rather than Caleb).
>> It just so happens that sizeof(void *) == sizeof(long) - at least on the platforms I'm familiar with. <<
Thanks for the post, I've been looking for this kind of stuff about Ruby. I've heard there's a book that focuses on the internals of Ruby in (sigh, what else) Japanese, but my Japanese is non-existent, so that's out of my reach.
Very informative, keep up the great work ;-)