Silently Discarding Information?

by Curtis Poe

Disclaimer: I have programmed in both Ruby and Python, but not enough to be familiar with their conventions, so the following could be a serious misunderstanding on my part.

Any Python or Ruby programmers out there able to explain why these languages default to integer math?

$ python -c 'print 7/2'
3

$ ruby -e 'puts 7/2'
3

22 Comments

Steve
2006-12-10 09:13:25
From the PEP below it looks like it's a design bug in Python that they are working on fixing. It's just one of those things you deal with :)


http://www.python.org/dev/peps/pep-0238/

Kevin
2006-12-10 10:59:51
Check out Guido's "Python Regrets" presentation (http://www.python.org/doc/essays/ppt/regrets/PythonRegrets.ppt)... he specifically mentions "int/int returning int" was a mistake. It will be changed.


In fact, you can use the functionality now:
>>> from __future__ import division
>>> 1/2
0.5
>>> 1 // 2 # double-slash performs int division
0

Fred
2006-12-10 11:26:28
In ruby you can use divmod


7.divmod(2) => [3, 1]


One reason for defaulting to int math might be that it was something the authors were so used to coming from C, they just didn't question it.

Taylor Venable
2006-12-10 13:20:48
I think Ruby and Python get that behavior from C. And C probably gets it from the hardware for which it was written.


I'm not sure Perl computes the "correct" result because of dynamic typing. Ruby and Python are both dynamically typed, but Perl is more weakly typed than those other two. As a result, it is happy to coerce the result of an operation on two ints into a float. If the language really differentiated much between ints and floats, this would be a problem. But Perl doesn't -- it coerces back and forth at will. As an example, try retrieving element 3/4 from an array. It works!


Personally, I don't care for any of these methods. I'd prefer that true division only be done on reals, not ints. You could write another function that operated on ints and did integer division, but give it a different name, like `div`. This is the course taken by Standard ML, if I'm not mistaken. Another alternative is to have division represented internally as a fraction, ala Scheme. Again, a div function is supplied for integer division.
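
A minimal sketch of those two alternatives, assuming Python 3 and its standard `fractions` module. The helper name `div` is hypothetical, chosen to mirror the separately named integer-division function described above; it is not a built-in.

```python
from fractions import Fraction


def div(a: int, b: int) -> int:
    """Integer division under its own name (hypothetical helper).

    Uses floor division, so the result rounds toward negative infinity.
    """
    return a // b


exact = Fraction(7, 2)    # division kept as an exact rational
as_float = float(exact)   # explicit, visible conversion to a float
quotient = div(7, 2)      # integer division, requested by name

print(exact, as_float, quotient)  # prints: 7/2 3.5 3
```

With this split, the integer result is only produced when it is asked for by name, and no precision is dropped until the explicit `float()` call.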

NB
2006-12-10 19:53:46
I don't think that your question reflects a "serious misunderstanding" about Ruby or Python, I think it reflects a "serious misunderstanding" about mathematics. It isn't correct to suggest that 0.5 is a more accurate representation of 1/2. The best representation of 1/2 is 1/2. This is obvious when you consider something like 1/9. It would be obviously wrong to say that "0.111111" is a better representation of 1/9 -- 1/9 is clearly the best way to put it. It makes sense to always have to think about making a conversion from a rational number to some decimal representation of it, because there's an excellent chance that you're actually going to *lose* information in the decimal representation.

Beo
2006-12-10 21:18:46
This has been a widely known issue in python for quite some time. It has already been "fixed" and awaits the 3.0 release before it will be the default behaviour. For example, try the following on the command prompt in python:


2/7
from __future__ import division
2/7


The second "2/7" evaluates to the floating point number.
See this pep for more information: http://www.python.org/dev/peps/pep-0238/


2006-12-10 21:21:26
"It makes sense to always have to think about making a conversion from a rational number to some decimal representation of it, because there's an excellent chance that you're actually going to *lose* information in the decimal representation."


I think you missed the point.


3 is _not_ a representation of 7/2, good or bad.

Ovid
2006-12-11 00:09:17

NB wrote: It isn't correct to suggest that 0.5 is a more accurate representation of 1/2. The best representation of 1/2 is 1/2.


With all due respect, you are mistaken. .5 and 1/2 are equal. They can be used interchangeably. It's not true to say that 1/2 is a better representation as which representation to use depends upon how you're going to use them. Multiplying it by 1/9? The fraction is easier. Multiplying it by 2.3? The decimal is easier. It's merely a matter of convenience, in this case.


With 1/9, it's an entirely different story. It's not possible to completely write that out as an equivalent to a decimal number (though there are conventions to represent it), so if I were to suggest that 1/9 equals 0.1111111, you'd be right to take me to task.


However, no amount of claiming I have a serious misunderstanding over such basic math will convince me that 7/2 is equivalent to 3. I lose no information by representing that result as 3.5.


Of course, all of this sidesteps the well-known floating point issues with computer math and how 3.5 might be represented as 3.49999999999 or somesuch. That might have been a valid approach to criticizing my question (and even that FP representation is not necessarily too bad, depending upon the required precision).


So back to my question: why would representing 7/2 as 3 be preferable to a more accurate answer of 3.5 or even 3.499999999999999?
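
On the floating-point aside above, a quick check is possible. This is a sketch assuming Python 3 (where `/` is true division) and an ordinary binary floating-point `float`; the variable names are illustrative only. It compares the value a float actually stores against the exact rational.

```python
from fractions import Fraction

seven_halves = 7 / 2   # true division in Python 3
one_ninth = 1 / 9

# Fraction(some_float) recovers the exact value the float stores.
# 7/2 has a power-of-two denominator, so the float holds it exactly.
is_exact_7_2 = Fraction(seven_halves) == Fraction(7, 2)

# 1/9 has no finite binary expansion, so the float is only an approximation.
is_exact_1_9 = Fraction(one_ninth) == Fraction(1, 9)

print(is_exact_7_2, is_exact_1_9)  # prints: True False
```

Under those assumptions, 3.5 comes out exact, while 1/9 is the case where a float really does lose information.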


2006-12-11 07:52:21
BTW I believe Perl gets it right because it represents all numbers internally as doubles. This can have its own drawbacks if the programmer is not aware of it.

Adriano Ferreira
2006-12-11 09:04:54

Anonymous said: BTW I believe Perl gets it right because it represents all numbers internally as doubles. This can have its own drawbacks if the programmer is not aware of it.


I think you don't know much about Perl internal representations. You may read at http://perldoc.perl.org/perlnumber.html:


Perl can internally represent numbers in 3 different ways: as native integers, as native floating point numbers, and as decimal strings.


There's a difference between the fact that Perl uses a few data types (scalar, arrays, hashes) and the huge harmonizing effort of the internals to be consistent and efficient (for instance, going from numbers to strings and vice-versa).

Adrian Howard
2006-12-11 09:28:30
Not a Python head, but on the Ruby front take a look at the (standard) 'mathn' module:


% ruby -e "require 'mathn'; puts 7/2"
7/2
(and that's a proper rational too - none of that floating point nonsense :-)


Just like with Perl - with Ruby TIMTOWTDI - they just picked a different default

artlogic
2006-12-11 14:26:46
Ovid Wrote: So back to my question: why would representing 7/2 as 3 be preferable to a more accurate answer of 3.5 or even 3.499999999999999?


As it's been said before, eventually this will act more like you think it should, and you can make it that way now by using: from __future__ import division. That said, I believe that it comes down to a question of what the / operator actually means in any particular language. In the case of Python < 3.0, the / operator means integer division if you have two integers. Is information being discarded? Not in my opinion. In the case of integer division 7 / 2 = 3, and for that matter 7 % 2 = 1 (using the modulus operator). In computer science, integer division and modulus are much more common than floating point division, and I imagine the language implementors chose to implement this behavior as the default due to their familiarity with these concepts.
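
The quotient-and-remainder view can be sketched as follows, assuming Python 3, where `//` is the integer (floor) division operator and `divmod` is a built-in. The names `q` and `r` are illustrative.

```python
# Integer division plus modulus together keep everything about the operands.
q, r = divmod(7, 2)                    # quotient 3, remainder 1

same_as_operators = (7 // 2, 7 % 2) == (q, r)
reconstructed = 2 * q + r              # rebuilds the original numerator

print(q, r, reconstructed)  # prints: 3 1 7
```

The quotient on its own still drops the remainder, which is the part of the answer the original post is asking about.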


Was this the correct choice? Guido obviously regrets it. Why? IMHO it's because the expected behavior by the general programming public is floating point division. Those who want integer division generally are thinking "integer divide" in their head.


Why hasn't it been fixed yet? Compatibility. Guido does not break compatibility inside of a major revision number (i.e. code written for python 2.0 should work on any python 2.X).


Python does have a warning system - but as it stands -Wall does not display a warning for integer division - this might be a good idea, but I honestly don't know enough about the warning system to say if it is or isn't.


2006-12-11 17:50:59
The reason 7/2 = 3 is that 7 and 2 are integers, hence integer division is used. This is just the approach that makes the most sense from a language design standpoint.


The issue is really whether or not a high-level language like Python should provide such a low-level operation as integer division as part of the language's grammar.


I think that for high-level languages the answer should be no, but I'll leave it to Guido.



2006-12-12 21:07:49
Java, too, does integer math when you do 7/2 (it results in 3). So Python and Ruby are not the only "culprits".

Ovid
2006-12-12 23:56:11

Anonymous: regarding the "Java also fails" assertion, this is sort of true, but for a statically typed language, it's expected. It's not something I would expect of a dynamically typed language which users generally expect to "do the right thing." Plus, like C, it's trivial to force java to return a float or double:


public class Divide {
    public static void main (String args[]) {
        float foo = (float)7/3;
        System.out.println(foo);
    }
}


At least with that, it really doesn't matter what types your numerator and denominator are. You can get around it in Python (with the from __future__ import division) and in Ruby with divmod or other strange tricks, or you could multiply one of the numbers by a float to force a float, but even that fails if you screw up the precedence (7/2*1.0 fails in Ruby and Python but 1.0*7/2 does not).

Nathan Humble
2006-12-13 00:58:04
People are saying that it's trivial to get the "correct" behavior from Java by casting one of the operands "(float)7/2", well it's just as easy to do in Ruby "7.to_f/2" or "7/2.to_f". The problem is that 7 and 2 are both integers and division of integers returns an integer. If it didn't work this way, there'd be no easy way to do integer division.


Simply making one of the operands a float will yield floating-point division (with potentially more precision than a traditional float, if necessary). If you really don't like it, you can override the definition of division in the Fixnum class to return a floating-point value.

David21001
2006-12-13 08:56:10
I don't see anything wrong with the behavior -- it's integer division just like it's implemented in C.


I think you slept through your C class, especially when your professor taught the concept of automatic type promotion.


Doesn't make sense why Guido would have any regrets -- integer / integer = integer division, and should result in an integer type.


Ovid
2006-12-13 10:15:05

David21001 wrote:


Doesn't make sense why Guido would have any regrets -- integer / integer = integer division, and should result in an integer type.


It makes perfect sense. Pass lists of numbers to a function which uses division and you can get surprising results if one of the numbers in the list happens to be an int and the others are floats. That can be a nasty source of bugs and that's why Guido regrets it. When that happens, programmers are forced to write extra code to solve a language limitation. Dynamic languages are supposed to lessen our worries about data types, not increase them.
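
A small sketch of that failure mode, written for Python 3. Because `/` no longer behaves this way there, the old behavior is emulated with a hypothetical helper, `classic_div`, which floors only when both operands are ints; `halve_all` is likewise an illustrative name, not from any library.

```python
def classic_div(a, b):
    """Hypothetical stand-in for the old `/`: floors only for int/int."""
    if isinstance(a, int) and isinstance(b, int):
        return a // b
    return a / b


def halve_all(values):
    """Divide every element by 2 using the emulated classic division."""
    return [classic_div(v, 2) for v in values]


# The same numeric values give different answers depending on their type.
result = halve_all([7.0, 7, 3.0, 3])
print(result)  # prints: [3.5, 3, 1.5, 1]
```

The function body never changes; only the type of each list element decides whether the fractional part survives.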

Dale
2006-12-13 12:42:18
You suggested casting the result of the expression in C, but the way you wrote it once again bumps into another aspect of automatic type promotion. (float)7/2 is not casting the result of 7/2 to a float. It is casting 7 to a float and dividing by 2, which will be automatically promoted to match.
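
The same distinction can be sketched in Python 3, as an analogue rather than a translation of the C code: `float()` stands in for the cast and `//` stands in for C's integer division (the two agree for positive operands). The variable names are illustrative.

```python
numerator, denominator = 7, 2

# Analogue of (float)7/2: convert one operand first, then divide.
cast_operand = float(numerator) / denominator

# Analogue of (float)(7/2): integer-divide first, then convert the result.
cast_result = float(numerator // denominator)

print(cast_operand, cast_result)  # prints: 3.5 3.0
```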

Ovid
2006-12-13 13:35:06

Dale: thanks for the clarification.

Dave
2006-12-13 19:05:30
Since the initial shock of Python's integer division has worn off, I find I prefer it. It's extremely convenient for indexing list or array cells, and combining this default behavior with zero-indexed first items makes things even easier. Matlab scripting was a rude shock in that it behaves differently in both cases.
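
A brief illustration of the indexing convenience, assuming Python 3, where the integer behavior is spelled `//`. The list contents and names are made up for the example.

```python
items = ["a", "b", "c", "d", "e", "f", "g"]

# Floor division yields an int, which is directly usable as a zero-based index.
mid = len(items) // 2
middle = items[mid]

print(mid, middle)  # prints: 3 d
```

In Python 3 the true-division form, `items[len(items) / 2]`, raises a `TypeError` because the index would be a float.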

Peter
2007-02-07 18:57:02
Umm, I see some misunderstandings in general. Integer arithmetic is MUCH faster than floating point, which is relatively computationally intensive. Furthermore, your description of C static typing vs Ruby/Python dynamic typing is just plain wrong. Having to cast a variable type is a case of weak typing vs strong typing; static simply means you declare a variable type before you use it, vs Python and Ruby, where it's decided at runtime.