ONJava.com -- The Independent Source for Enterprise Java

Article:
  Top Ten Data Crunching Tips and Tricks
Subject:   "read the input into memory" considered harmful
Date:   2005-06-14 08:28:21
From:   johnsaalwaechter
I don't agree with point #2, especially the advice to read the input into memory. I work in an environment with large data sets, and I've seen many instances where a script that reads its input into memory is developed on small test data, then unleashed on gigabytes of real data. Either the process exceeds its 4GB address space (most Perl executables are still 32-bit), or the server itself runs out of memory.


Strategies other than "suck it all into memory" are definitely required in many environments.
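
To illustrate what I mean by an alternative, here is a rough Python sketch (not code from the article; the names `summarize` and `summarize_file` are made up for this example). It assumes the job can be done one line at a time, so only the current line and a couple of running totals are ever held in memory, regardless of how large the input is.

```python
import io
from typing import Iterable, Tuple


def summarize(lines: Iterable[str]) -> Tuple[int, int]:
    """Return (line_count, total_characters), holding one line at a time.

    Hypothetical helper: it accepts any iterable of lines, so an open
    file, a pipe, or an in-memory buffer all work the same way.
    """
    count = 0
    total = 0
    for line in lines:  # a file object yields one line per iteration
        count += 1
        total += len(line.rstrip("\n"))  # do not count the newline itself
    return count, total


def summarize_file(path: str) -> Tuple[int, int]:
    """Stream a file from disk instead of calling read() or readlines()."""
    with open(path, "r", encoding="utf-8") as handle:
        return summarize(handle)


if __name__ == "__main__":
    # Small in-memory stand-in for a large input file.
    sample = io.StringIO("alpha\nbeta\ngamma\n")
    print(summarize(sample))
```

The same loop works unchanged on test data and on gigabytes of real data, which is the point: memory use depends on the longest line, not on the size of the file. Tasks that truly need the whole input at once (a full sort, for instance) need a different approach, such as processing in chunks or using external tools.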

