What's on Jason's Hard Drive
Subject:   Paper paper paper
Date:   2006-11-07 12:38:44
From:   alex_on_the_boat
I’ve been using a simpler system for several years. Not necessarily better.
I’m so glad to see this article and replies since I now know I am not alone in doing this.

I live on a boat, so space is at a premium. Digitising paperwork makes life so much easier.
This has expanded to take in magazine articles too.
Now it includes films and music, plus photos.

The caveats are the same.

The investment in a decent scanner hurts: once you want one fitted with an Automatic Document Feeder, you are buying a business-class model. You need the feeder.

The time it takes me to scan the documents seems to far outweigh the value of the information. Then when I need something, being able to find it is priceless.
Only 20% or less of the information has any future relevance.
However, when it came to a divorce, the value of the information far outweighed the apparent effort at the time, and I was aghast at what I was able to provide and prove. Don’t marry someone who runs this system!!

I gave up on OCR, as the additional time and the burden of correcting its errors were more than I could handle.

The real issue is being able to search for stuff.
A filename won’t contain enough information, and remembering how and where I stored stuff became a problem to which I haven’t found an answer. This means ‘metadata’ is also required. Soft links sound interesting.
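To make the ‘metadata’ idea concrete, here is a minimal sketch in Python of a catalogue that keeps a date and a few hand-entered tags alongside each scan, so a search no longer depends on the filename. Everything in it is an assumption for illustration: the `ScanRecord` fields, the `search` helper, the file paths and the tags are all made up, not something I actually run.

```python
from dataclasses import dataclass, field


@dataclass
class ScanRecord:
    """One scanned document plus the metadata the filename cannot hold."""
    path: str                      # where the scan lives (hypothetical path)
    doc_date: str                  # ISO date printed on the document
    tags: set = field(default_factory=set)  # free-form keywords typed in by hand


def search(catalog, *wanted):
    """Return the paths of records that carry every requested tag (case-insensitive)."""
    wanted_set = {w.lower() for w in wanted}
    return [
        rec.path
        for rec in catalog
        if wanted_set <= {t.lower() for t in rec.tags}
    ]


# Made-up sample entries, purely for illustration.
catalog = [
    ScanRecord("scans/0001.tif", "2006-03-14", {"phone", "bill", "mobile"}),
    ScanRecord("scans/0002.tif", "2006-04-02", {"fuel", "receipt", "car"}),
    ScanRecord("scans/0003.tif", "2006-04-15", {"phone", "bill", "landline"}),
]

if __name__ == "__main__":
    print(search(catalog, "phone", "bill"))
```

The point of the design is that the scan can be named anything and filed anywhere; only the catalogue has to be searchable. The cost is the same one as before: somebody has to type the tags in.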

The Adobe system sounds interesting.

The utopia as I currently see it is to be able to search information and report on it.
For example: how many minutes I spent talking to Jane on my mobile phone over the last 4 months, how much I spent on fuel for this car against the last car, how much on food. To do this without having to enter the information into many silos and then design reports on it would be superb. I want to live life, not spend it scanning stuff or sat in front of a scanner and ‘Quickbooks’.

I figure some sort of SQL engine would do it, running not across the TIFs or scans themselves but across the information contained within those scans, that is, the OCR’d metadata.
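As a rough sketch of what I mean, assuming the figures have somehow already been pulled out of the scans (by OCR or by hand), one table of extracted ‘facts’ in SQLite would answer the kind of questions above. The `facts` table, its columns, the `build_db` and `total` helpers and all the sample rows are hypothetical; the numbers are invented just to show the shape of the query.

```python
import sqlite3


def build_db(rows):
    """Create an in-memory database holding one row per fact extracted from a scan."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """CREATE TABLE facts (
               scan_path TEXT,   -- which scan the fact came from
               doc_date  TEXT,   -- ISO date, so string comparison sorts correctly
               category  TEXT,   -- e.g. 'call_minutes', 'fuel', 'food'
               subject   TEXT,   -- e.g. who was called, or which car
               amount    REAL    -- minutes or money, depending on category
           )"""
    )
    conn.executemany("INSERT INTO facts VALUES (?, ?, ?, ?, ?)", rows)
    return conn


def total(conn, category, subject, since):
    """Sum the amounts for one category and subject on or after the given date."""
    cur = conn.execute(
        "SELECT COALESCE(SUM(amount), 0) FROM facts "
        "WHERE category = ? AND subject = ? AND doc_date >= ?",
        (category, subject, since),
    )
    return cur.fetchone()[0]


# Invented sample data, purely for illustration.
SAMPLE_ROWS = [
    ("scans/phone_2006_08.tif", "2006-08-31", "call_minutes", "Jane", 42.0),
    ("scans/phone_2006_09.tif", "2006-09-30", "call_minutes", "Jane", 58.0),
    ("scans/phone_2006_09.tif", "2006-09-30", "call_minutes", "Bob", 10.0),
    ("scans/fuel_0091.tif", "2006-09-12", "fuel", "current car", 35.5),
    ("scans/fuel_0007.tif", "2005-02-03", "fuel", "last car", 30.0),
]

conn = build_db(SAMPLE_ROWS)

if __name__ == "__main__":
    # Minutes talking to Jane over roughly the last 4 months.
    print(total(conn, "call_minutes", "Jane", "2006-07-07"))
    # Fuel spend: this car against the last car.
    print(total(conn, "fuel", "current car", "2000-01-01"),
          total(conn, "fuel", "last car", "2000-01-01"))
```

Each row keeps the path of the scan it came from, so the report can always be traced back to the original paper. The hard part is still getting the facts out of the scans in the first place, which this sketch does nothing to solve.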