[ale] Document Imaging under Linux

Armsby John-G16665 John.Armsby at motorola.com
Thu Sep 18 15:49:05 EDT 2003


It is possible to create a web-based SEARCH system which RAPIDLY retrieves stored images.  I originally set up a system in Solaris and ported it to Linux.  We have approximately 1,000,000 files.  Document retrieval takes about 5 seconds on a Dell GX-1 (500 MHz, 384 MB).  In a nutshell you have a system like this:

1.  Apache-based web server with a document root containing folders and files which are intuitively named so that you can SEARCH on file and folder names or substrings.

2.  We used a Windows FTP client to create folders and upload files.  Originally we put the files out as "tiff", then moved to compressed HPGL.  Now we use Adobe exclusively.

3.  A "find" script creates an index file.

4.  A C/C++ program works as a CGI to parse the form (which allows multiple search terms/substrings) and return hypertext pointing to the files.
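
For anyone who wants to picture steps 3 and 4, here is a rough sketch.  It is written in Python rather than the "find" script and C++ CGI described above, so treat it as an illustration and not the code that is actually running.  The function names, the one-path-per-line index format, the "q" form field, and the rule that every term must match (case-insensitively) are all assumptions made for the example.

```python
import html
import os
from urllib.parse import parse_qs, quote


def build_index(doc_root, index_path):
    """Write one relative path per line, roughly what a `find` over the
    document root would produce. Keep index_path outside doc_root so the
    index does not list itself. Returns the number of files indexed."""
    count = 0
    with open(index_path, "w", encoding="utf-8") as out:
        for folder, _dirs, files in os.walk(doc_root):
            for name in sorted(files):
                rel = os.path.relpath(os.path.join(folder, name), doc_root)
                out.write(rel.replace(os.sep, "/") + "\n")
                count += 1
    return count


def search(index_path, terms):
    """Return indexed paths containing every term as a substring.
    Assumption: terms are ANDed together and matched case-insensitively."""
    wanted = [t.strip().lower() for t in terms if t.strip()]
    if not wanted:
        return []
    hits = []
    with open(index_path, encoding="utf-8") as index:
        for line in index:
            path = line.rstrip("\n")
            lowered = path.lower()
            if all(term in lowered for term in wanted):
                hits.append(path)
    return hits


def terms_from_query(query_string):
    """Split the hypothetical form field `q` into whitespace-separated terms."""
    return parse_qs(query_string).get("q", [""])[0].split()


def render_links(paths):
    """Turn matching paths into an HTML list of links under the document root."""
    items = [
        '<li><a href="/{}">{}</a></li>'.format(quote(p), html.escape(p))
        for p in paths
    ]
    return "<ul>\n" + "\n".join(items) + "\n</ul>"
```

A CGI wrapper would read QUERY_STRING from the environment, pass it through terms_from_query() and search(), and print a "Content-Type: text/html" header followed by render_links().  Scanning a flat index file line by line keeps the search independent of the number of folders, since the file system is only walked when the index is rebuilt.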

I used this system successfully at a major DOD company in Duluth, GA for three years.  I am now at a major company in Lawrenceville and have set up the same thing.  None of this stuff is really technically hard.  The SEARCH algorithm is a bit sophisticated; it could be written more simply in Perl, but that would work a bit less efficiently than the C++ code I eventually put in place.

Contact me at ***john.armsby at mindspring.com*** if you want some detail.

john


-----Original Message-----
From: Sean Kilpatrick [mailto:kilpatms at mindspring.com]
Sent: Sunday, September 14, 2003 10:17 PM
To: Atlanta Linux Enthusiasts
Subject: Re: [ale] Document Imaging under Linux


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Sunday 14 September 2003 08:27 pm, John Wells wrote:
> I have a family member who'd like me to help design/develop or integrate a
> document imaging system under Linux.  He has a large amount of documents
> he'd like to scan in, store, and be able to retrieve easily for his
> company.
> 


I spent part of this evening at my wife's office helping her with
a similar problem: scan in a 26-page document and make it available
to her students.  The software solution of choice (Apple here) was
Adobe PDF generating software capable of taking input from a
scanner.  The scanner was slow (75 sec. a page), so this took a
while, but in the end I had a single PDF file of the entire document.

I would really like to find similar software available under Linux --
and it doesn't have to be free. I'm perfectly willing to pay for this
kind of utility.

Sean


-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.1 (GNU/Linux)

iD8DBQE/ZSEH73hVp4UeGJERAqhHAJ9T19IXjMyGiJIHJmdIPZd+qAokDgCgznwJ
nj9EgoFvLEOHBsP9A8PYvZo=
=Gc1E
-----END PGP SIGNATURE-----

_______________________________________________
Ale mailing list
Ale at ale.org
http://www.ale.org/mailman/listinfo/ale


