[ale] Bash/Python Question
Tim Watts
tim at cliftonfarm.org
Tue Mar 16 09:39:17 EDT 2010
If the downloaded files are not interrelated, perhaps you could invoke
download.py and parser.py as a unit on multiple threads or processes.
Then from bash you only need to call gatherer.py.
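
A rough sketch of what I mean, not tested against your scripts. It assumes download.py and parser.py each take a single file name as their only argument (your post doesn't say what parser.py takes), and the function names fetch_and_parse and gather are made up for illustration:

```python
import subprocess
import sys
import threading

# Assumption: each script accepts one file name as its only argument.
DOWNLOAD_CMD = [sys.executable, "download.py"]
PARSE_CMD = [sys.executable, "parser.py"]

def fetch_and_parse(name, download_cmd=DOWNLOAD_CMD, parse_cmd=PARSE_CMD):
    """Download one file, then parse it, as a single unit of work."""
    # check=True raises CalledProcessError if the download exits non-zero,
    # so the parse step is skipped for a file that failed to download.
    subprocess.run(download_cmd + [name], check=True)
    subprocess.run(parse_cmd + [name], check=True)

def gather(names, worker=fetch_and_parse):
    """Start one thread per file and return only when all have finished."""
    threads = [threading.Thread(target=worker, args=(n,)) for n in names]
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # block until this download+parse pair is done
```

If gatherer.py ends by calling gather(list_of_files), it won't exit until every file has been downloaded and parsed, so the bash script shrinks to a single line. Threads are enough here because each one just sits waiting on a child process.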
BTW, if you have a large number of files to download, firing off a butt
load of processes all at once isn't necessarily going to give you a
faster result. If we're talking tens, probably don't worry about it; but
for hundreds, use a queue.
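
One way to get the queue behaviour without writing the queue yourself is a fixed-size worker pool. This is only a sketch and assumes Python 3, where concurrent.futures is in the standard library; run_bounded and the default of 8 workers are my own placeholders, and worker stands for whatever function handles one file (for example, one that runs download.py and then parser.py on it):

```python
from concurrent.futures import ThreadPoolExecutor

def run_bounded(names, worker, max_workers=8):
    """Run worker(name) for every name, at most max_workers at a time."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map queues every name up front; idle threads pull the next one.
        # list() waits for all results and re-raises the first worker error.
        return list(pool.map(worker, names))
```

The pool keeps its own internal queue, so with 500 files and max_workers=8 there are never more than 8 downloads in flight, and the call returns only after the last one finishes.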
On Tue, 2010-03-16 at 02:47 -0400, Omar Chanouha wrote:
> Hey all,
>
> I am creating an information gatherer for a school project. I have
> a python file called gatherer that executes a bunch of python
> downloader files. I also have a python file that parses the downloaded
> information and places it into a database. Every day I want to execute
> the following:
>
> #!/bin/bash
> gatherer.py
> parser.py
>
> Unfortunately, the gatherer only initializes a bunch of downloader
> scripts. Therefore it exits just after the downloaders are
> initialized, not after they are finished. This means that the parser
> begins executing when the files are being downloaded, which of course
> leads to the parser seeing a bunch of empty files.
>
> Does anyone have a better solution than executing the parser at a
> constant time after the downloader?
>
> The gatherer looks something like:
>
> for file in list:
>     download.py file
>
> I need the gatherer to work this way because I want the files to
> download in parallel in order to speed up the process.
>
> Thanks,
>
> Omar
________
Nearly all men can stand adversity, but if you want to test a man's
character, give him power.
-- Abraham Lincoln