[ale] Bash/Python Question

Tim Watts tim at cliftonfarm.org
Tue Mar 16 09:39:17 EDT 2010


If the downloaded files are not interrelated, perhaps you could invoke
download.py and parser.py as a unit on multiple threads or processes.
Then from bash you only need to call gatherer.py.
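A minimal sketch of that "download + parse as a unit" idea. The real worker would shell out to your actual download.py and parser.py (their command-line interfaces are assumed here); the subprocess calls below are trivial stand-ins so the sketch runs anywhere. Threads are fine for this since the workers just wait on child processes.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor

def fetch_and_parse(name):
    # Real calls would look something like:
    #   subprocess.check_call([sys.executable, "download.py", name])
    #   subprocess.check_call([sys.executable, "parser.py", name])
    # check_call() blocks until the child exits, so the parse step
    # never starts before that file's download has finished.
    subprocess.check_call([sys.executable, "-c", "pass"])  # stand-in download
    subprocess.check_call([sys.executable, "-c", "pass"])  # stand-in parse
    return name

files = ["a.html", "b.html", "c.html"]  # placeholder list
with ThreadPoolExecutor(max_workers=3) as pool:
    # map() blocks until every download+parse unit is done,
    # so gatherer.py itself doesn't exit early either.
    done = list(pool.map(fetch_and_parse, files))
print(done)
```

The key point is that the wait happens per file inside the worker, not once at the end of a fire-and-forget loop.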

BTW, if you have a large number of files to download, firing off a butt
load of processes all at once isn't necessarily going to give you a
faster result. If we're talking tens, probably don't worry about it; but
hundreds, use a queue.
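One way to sketch that queue: a fixed pool of worker threads pulling file names off a Queue, so even with hundreds of files only a handful are in flight at once. The worker count and the place where you'd shell out to download.py/parser.py are assumptions to tune for your setup.

```python
import threading
import queue

NUM_WORKERS = 8          # cap on simultaneous downloads (tune to taste)
tasks = queue.Queue()
results = []
lock = threading.Lock()  # protects the shared results list

def worker():
    while True:
        name = tasks.get()
        if name is None:     # sentinel: no more work for this thread
            break
        # Real code would run download.py / parser.py on `name` here.
        with lock:
            results.append(name)

threads = [threading.Thread(target=worker) for _ in range(NUM_WORKERS)]
for t in threads:
    t.start()

for i in range(300):         # "hundreds" of files queued up front...
    tasks.put("file%03d" % i)
for _ in threads:            # ...then one sentinel per worker
    tasks.put(None)
for t in threads:
    t.join()

print(len(results))          # all 300 processed, never more than 8 at once
```

This is roughly what the multiprocessing / concurrent.futures pool classes do for you, so in practice you'd probably reach for one of those instead of rolling it by hand.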


On Tue, 2010-03-16 at 02:47 -0400, Omar Chanouha wrote:
> Hey all,
> 
>    I am creating an information gatherer for a school project. I have
> a python file called gatherer that executes a bunch of python
> downloader files. I also have a python file that parses the downloaded
> information and places it into a database. Every day I want to execute
> the following:
> 
> #!/bin/bash
> gatherer.py
> parser.py
> 
> Unfortunately, the gatherer only initializes a bunch of downloader
> scripts. Therefore it exits just after the downloaders are
> initialized, not after they are finished. This means that the parser
> begins executing when the files are being downloaded, which of course
> leads to the parser seeing a bunch of empty files.
> 
> Does anyone have a better solution than executing the parser at a
> constant time after the downloader?
> 
> The gatherer looks something like:
> 
> for file in list:
>  download.py file
> 
> I need the gatherer to work this way because I want the files to
> download in parallel in order to speed up the process.
> 
> Thanks,
> 
> Omar
> _______________________________________________
> Ale mailing list
> Ale at ale.org
> http://mail.ale.org/mailman/listinfo/ale
> See JOBS, ANNOUNCE and SCHOOLS lists at
> http://mail.ale.org/mailman/listinfo


________
Nearly all men can stand adversity, but if you want to test a man's
character, give him power.
-- Abraham Lincoln
