[ale] [OT] [CONTRACT WORK OFFERED] Perl Programmer(s) in Atlanta

hbbs at attbi.com
Tue Apr 8 16:31:02 EDT 2003


I read the description here and I had two thoughts:

1.  Aren't Web sites that are being "scraped" in this fashion suing the
scrapers?  I feel like I've seen that sort of situation described on Slashdot
(so it HAS to be true, right?).

2.  Barring that, I see a seriously flawed assumption: that the Web sites being
scraped will stay the same, i.e., will lend themselves to consistent, accurate
scraping over time.  If I were running one of these sites and I didn't want to
be scraped, I'd have my guys muck about with the code so as to alter the HTML
without significantly altering the visible result in the browser.  Even without
deliberate countermeasures, the kinds of routine or sweeping Web site
alterations that one sees on most any e-commerce site (e.g., Expedia, eBay,
1-800-CONTACTS) would likely break the scraping every time.  So, it seems to me
that even if you got all this perfectly coded at some point, it would almost
always be broken through no fault of your own.  It's job security, sure, but who
wants to rely on a scrape site that is continually breaking?
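
To make point 2 concrete, here is a minimal sketch in Python (not the Perl
framework described in the posting below, which I have not seen).  The page
snippets, the hotel name, and the scrape_rate helper are all made up for
illustration.  It assumes a scraper that finds the rate by position, i.e. "the
second table cell", and shows what happens when the site adds one cosmetic cell.

```python
from html.parser import HTMLParser


class CellCollector(HTMLParser):
    """Collect the text of every <td> cell, in document order."""

    def __init__(self):
        super().__init__()
        self.cells = []
        self._in_td = False

    def handle_starttag(self, tag, attrs):
        if tag == "td":
            self._in_td = True
            self.cells.append("")

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_td = False

    def handle_data(self, data):
        if self._in_td:
            self.cells[-1] += data


def scrape_rate(html):
    """Position-based scrape: assume the rate is always the second <td>."""
    parser = CellCollector()
    parser.feed(html)
    parser.close()
    return parser.cells[1].strip() if len(parser.cells) > 1 else None


# Hypothetical markup, before and after a purely cosmetic redesign.
BEFORE = "<table><tr><td>Hotel Example</td><td>$89.00</td></tr></table>"
AFTER = (
    "<table><tr><td><img src='logo.gif'></td>"
    "<td>Hotel Example</td><td>$89.00</td></tr></table>"
)

print(scrape_rate(BEFORE))  # the rate
print(scrape_rate(AFTER))   # the hotel name, returned as if it were the rate
```

The second call does not fail loudly; it just hands back the wrong field, which
is arguably worse than an error for anyone relying on the data.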

- Jeff

> Hi,
> 
> I posted this offer of Perl development contract work to the Atlanta.pm
> mailing list (Atlanta Perl Mongers), and Keith Watson suggested I should
> post it here too.
> 
> Please read below if you are (1) a Perl developer with good experience,
> (2) available and interested in some contract work.
> 
> Stephen
> 
> At 02:26 PM 4/8/2003 -0400, you wrote:
> >Hi,
> >
> >We need to employ one or more Perl contractors in Atlanta soon.
> >(part time or full time)
> >
> >I sent out a message earlier about this (and got one qualified response),
> >but the requirements have changed a bit, so I am sending this again.
> >
> >CONTRACT DESCRIPTION
> >
> >We have developed a framework for searching Hotel web sites and Car Rental
> >web sites to find the available rates which they are offering.  We navigate
> >the sites, parse the HTML, and return structured data as though we were
> >reading from a database.
> >
> >We are currently scanning Travelocity (http://www.travelocity.com) in this 
> >way.
> >
> >We need additional Perl programmers to develop modules for other web sites
> >using the approach and the framework which we have demonstrated works well
> >for Travelocity.
> >
> >We are using a combination of WWW::Mechanize (built on LWP::UserAgent),
> >HTML::TokeParser, and the App-Context, App-Repository, and App-Widget
> >distributions from the P5EE project.  (see the following references)
> >
> >http://www.perl.com/pub/a/2003/01/22/mechanize.html
> >http://search.cpan.org/author/PETDANCE/WWW-Mechanize/lib/WWW/Mechanize.pm
> >http://search.cpan.org/author/GAAS/HTML-Parser/lib/HTML/TokeParser.pm
> >http://www.officevision.com/pub/p5ee/
> >
> >COMPANY BACKGROUND
> >
> >We are a small Internet ASP (Application Service Provider) that has
> >about 8 employees and is currently *profitable*.  We are growing the
> >company slowly and surely.
> >
> >We serve the travel industry to help them understand their competitive
> >position in the marketplace (i.e. the prices they offer, relative to the
> >prices offered by their competitors).
> >
> >WHAT TO DO
> >
> >If you are interested and available, please email me back directly (not through
> >the Atlanta.pm list).
> >Please send your resume too, of course.
> >And let me know what your availability to work is:
> >    * how soon? how much? (full time? part time?)
> >
> >The ideal candidate(s) would have good experience and skill in software
> >development in Perl, have good experience with web applications/HTML/HTTP,
> >be available for full-time work soon, and be interested in this engagement
> >turning into full-time employment in due time if all went well.
> >
> >Stephen
> >
> >
> >_______________________________________________
> >Atlanta-pm mailing list
> >Atlanta-pm at mail.pm.org
> >http://mail.pm.org/mailman/listinfo/atlanta-pm
> 
> -------------
> 
> Keith R. Watson                        GTRI/ITD
> Systems Support Specialist III         Georgia Tech Research Institute
> keith.watson at gtri.gatech.edu           Atlanta, GA  30332-0816
> 404-894-0836
> 
> 
> _______________________________________________
> Ale mailing list
> Ale at ale.org
> http://www.ale.org/mailman/listinfo/ale