In lugnet.general, Micah Jaffe writes:
> [...]
> Some notes: it's pretty dumb about scanning for sets. It picks up any 3-4
> digit numbers and assumes it's a Lego set id. It's also pretty dumb about
> scanning the HTML returned from the lugnet search query, meaning if the
> format of the HTML were to change drastically, then the script is b0rken.
> What can I say, it was something I whipped up while avoiding something less
> interesting at work for a couple hours. It also doesn't try in any way
> to deal with id collisions (i.e. if you feed it 6077, it'll just spit
> it out as an unknown set, called "(???)").
Say, did you know about the "output=plain" option? For example,
http://www.lugnet.com/pause/search/?query=6848&output=plain
It's a simpler output mode which makes parsing easier, and it uses much less
bandwidth (about 1/10 the page size). Actually, it wasn't created with parsing
in mind; it was created for the purpose of just dumping the HTTP output to STDOUT.
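
As a rough sketch of how a script might ask for the plain output, the Python below builds the query URL from the example above and copies whatever comes back to standard output. The names `build_search_url` and `dump_plain` are made up for illustration, the character encoding is a guess, and nothing here assumes a particular layout for the plain output.

```python
import sys
import urllib.parse
import urllib.request

# Search endpoint taken from the example URL in this message.
SEARCH_URL = "http://www.lugnet.com/pause/search/"


def build_search_url(set_id, plain=True):
    """Return the search URL for one set id (hypothetical helper)."""
    params = {"query": str(set_id)}
    if plain:
        # The "output=plain" option described above.
        params["output"] = "plain"
    return SEARCH_URL + "?" + urllib.parse.urlencode(params)


def dump_plain(set_id, out=sys.stdout):
    """Fetch the plain output for one set id and write it to `out`."""
    with urllib.request.urlopen(build_search_url(set_id), timeout=30) as resp:
        # Assumption: the encoding is not stated anywhere, so undecodable
        # bytes are replaced rather than raising an error.
        out.write(resp.read().decode("utf-8", errors="replace"))
```

Calling `build_search_url(6848)` gives back the example URL shown above; `dump_plain(6848)` would make a single request, so it is best kept out of tight loops.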
> This is a work independent of Lugnet and is not affiliated officially in
> any way. As far as I can see this doesn't violate any of the Lugnet Terms
> of Use statement and I've tried to liberally acknowledge that all set
> information is coming from Lugnet.
No, it doesn't directly violate any of the written terms of use, but it's
not always considered good netiquette to write scripts that query other
servers via HTTP, especially repeated bulk queries. If it starts to bog
down the server, I'll have to block requests of that type, so please try
to keep it to reasonable levels. (I've already had to block several poorly
written crawlers/robots which were trying to clomp through all the news
articles via HTTP at high speed -- a bad crawler no-no).
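
One way a script could keep itself to reasonable levels is to enforce a minimum gap between requests. The sketch below is only an illustration: the `Throttle` class is hypothetical, and the right interval is not specified anywhere in this message, so any number plugged in is the script author's own judgment call.

```python
import time


class Throttle:
    """Enforce a minimum gap (in seconds) between successive requests."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock      # injectable so the class can be tested
        self._sleep = sleep
        self._last = None        # time of the previous request, if any

    def wait(self):
        """Block until at least min_interval has passed since the last call."""
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now
```

A script would create one `Throttle` up front and call `wait()` immediately before each query, so a batch of set ids goes out as a slow trickle rather than a burst.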
If you plan to run queries frequently, I'd suggest downloading local copies
of the tab-delimited listings at <http://www.lugnet.com/pause/lists.html> and
then simply looking the info up from there rather than making across-the-net
HTTP queries on the fly.
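
A local lookup along those lines might look like the sketch below. The column layout of the listings is not described in this message, so `id_column` and `name_column` are pure assumptions to be adjusted after looking at a downloaded file, and `load_sets` is a hypothetical helper. Keeping a list of names per id means a number shared by several sets shows up as several candidates instead of being silently dropped.

```python
import csv


def load_sets(lines, id_column=0, name_column=1):
    """Build {set id: [names]} from tab-delimited lines.

    Assumption: which columns hold the id and the name is a guess;
    check the real file and pass the right indexes.
    """
    table = {}
    # QUOTE_NONE: treat quote characters as ordinary text in a field.
    reader = csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if len(row) <= max(id_column, name_column):
            continue  # skip blank or short lines
        set_id = row[id_column].strip()
        table.setdefault(set_id, []).append(row[name_column].strip())
    return table
```

With a file saved locally, `load_sets(open("sets.txt"))` (the filename is a placeholder) builds the table once, and `table.get("6848", ["(???)"])` then answers each lookup without touching the network.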
--Todd
Message is in Reply To:
Perl-based bulk set IDer...
Hello fellow Lego freaks^H^H^H^H^Hans, (Warning this may appeal to Unix and Perl geeks only...) Not long ago when I was trying in a feverish sort of way to re-establish a Lego collection that I dreamt of as a kid, I put together a Perl script that (...) (25 years ago, 21-Jul-99, to lugnet.general)