Subject: Re: LEGO instructions
Newsgroups: lugnet.lego.direct
Date: Fri, 15 Sep 2006 22:13:39 GMT

In lugnet.lego.direct, Ross Crawford wrote:
> I poll from Australia, roughly once a month.
Hm. Given what you're describing, it doesn't seem terribly likely that they'd
shut down your IP based on that alone, unless your polls are REALLY hitting
them hard. But I'm assuming you're doing something like:
for each set in the Lugnet Database:
- Skip the set if its release date is prior to (say) 1995.
- Skip the set if you already have a record for it
- Do a search on Lego for the set number. If a result comes back, save it, but
don't actually download the PDF.
But maybe you're actually testing each PDF to verify that the links are still
valid?
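
To make the guess above concrete, here is a minimal Python sketch of that loop.
Everything in it is an assumption for illustration: the SetEntry record, the
poll_once name, and the search callable (which stands in for whatever lookup
the real poller does) are hypothetical, and the set numbers and link are made
up. It only records links and never downloads a PDF.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, Optional, Set


@dataclass(frozen=True)
class SetEntry:
    """One row of the set database (hypothetical shape)."""
    number: str
    release_year: int


def poll_once(
    sets: Iterable[SetEntry],
    known: Set[str],
    search: Callable[[str], Optional[str]],
    cutoff_year: int = 1995,
) -> Dict[str, str]:
    """Return {set number: link} for sets that `search` newly finds."""
    found: Dict[str, str] = {}
    for entry in sets:
        # Skip sets released before the cutoff year.
        if entry.release_year < cutoff_year:
            continue
        # Skip sets we already have a record for.
        if entry.number in known:
            continue
        # Ask the site about this set; keep the link, not the file.
        link = search(entry.number)
        if link is not None:
            found[entry.number] = link
    return found


# Made-up data and a stand-in search function, so nothing touches the network.
searched = []

def fake_search(number: str) -> Optional[str]:
    searched.append(number)
    if number == "C300":
        return "https://example.invalid/C300.pdf"
    return None

demo_sets = [
    SetEntry("A100", 1984),  # too old, skipped
    SetEntry("B200", 1999),  # already recorded, skipped
    SetEntry("C300", 2002),  # searched, link found
    SetEntry("D400", 2005),  # searched, nothing found
]
new_links = poll_once(demo_sets, known={"B200"}, search=fake_search)
```

Only the sets that survive both skips ever reach the search call, so the
number of requests per run is at most the number of new post-cutoff sets.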
Anyway, I wouldn't expect that running the above once a month would do anything
terribly awful to their system. If you're polling once a month, you could even
make it easier on them and set up a cron job to pull a page every (say) 5
minutes, but EVERY day. That'd probably give them less peak load to worry about,
and probably wouldn't put your DB behind at all.
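
As a rough sketch of the "a little every 5 minutes" idea: a script started by
cron (for example with the schedule `*/5 * * * *`) could work out which small
slice of the full list belongs to the current run. The function names, the
5-minute interval, and the 30-day cycle below are assumptions for illustration,
not a description of any existing poller.

```python
import math
from datetime import datetime
from typing import List, Sequence

SLOTS_PER_DAY = 24 * 60 // 5          # one run every 5 minutes
SLOTS_PER_CYCLE = 30 * SLOTS_PER_DAY  # assume the full list is covered in 30 days


def slot_for(now: datetime) -> int:
    """Map a timestamp to a 0-based run number within the month."""
    minutes_into_day = now.hour * 60 + now.minute
    return (now.day - 1) * SLOTS_PER_DAY + minutes_into_day // 5


def batch_for_slot(items: Sequence[str], slot: int, total_slots: int) -> List[str]:
    """Return the slice of `items` that run number `slot` should process."""
    per_slot = math.ceil(len(items) / total_slots)
    start = slot * per_slot
    # Slots past the end of the list simply get an empty batch.
    return list(items[start:start + per_slot])


# Small demonstration: 10 items spread over 4 runs.
items = [str(n) for n in range(10)]
batches = [batch_for_slot(items, s, 4) for s in range(4)]
```

Each run then handles only its own batch, so the same monthly total is spread
into many small requests instead of one burst. On day 31 of a month the slot
number runs past the 30-day cycle and the batch is just empty.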
> Hmmm, interesting. But unfortunately, I get the same server error screen for
> that URL. Which either means they are blocking the same IPs on that server
> too, or it is a temporary cache and it has been flushed (in which case it is
> useless for polling).
I've been surprised at how long things tend to stay on "cache.lego.com", although
admittedly I don't think ANYTHING on their website is safe to consider
"permanent" (URL-wise, that is). Both links still work for me (not that that
helps much).
> I do not store the PDFs on Northstar, and I don't want to. There would I'm
> sure be legal issues doing that, and with 9.5GB so far, I'd certainly have to
> pay more for my hosting at Northstar.
Oh, I didn't mean that you'd STORE them at NorthStar, but that as a fix
(albeit a very bad one), you could set up a link that used the NS server to pull
FROM Lego (assuming that NS's IPs aren't blocked), then immediately deliver the
file to yourself (assuming that NS isn't blocking your IP). And as soon as the
NS server is done transferring the file, away it goes. But yes, that's possibly
also disallowed, since technically your server would be serving Lego's material,
which they've asked people not to do. (But so long as the link isn't *public*,
you're OK.)
> If the polling is the reason they have blocked my IP (and a large range of
> other IPs it would seem), I would just like to know. I would be happy to
> discuss with them the possibility of updating my database in a "sanctioned"
> way. But blocking a large range of IPs because of it seems to be like driving
> a nail with a 10 ton truck.
Agreed. It does seem rather... wrong of them to do. Of course, if they're
blocking a wide range of IPs, the block could also be unrelated to your polling
in particular, depending on the order of events. I guess there's not much way
to know unless Lego wants to chime in...
DaveE