Solve hogging problem #29

don-han · 2016-03-11T09:57:44Z

No description provided.

don-han · 2016-03-11T10:29:10Z

Suggested by @alvinwan: use the priority argument of Scrapy's Request.

http://doc.scrapy.org/en/latest/topics/request-response.html

alvinwan · 2016-03-12T01:37:53Z

Just for future reference, we could use the priority kwarg to take advantage of the priority queue that Scrapy has built in for requests. Here is what I posted on Slack:

[2:17]
...since higher priority values correspond to, well, higher priority, just take the difference between max_depth and the depth of the current page and pass that in as the priority. We take the difference because we want higher priority to correspond to lower depth, effecting a BFS by page depth. I don't remember if this is the case, but we'd have to enqueue all domains first, so that it doesn't start the BFS on just one domain.
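
For reference, here is a minimal sketch of the idea above. It does not import Scrapy; it simulates a scheduler with `heapq` so the ordering is easy to check. The names `MAX_DEPTH`, `request_priority`, and `crawl_order`, the example URLs, and the first-in-first-out tie-break among equal priorities are all assumptions made for illustration, not part of the project or of Scrapy itself. In a real spider the same value would presumably be passed as `scrapy.Request(url, priority=...)`, with the current depth read from `response.meta['depth']`.

```python
import heapq
from itertools import count

MAX_DEPTH = 3  # hypothetical crawl depth limit


def request_priority(depth, max_depth=MAX_DEPTH):
    """Shallower pages get larger priority values (max_depth - depth)."""
    return max_depth - depth


def crawl_order(seeds, links, max_depth=MAX_DEPTH):
    """Simulate a scheduler that always pops the highest-priority request.

    seeds: start URLs, one per domain (all enqueued before crawling begins).
    links: dict mapping a URL to the URLs found on that page.
    """
    tie = count()  # assumption: equal priorities are served in insertion order
    heap = []
    seen = set(seeds)

    # Enqueue every seed domain first so the crawl does not start on one domain.
    for url in seeds:
        # heapq is a min-heap, so negate the priority to pop the largest first.
        heapq.heappush(heap, (-request_priority(0, max_depth), next(tie), url, 0))

    order = []
    while heap:
        _, _, url, depth = heapq.heappop(heap)
        order.append(url)
        if depth >= max_depth:
            continue  # do not follow links past the depth limit
        for child in links.get(url, []):
            if child not in seen:
                seen.add(child)
                prio = request_priority(depth + 1, max_depth)
                heapq.heappush(heap, (-prio, next(tie), child, depth + 1))
    return order


if __name__ == "__main__":
    example_links = {
        "a.com": ["a.com/1", "a.com/2"],
        "a.com/1": ["a.com/1/x"],
        "b.com": ["b.com/1"],
    }
    print(crawl_order(["a.com", "b.com"], example_links))
```

With both seeds enqueued up front, every depth-0 page is visited before any depth-1 page, so no single domain hogs the crawl.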
