
Scrapy Returns More Results Than Expected

This is a continuation of the question "Extract from dynamic JSON response with Scrapy". I have a Scrapy spider that extracts values from a JSON response. It works well, extract the r

Solution 1:

The parse function should be like this:

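import json  # at the top of the spider module

def parse(self, response):
    # response.text replaces the deprecated response.body_as_unicode()
    jsonresponse = json.loads(response.text)
    item = WhoisItem()
    # Take the first key under "domains" as the domain name for this response
    domain_name = list(jsonresponse['domains'].keys())[0]
    item["avail"] = jsonresponse["domains"][domain_name]["avail"]
    item["domain"] = domain_name
    yield item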

Notice that I removed the for loop.

What was happening: parse is called once per response, and for every single response the loop parsed it 17 times. The 17 responses therefore produced 17*17 = 289 records instead of 17.
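The arithmetic can be checked without running a spider. The following is only a rough sketch in plain Python, not the original spider: the domain names, the fake_response_body helper, and the shape of the removed loop (iterating over all 17 domains inside parse) are assumptions made for illustration.

```python
import json

# Hypothetical list of 17 domains; the names are made up for this sketch.
DOMAINS = [f"example{i}.com" for i in range(17)]


def fake_response_body(domain):
    """Build a JSON body with one domain under "domains" (assumed shape)."""
    return json.dumps({"domains": {domain: {"avail": True}}})


def parse_with_loop(body):
    """Assumed buggy version: loops over all 17 domains for every response."""
    data = json.loads(body)
    domain_name = list(data["domains"].keys())[0]
    for _ in DOMAINS:
        yield {"domain": domain_name, "avail": data["domains"][domain_name]["avail"]}


def parse_once(body):
    """Version from the answer: one item per response, no loop."""
    data = json.loads(body)
    domain_name = list(data["domains"].keys())[0]
    yield {"domain": domain_name, "avail": data["domains"][domain_name]["avail"]}


# One simulated response per domain, as the spider would receive them.
bodies = [fake_response_body(d) for d in DOMAINS]
buggy = [item for b in bodies for item in parse_with_loop(b)]
fixed = [item for b in bodies for item in parse_once(b)]

print(len(buggy), len(fixed))  # 289 17
```

With the loop in place every response contributes 17 items; without it, each response contributes exactly one.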
