Question for the experts: why does my Scrapy code produce duplicate data?

m1594***

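In my sample results, "元谷" is repeated 5 times and every other entry is repeated 4 times (and later in the crawl I started getting 429 responses that blocked me from crawling). Update: after more testing, it turns out this has nothing to do with the multi-level crawl. The duplicates still appear with loupan_detail_parse removed, and loupan_item cannot be printed.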

The console output shows the items duplicated twice.
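One way to narrow this down is to check whether the duplicates are already present in the scraped records, and which keys are affected. The following is only a minimal sketch in plain Python, not a Scrapy API: the helper names `count_duplicates` and `drop_duplicates` and the sample rows are made up for illustration, and it assumes each record is a dict using the field names `loupan_name` and `quwei` from the spider in this post.

```python
from collections import Counter


def _key(item, key_fields):
    # str() is used because extract() returns lists, which are not hashable.
    return tuple(str(item.get(field)) for field in key_fields)


def count_duplicates(items, key_fields=("loupan_name", "quwei")):
    """Return {key: occurrences} for every key seen more than once."""
    counts = Counter(_key(item, key_fields) for item in items)
    return {key: n for key, n in counts.items() if n > 1}


def drop_duplicates(items, key_fields=("loupan_name", "quwei")):
    """Keep only the first record for each key, preserving order."""
    seen = set()
    unique = []
    for item in items:
        key = _key(item, key_fields)
        if key in seen:
            continue
        seen.add(key)
        unique.append(item)
    return unique


if __name__ == "__main__":
    # Hypothetical rows, only to show the helpers in use.
    rows = [
        {"loupan_name": ["A"], "quwei": "x"},
        {"loupan_name": ["A"], "quwei": "x"},
        {"loupan_name": ["B"], "quwei": "y"},
    ]
    print(count_duplicates(rows))
    print(len(drop_duplicates(rows)))
```

If the same key shows up more often after each run, the output file may be accumulating rows from earlier runs rather than the spider yielding the item several times in one run; that is a guess to check, not a confirmed cause.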

My code is below:
# -*- coding: utf-8 -*-
import scrapy
from papa.items import PapaItem
import re


class PappSpider(scrapy.Spider):
    name = 'papp'
    allowed_domains = ['xa.fang.ke.com']
    start_urls = ['https://xa.fang.ke.com/loupan/nhs1/']
    count=1
    page_end=48

    def parse(self, response):
        loupan_iist=response.xpath("//div[@class='resblock-desc-wrapper']")
        for i_item in loupan_iist:
            loupan_item=PapaItem()
            quwei =i_item.xpath(".//a[@class='resblock-location']/text()").extract()
            loupan_item['quwei'] = quwei[1].replace("\t", "").replace("\n", "")
            loupan_item['loupan_name'] = i_item.xpath(".//div[@class='resblock-name']/a/text()").extract()
            loupan_item['resblock_type'] = i_item.xpath(".//div[@class='resblock-name']/span[1]/text()").extract()
            loupan_item['loupan_type'] = i_item.xpath(".//div[@class='resblock-name']/span[2]/text()").extract()
            loupan_item['resblock_type']= i_item.xpath(".//div[@class='resblock-tag']/span/text()").extract()
            loupan_tag=i_item.xpath(".//div[@class='resblock-tag']/span/text()").extract()
            loupan_item['loupan_tag']="/".join(loupan_tag)
            loupan_item['jun_jia']=i_item.xpath(".//div[@class='resblock-price']/div[@class='main-price']/span[@class='number']/text()").extract()
            xiangqing_url=i_item.xpath(".//div[@class='resblock-name']/a/@href").extract()
            xiang_url='https://'+self.allowed_domains[0]+xiangqing_url[0]+"xiangqing"
            yield scrapy.Request(xiang_url, meta={'item': loupan_item}, callback=self.loupan_detail_parse)
        self.count = self.count + 1
        if self.count
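On the 429 responses: the spider requests one detail page per listing with no delay configured, so the site may be rate-limiting it. A hedged sketch of throttling options for the project's settings.py follows. The setting names are standard Scrapy settings, but every value here is an illustrative assumption and not tuned for this site.

```python
# settings.py (sketch; all values are assumptions for illustration)

# Wait between requests to the same site, in seconds.
DOWNLOAD_DELAY = 2

# Send requests to the domain one at a time.
CONCURRENT_REQUESTS_PER_DOMAIN = 1

# Let the AutoThrottle extension adapt the delay to server latency.
AUTOTHROTTLE_ENABLED = True
AUTOTHROTTLE_START_DELAY = 2
AUTOTHROTTLE_MAX_DELAY = 30

# Retry a few times when the server answers 429 or a server error.
RETRY_ENABLED = True
RETRY_TIMES = 3
RETRY_HTTP_CODES = [429, 500, 502, 503, 504]
```

Slower crawling does not by itself explain or fix the duplicate rows; it only addresses the blocking.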
