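Many readers who try to make Scrapy pose as a random browser write the spoofing code but never enable the middleware, so it has no effect. I'm using some spare time to put together a complete tutorial. The process is as follows: add the code below to middlewares.py so that every request Scrapy sends carries a randomly chosen browser User-Agent. This helps with anti-scraping checks that look at the User-Agent header, although it does not by itself defeat other measures such as IP rate limiting. Feel free to use it as a reference.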
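# Import the UserAgentMiddleware base class
from scrapy.downloadermiddlewares.useragent import UserAgentMiddleware
# Third-party package: pip install fake-useragent
from fake_useragent import UserAgent


# Set a random User-Agent on every request
class QidianHotUserAgentMiddleware(UserAgentMiddleware):  # inherits from UserAgentMiddleware
    def process_request(self, request, spider):
        ua = UserAgent()
        # Generate a random User-Agent string and put it in the request headers
        request.headers['User-Agent'] = ua.random
        # Print the chosen value so the rotation is visible in the console
        print(request.headers['User-Agent'])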
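If fake_useragent is unavailable, or you would rather not build a UserAgent object on every request, the same idea can be sketched with a fixed pool and the standard library. This is only a minimal sketch under assumptions: the class name StaticPoolUserAgentMiddleware and the User-Agent strings in USER_AGENT_POOL are hypothetical examples, not part of the original project.

```python
import random

# Illustrative User-Agent strings (assumptions, not from the original post);
# replace them with values that suit your target site.
USER_AGENT_POOL = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 "
    "(KHTML, like Gecko) Version/17.0 Safari/605.1.15",
    "Mozilla/5.0 (X11; Linux x86_64; rv:120.0) Gecko/20100101 Firefox/120.0",
]


class StaticPoolUserAgentMiddleware:
    """Hypothetical downloader middleware that picks a User-Agent from a fixed pool."""

    def __init__(self, pool=USER_AGENT_POOL):
        # Copy the pool once, when the middleware is created
        self.pool = list(pool)

    def process_request(self, request, spider):
        # Overwrite the header on every outgoing request
        request.headers['User-Agent'] = random.choice(self.pool)
        # Returning None lets Scrapy continue handling the request normally
        return None
```

A downloader middleware only needs a process_request method, so this class does not have to inherit from UserAgentMiddleware. Like the class above, it does nothing until it is registered in DOWNLOADER_MIDDLEWARES.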
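With QidianHotUserAgentMiddleware written, the remaining step is to enable it in settings.py (and to disable QidianHotDownloaderMiddleware by setting it to None). The configuration is as follows.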
DOWNLOADER_MIDDLEWARES = {
    # None disables this middleware
    'qidian_hot.middlewares.QidianHotDownloaderMiddleware': None,
    # Enable the random User-Agent middleware with order value 100
    'qidian_hot.middlewares.QidianHotUserAgentMiddleware': 100,
}
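To see why these values work, it helps to picture how the numbers are interpreted: entries set to None are dropped, and the remaining middlewares handle requests in ascending order of their value. The sketch below is a simplified stand-in for that rule, not Scrapy's actual implementation. The function resolve_middlewares and the one-entry BASE dict are assumptions made for illustration; 500 is the order value Scrapy assigns to its built-in UserAgentMiddleware by default.

```python
def resolve_middlewares(base, custom):
    """Simplified illustration: merge defaults with project settings,
    drop entries set to None, and sort by order value."""
    merged = {**base, **custom}
    enabled = {path: order for path, order in merged.items() if order is not None}
    return sorted(enabled, key=enabled.get)


# A one-entry excerpt of Scrapy's built-in defaults
BASE = {
    'scrapy.downloadermiddlewares.useragent.UserAgentMiddleware': 500,
}

# The project settings from this tutorial
CUSTOM = {
    'qidian_hot.middlewares.QidianHotDownloaderMiddleware': None,
    'qidian_hot.middlewares.QidianHotUserAgentMiddleware': 100,
}

for path in resolve_middlewares(BASE, CUSTOM):
    print(path)
```

Because 100 is lower than 500, QidianHotUserAgentMiddleware sees each request before the built-in UserAgentMiddleware. The built-in one only fills in the User-Agent header when it is missing, so the random value is kept.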