Scrapy Splash does not work reliably when scraping products loaded with JS

2020-01-11 21:05:37


I am using scrapy and scrapy splash to fetch data from some URLs, such as this [product URL](https://www.tottus.cl/tottus/product/CASA-JOVEN/Hielera-Botellas-Metal,_/20420748) or this [product URL 2](https://www.tottus.cl/tottus/product/KRYZPO/Papas-Fritas-Original/20261602).

I have a Lua script that waits for a fixed time and then returns the HTML:

    script = """
    function main(splash)
      assert(splash:go(splash.args.url))
      assert(splash:wait(4))
      return splash:html()
    end
    """

Then I run it:

    yield SplashRequest(url, self.parse_item, args={'lua_source': script}, endpoint='execute')

From the response I need three elements, which are three different product prices. All three are loaded with JS.

[![Prices](https://i.stack.imgur.com/Sw6n5.png)](https://i.stack.imgur.com/Sw6n5.png)

I already have the XPath expressions for the three elements. The problem is that sometimes they work and sometimes they don't:

    price_strikethrough = response.xpath('//div[@class="price-selector"]/div[@class="prices"]/span[contains(@class,"active-price strikethrough")]/span[1]/text()').extract_first()
    price_offer1 = response.xpath('//div[@class="price-selector"]/div[@class="prices"]/div[contains(@class,"precioDescuento")][1]/text()').extract_first()
    price_offer2 = response.xpath('//div[@class="price-selector"]/div[@class="prices"]/div[contains(@class,"precioDescuento")][2]/text()').extract_first()

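I don't know what else to do to make this work reliably. I tried changing the wait value, but the result is the same: sometimes it works, and sometimes I get no data. What can I do to make sure I always get the data I need?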

User 5736696


There is nothing wrong with your approach; the issue seems to be on the website's side. The site takes a variable amount of time to calculate the prices, so a 4-second wait is not always enough. Increase the wait time in your Lua script to around 7 to 8 seconds.
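As a rough sketch of that suggestion, the wait can be passed in as an argument instead of being hard-coded, so it is easy to tune per site. The second script is a different technique from the fixed wait above: it polls with `splash:select` until an element shows up or a timeout is reached. The names `FIXED_WAIT_SCRIPT`, `POLLING_SCRIPT`, `build_splash_args`, and the `css` / `max_wait` arguments are hypothetical, and the CSS selector in the usage comment is only guessed from the XPath in the question. It also assumes the price container is absent until the JS has finished, which may not hold for this site.

```python
# Sketch only: these names are hypothetical, not part of scrapy-splash.

# Variant 1: the script from the question, with the wait read from the
# request arguments (splash.args.wait) instead of a hard-coded 4 seconds.
FIXED_WAIT_SCRIPT = """
function main(splash)
  assert(splash:go(splash.args.url))
  assert(splash:wait(splash.args.wait))
  return splash:html()
end
"""

# Variant 2 (assumption, not from the answer): poll every 0.5 s until an
# element matching splash.args.css exists, or stop after max_wait seconds.
POLLING_SCRIPT = """
function main(splash)
  assert(splash:go(splash.args.url))
  local waited = 0
  while not splash:select(splash.args.css) and waited < splash.args.max_wait do
    splash:wait(0.5)
    waited = waited + 0.5
  end
  return splash:html()
end
"""


def build_splash_args(wait=8.0, css=None, max_wait=15.0):
    """Build the `args` dict to pass to SplashRequest(..., endpoint='execute')."""
    if css is None:
        # Fixed wait, around 7 to 8 seconds as suggested above.
        return {"lua_source": FIXED_WAIT_SCRIPT, "wait": wait}
    # Polling: return as soon as the element appears.
    return {"lua_source": POLLING_SCRIPT, "css": css, "max_wait": max_wait}


# Usage inside the spider (requires scrapy-splash):
# yield SplashRequest(url, self.parse_item, endpoint='execute',
#                     args=build_splash_args(wait=8))
# yield SplashRequest(url, self.parse_item, endpoint='execute',
#                     args=build_splash_args(css='div.price-selector div.prices'))
```

Either way, `parse_item` should still handle `extract_first()` returning `None`, since a slow page can exceed any fixed limit.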


2020-01-16 12:05:31


Author:

User 10853054
