How to make Selenium not wait for the full page load when a script on the page is slow?

Date: 2023-06-05


Problem description

Selenium's driver.get(url) waits until the whole page has loaded. The page I am scraping tries to load a dead JS script, so my Python script waits for it and hangs for a few minutes. The problem can occur on every page of the site.

from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Chrome()
driver.get('https://www.cortinadecor.com/productos/17/estores-enrollables-screen/estores-screen-corti-3000')
# The page tries to load: https://www.cetelem.es/eCommerceCalculadora/resources/js/eCalculadoraCetelemCombo.js
driver.find_element(By.NAME, 'ANCHO').send_keys("100")

How can I limit the wait time, block the AJAX load of that file, or work around this in some other way?

I am testing the script with webdriver.Chrome(), but I will use PhantomJS(), or possibly Firefox(), later. So if a method relies on changing browser settings, it must work across browsers.

Recommended answer

When Selenium loads a page/URL, it follows its default configuration, in which pageLoadStrategy is set to normal. To stop Selenium from waiting for the full page load, configure pageLoadStrategy. It supports the following 3 values:

normal (full page load)
eager (interactive)
none

Here is the code to configure pageLoadStrategy:

Firefox:

from selenium import webdriver
from selenium.webdriver.common.desired_capabilities import DesiredCapabilities

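caps = DesiredCapabilities.FIREFOX.copy()  # copy so the shared default dict is not modified
caps["pageLoadStrategy"] = "normal"  # wait for document.readyState == "complete"
# caps["pageLoadStrategy"] = "eager"  # wait for document.readyState == "interactive"
# caps["pageLoadStrategy"] = "none"   # return without waiting for the page to load
driver = webdriver.Firefox(desired_capabilities=caps, executable_path=r'C:\path\to\geckodriver.exe')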
driver.get("http://google.com")

Chrome:

from selenium import webdriver
from selenium.webdriver.common.desired_capabilities import DesiredCapabilities

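caps = DesiredCapabilities.CHROME.copy()  # copy so the shared default dict is not modified
caps["pageLoadStrategy"] = "normal"  # wait for document.readyState == "complete"
# caps["pageLoadStrategy"] = "eager"  # wait for document.readyState == "interactive"
# caps["pageLoadStrategy"] = "none"   # return without waiting for the page to load
driver = webdriver.Chrome(desired_capabilities=caps, executable_path=r'C:\path\to\chromedriver.exe')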
driver.get("http://google.com")
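
The snippets above pass desired_capabilities and executable_path, which belong to the Selenium 3 API and were deprecated in Selenium 4. As a rough sketch for newer installations, assuming Selenium 4 or later is installed, the same setting is available as the page_load_strategy property of the browser options object. The helper names check_strategy and make_driver are hypothetical and exist only for this illustration.

```python
VALID_STRATEGIES = ("normal", "eager", "none")


def check_strategy(strategy):
    """Return the strategy unchanged if it is one of the three supported values."""
    if strategy not in VALID_STRATEGIES:
        raise ValueError(f"unknown pageLoadStrategy: {strategy!r}")
    return strategy


def make_driver(browser="chrome", strategy="eager"):
    """Start Chrome or Firefox with the given page load strategy.

    Assumes Selenium 4+, where the options object carries the setting.
    """
    strategy = check_strategy(strategy)

    # Imported lazily so check_strategy() can be used without Selenium installed.
    from selenium import webdriver

    if browser == "chrome":
        options = webdriver.ChromeOptions()
        options.page_load_strategy = strategy
        return webdriver.Chrome(options=options)
    if browser == "firefox":
        options = webdriver.FirefoxOptions()
        options.page_load_strategy = strategy
        return webdriver.Firefox(options=options)
    raise ValueError(f"unsupported browser: {browser!r}")
```

For example, driver = make_driver("firefox", "none") followed by driver.get(url) should return without waiting for the slow script, since the strategy is set before the session starts.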

Note: the pageLoadStrategy values normal, eager and none are required by the WebDriver W3C Editor's Draft, but eager was still a work in progress (WIP) in the ChromeDriver implementation when this answer was written. A detailed discussion can be found in "Eager" Page Load Strategy workaround for Chromedriver Selenium in Python.
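
With pageLoadStrategy set to none (or eager), driver.get() can return before the element the script needs exists, so a lookup such as the question's ANCHO field may fail unless the script waits for that element explicitly. Below is a minimal sketch of such a wait. In real code Selenium's own WebDriverWait with expected_conditions.presence_of_element_located does this job; the hand-rolled polling loop is a stand-in used here only to keep the example free of dependencies. The function name wait_for_named_element and the timeout and poll defaults are assumptions.

```python
import time


def wait_for_named_element(driver, name, timeout=10.0, poll=0.5):
    """Poll until an element with the given name attribute exists, then return it."""
    deadline = time.monotonic() + timeout
    while True:
        # "name" is the string value of Selenium's By.NAME locator strategy.
        matches = driver.find_elements("name", name)
        if matches:
            return matches[0]
        if time.monotonic() >= deadline:
            raise TimeoutError(f"no element named {name!r} after {timeout} s")
        time.sleep(poll)


# Usage with a driver started with pageLoadStrategy "none":
# driver.get(url)  # returns without waiting for the dead script
# wait_for_named_element(driver, "ANCHO").send_keys("100")
```

The wait is bounded by timeout, so a page that never renders the field makes the script fail after a known delay instead of hanging for minutes.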
