我是刮毛的新手,我有一个问题。我在搜集Worldeter的Covid数据。因为它是动态的-我正在用硒做这件事。
代码如下:
from selenium import webdriver
import time
URL = "https://www.worldometers.info/coronavirus/"
# Start the Driver
driver = webdriver.Chrome(executable_path = r"C:Webdriverchromedriver.exe")
# Hit the url and wait for 10 seconds.
driver.get(URL)
time.sleep(10)
#find class element
data= driver.find_elements_by_class_name("odd" and "even")
#for loop
for d in data:
country=d.find_element_by_xpath(".//*[@id='main_table_countries_today']").text
print(country)
当前输出:
NoSuchElementException: Message: no such element: Unable to locate element: {"method":"xpath","selector":".//*[@id='main_table_countries_today']"}
(Session info: chrome=96.0.4664.45)
要在worldometers covid data中刮取表,您需要为visibility_of_element_located()归纳WebDriverWait,使用DataFrame从Pandas可以使用以下Locator Strategy:
挡路代码:
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
import pandas as pd
options = Options()
options.add_argument("start-maximized")
s = Service('C:\BrowserDrivers\chromedriver.exe')
driver = webdriver.Chrome(service=s, options=options)
driver.get("https://www.worldometers.info/coronavirus/")
data = WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.CSS_SELECTOR, "table#main_table_countries_today"))).get_attribute("outerHTML")
df = pd.read_html(data)
print(df)
driver.quit()
控制台输出:
[ # Country,Other TotalCases NewCases ... Deaths/1M pop TotalTests Tests/ 1M pop Population
0 NaN World 264359298 632349.0 ... 673.3 NaN NaN NaN
1 1.0 USA 49662381 89259.0 ... 2415.0 756671013.0 2267182.0 3.337495e+08
2 2.0 India 34609741 3200.0 ... 336.0 643510926.0 459914.0 1.399198e+09
3 3.0 Brazil 22118782 12910.0 ... 2865.0 63776166.0 297051.0 2.146975e+08
4 4.0 UK 10329074 53945.0 ... 2124.0 364875273.0 5335159.0 6.839070e+07
.. ... ... ... ... ... ... ... ... ...
221 221.0 Samoa 3 NaN ... NaN NaN NaN 2.002800e+05
222 222.0 Saint Helena 2 NaN ... NaN NaN NaN 6.103000e+03
223 223.0 Micronesia 1 NaN ... NaN NaN NaN 1.167290e+05
224 224.0 Tonga 1 NaN ... NaN NaN NaN 1.073890e+05
225 NaN Total: 264359298 632349.0 ... 673.3 NaN NaN NaN
[226 rows x 15 columns]]
这篇关于抓取动态数据硒-无法定位元素的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持跟版网!