Answers:
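You can use
driver.execute_script("window.scrollTo(0, Y)")
where Y is the height (1080 on a full HD monitor). (Thanks to @lukeis)
You can also use
driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
to scroll to the bottom of the page.
To scroll a page with infinite loading, such as social network pages like Facebook (thanks to @Cuong Tran):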
import time

SCROLL_PAUSE_TIME = 0.5
# Get scroll height
last_height = driver.execute_script("return document.body.scrollHeight")
while True:
    # Scroll down to bottom
    driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
    # Wait to load page
    time.sleep(SCROLL_PAUSE_TIME)
    # Calculate new scroll height and compare with last scroll height
    new_height = driver.execute_script("return document.body.scrollHeight")
    if new_height == last_height:
        break
    last_height = new_height
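Another method (thanks to Juanse) is to select an element and send it a key press:
label.send_keys(Keys.PAGE_DOWN)
What does scrollHeight mean, and how does it work in general?
To scroll down to the bottom of an infinite page (for example linkedin.com), you can use the following code: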
import time

SCROLL_PAUSE_TIME = 0.5
# Get scroll height
last_height = driver.execute_script("return document.body.scrollHeight")
while True:
    # Scroll down to bottom
    driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
    # Wait to load page
    time.sleep(SCROLL_PAUSE_TIME)
    # Calculate new scroll height and compare with last scroll height
    new_height = driver.execute_script("return document.body.scrollHeight")
    if new_height == last_height:
        break
    last_height = new_height
The right SCROLL_PAUSE_TIME varies; for me it takes about 2 seconds.
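Since the right pause differs per site, it may help to wrap the loop in a function that takes the pause as a parameter and also caps the number of scrolls, so a truly endless feed cannot loop forever. This is only a sketch: scroll_to_bottom, max_rounds and the GrowingPage stand-in are names made up here for illustration, and GrowingPage merely imitates a driver so the logic can be tried without a browser.

```python
import time


def scroll_to_bottom(driver, pause=0.5, max_rounds=50):
    """Scroll until the page height stops growing or max_rounds is reached.

    Returns the number of scrolls performed.
    """
    last_height = driver.execute_script("return document.body.scrollHeight")
    for rounds in range(1, max_rounds + 1):
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
        time.sleep(pause)  # give lazily loaded content time to arrive
        new_height = driver.execute_script("return document.body.scrollHeight")
        if new_height == last_height:
            return rounds
        last_height = new_height
    return max_rounds


class GrowingPage:
    """Hypothetical stand-in for a WebDriver: each scroll loads more content."""

    def __init__(self, heights):
        self._heights = list(heights)
        self._index = 0

    def execute_script(self, script):
        if script.startswith("return"):
            return self._heights[self._index]
        # A scroll "loads" the next height until the list is exhausted.
        self._index = min(self._index + 1, len(self._heights) - 1)
        return None
```

With a real driver the call would look like scroll_to_bottom(driver, pause=2). The cap trades completeness for a guaranteed exit, so pick max_rounds generously if you need the whole page.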
You can use send_keys to simulate pressing the END (or PAGE_DOWN) key, which normally scrolls the page:
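from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys
# find_element(By.TAG_NAME, ...) replaces find_element_by_tag_name,
# which was removed in Selenium 4.3
html = driver.find_element(By.TAG_NAME, 'html')
html.send_keys(Keys.END)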
from selenium.webdriver.common.by import By
element = driver.find_element(By.XPATH, "xpath of the li you are trying to access")
element.location_once_scrolled_into_view
This helped when I was trying to access an 'li' that was not visible.
location_once_scrolled_into_view is not called with () because it is a Python property. See the source code here: selenium/webelement.py at d3b6ad006bd7dbee59f8539d81cee4f06bd81d64 · SeleniumHQ/selenium
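To see why the parentheses are left off, here is a small sketch of how a Python property behaves. FakeElement is a hypothetical class written for this example, not Selenium's WebElement, and the coordinates it returns are made up; the point is only that reading the attribute already runs the method.

```python
class FakeElement:
    """Hypothetical stand-in for a WebElement, used only to show property access."""

    def __init__(self):
        self.scroll_calls = 0

    @property
    def location_once_scrolled_into_view(self):
        # Reading the attribute runs this method, which is where the
        # side effect happens (here a counter; in Selenium, a scroll).
        self.scroll_calls += 1
        return {"x": 0, "y": 120}


element = FakeElement()
location = element.location_once_scrolled_into_view  # no parentheses
```

Adding () would try to call the returned dictionary and raise a TypeError.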
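None of these answers worked for me, at least not for scrolling down a Facebook search results page, but after a lot of testing I found this solution:
from selenium.webdriver.common.by import By
while driver.find_element(By.TAG_NAME, 'div'):
    driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
    Divs = driver.find_element(By.TAG_NAME, 'div').text
    if 'End of Results' in Divs:
        print('end')
        break
    else:
        continue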
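If you set SCROLL_PAUSE_TIME in stackoverflow.com/a/27760083/7326714 to 2, it works well and scrolls down 100 times faster.
On YouTube, floating elements make the body's scroll height come back as "0", so instead of "return document.body.scrollHeight" try "return document.documentElement.scrollHeight". Adjust the scroll pause time to your internet speed; otherwise the loop will run only once and then break.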
import time

SCROLL_PAUSE_TIME = 1
# Get scroll height
# last_height = driver.execute_script("return document.body.scrollHeight")
# The line above doesn't work due to floating web elements on YouTube
last_height = driver.execute_script("return document.documentElement.scrollHeight")
while True:
    # Scroll down to bottom
    driver.execute_script("window.scrollTo(0,document.documentElement.scrollHeight);")
    # Wait to load page
    time.sleep(SCROLL_PAUSE_TIME)
    # Calculate new scroll height and compare with last scroll height
    new_height = driver.execute_script("return document.documentElement.scrollHeight")
    if new_height == last_height:
        print("break")
        break
    last_height = new_height
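If you don't know in advance which of the two values a site reports correctly, one option is to read both and take the larger. This is just a sketch under that assumption: page_height and the CannedDriver stand-in are hypothetical names, and CannedDriver only returns fixed values so the helper can be tried without a browser.

```python
def page_height(driver):
    """Return the larger of the two scroll heights reported by the page."""
    body = driver.execute_script("return document.body.scrollHeight")
    root = driver.execute_script("return document.documentElement.scrollHeight")
    # Treat a missing value as 0 so max() always compares numbers.
    return max(body or 0, root or 0)


class CannedDriver:
    """Hypothetical stand-in for a WebDriver that returns fixed values per script."""

    def __init__(self, body_height, root_height):
        self._values = {
            "return document.body.scrollHeight": body_height,
            "return document.documentElement.scrollHeight": root_height,
        }

    def execute_script(self, script):
        return self._values[script]
```

In the loop above you would then compare page_height(driver) before and after each scroll instead of a single hard-coded expression.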
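I was looking for a way to scroll through a dynamic web page and stop automatically once the end of the page is reached, and found this thread.
The post by @Cuong Tran, with one main modification, was the answer I was looking for. I thought others might find the modification helpful (it has a pronounced effect on how the code works), hence this post.
The modification is to move the statement that captures the last page height inside the loop, so that each check compares against the previous page height.
So, the code below continuously scrolls down a dynamic web page (.scrollTo()), stopping only when the page height stays the same for one iteration.
(There is one more modification: the break statement sits inside a second condition, in case the page "sticks". That second check can be removed.)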
import time

SCROLL_PAUSE_TIME = 0.5
while True:
    # Get scroll height
    ### This is the difference. Moving this *inside* the loop
    ### means that it checks if scrollTo is still scrolling
    last_height = driver.execute_script("return document.body.scrollHeight")
    # Scroll down to bottom
    driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
    # Wait to load page
    time.sleep(SCROLL_PAUSE_TIME)
    # Calculate new scroll height and compare with last scroll height
    new_height = driver.execute_script("return document.body.scrollHeight")
    if new_height == last_height:
        # try again (can be removed)
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
        # Wait to load page
        time.sleep(SCROLL_PAUSE_TIME)
        # Calculate new scroll height and compare with last scroll height
        new_height = driver.execute_script("return document.body.scrollHeight")
        # check if the page height has remained the same
        if new_height == last_height:
            # if so, you are done
            break
        # if not, move on to the next loop
        else:
            last_height = new_height
            continue
This code scrolls to the bottom but does not require you to wait each time. It keeps scrolling, then stops at the bottom (or times out).
from selenium import webdriver
import time

driver = webdriver.Chrome(executable_path='chromedriver.exe')
driver.get('https://example.com')
pre_scroll_height = driver.execute_script('return document.body.scrollHeight;')
run_time, max_run_time = 0, 1
while True:
    iteration_start = time.time()
    # Scroll webpage, the 100 allows for a more 'aggressive' scroll
    driver.execute_script('window.scrollTo(0, 100*document.body.scrollHeight);')
    post_scroll_height = driver.execute_script('return document.body.scrollHeight;')
    scrolled = post_scroll_height != pre_scroll_height
    timed_out = run_time >= max_run_time
    if scrolled:
        run_time = 0
        pre_scroll_height = post_scroll_height
    elif not scrolled and not timed_out:
        run_time += time.time() - iteration_start
    elif not scrolled and timed_out:
        break
# closing the driver is optional
driver.close()
This is much faster than waiting 0.5-3 seconds every time for a response that might take only 0.1 seconds.
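The same idea, returning as soon as the height changes rather than sleeping a fixed time, can also be written as a small polling helper with a deadline. Treat this as a sketch: wait_for_height_change, its timeout and poll parameters, and the SlowPage stand-in are all names invented here, and SlowPage only imitates a driver whose height changes after a few reads.

```python
import time


def wait_for_height_change(driver, old_height, timeout=1.0, poll=0.05):
    """Poll the page height until it differs from old_height or timeout elapses.

    Returns the new height, or None if the height never changed.
    """
    deadline = time.monotonic() + timeout
    while True:
        height = driver.execute_script("return document.body.scrollHeight;")
        if height != old_height:
            return height
        if time.monotonic() >= deadline:
            return None
        time.sleep(poll)


class SlowPage:
    """Hypothetical stand-in for a WebDriver: height changes after a few polls."""

    def __init__(self, old_height, new_height, polls_before_change):
        self._old, self._new = old_height, new_height
        self._remaining = polls_before_change

    def execute_script(self, script):
        if self._remaining > 0:
            self._remaining -= 1
            return self._old
        return self._new
```

A scroll loop would call this after each scrollTo and stop when it returns None. A fast page then costs only a poll or two, while a slow one still gets the full timeout.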
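For pages that load content as you scroll, for example Medium, Quora, etc.:
import time

# Assumed pause between scrolls; tune it for your connection
SCROLL_PAUSE_TIME = 2
last_height = driver.execute_script("return document.body.scrollHeight")
while True:
    driver.execute_script("window.scrollTo(0, document.body.scrollHeight-1000);")
    # Wait to load the page. implicitly_wait() only sets the timeout for
    # element lookups and does not pause the script, so sleep explicitly.
    time.sleep(SCROLL_PAUSE_TIME)
    new_height = driver.execute_script("return document.body.scrollHeight")
    if new_height == last_height:
        break
    last_height = new_height
# sleep for 30s
time.sleep(30)
driver.quit()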