      A better way to take screenshots of a URL in Python

        Date: 2024-08-21


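                  Question

                  Problem description

                  I am currently working on a project that requires me to browse to a URL and take a screenshot of the web page.

                  After looking through various resources, I found 3 ways to do this. All 3 methods, as I currently use them, are described below.

                  Method 1: PhantomJS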

                  from selenium import webdriver
                  import time
                  import sys
                  
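                  # Note: webdriver.PhantomJS was deprecated in Selenium 3 and is not available in Selenium 4
                  print('Without Headless')
                  _start = time.time()
                  br = webdriver.PhantomJS()
                  br.get('http://' + sys.argv[1])
                  br.save_screenshot('screenshot-phantom.png')
                  br.quit()  # the original wrote br.quit without parentheses, which never closed the browser
                  _end = time.time()
                  print('Total time for non-headless {}'.format(_end - _start))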
                  

                  Method 2: Headless browser

                  from selenium import webdriver
                  from selenium.webdriver.chrome.options import Options
                  import time
                  import sys
                  
                  print('Headless')
                  _start = time.time()
                  options = Options()
                  options.add_argument("--headless") # Runs Chrome in headless mode.
                  options.add_argument('--no-sandbox') # Bypass OS security model
                  options.add_argument('start-maximized')
                  options.add_argument('disable-infobars')
                  options.add_argument("--disable-extensions")
                  # Selenium 3.x call style; newer Selenium releases take the driver path via a Service object instead
                  driver = webdriver.Chrome(options=options, executable_path='/usr/bin/chromedriver')
                  driver.get('http://' + sys.argv[1])
                  driver.save_screenshot('screenshot-headless.png')
                  driver.quit()
                  _end = time.time()
                  print('Total time for headless {}'.format(_end - _start))
                  

                  Method 3: PyQt

                  import logging
                  import os
                  import sys
                  import time
                  
                  from PyQt4.QtCore import *
                  from PyQt4.QtGui import *
                  from PyQt4.QtWebKit import *
                  
                  # Module-level logger used by the Screenshot class below
                  _logger = logging.getLogger(__name__)
                  
                  class Screenshot(QWebView):
                      def __init__(self):
                          self.app = QApplication(sys.argv)
                          QWebView.__init__(self)
                          self._loaded = False
                          self.loadFinished.connect(self._loadFinished)
                  
                      def capture(self, url, output_file):
                          _logger.info('Received url {}'.format(url))
                          _start = time.time()
                          try:
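                              # Prepend a scheme if the URL does not already have one
                              # (the original compared url[0:3] with 'http', which can never match)
                              if not url.startswith(('http://', 'https://')):
                                  url = 'http://' + url
                              self.url = url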
                              self.load(QUrl(url))
                              self.wait_load(url)
                              # set to webpage size
                              frame = self.page().mainFrame()
                              self.page().setViewportSize(frame.contentsSize())
                              # render image
                              image = QImage(self.page().viewportSize(), QImage.Format_ARGB32)
                              painter = QPainter(image)
                              frame.render(painter)
                              painter.end()
                              _logger.info('Saving screenshot {} for {}'.format(output_file,url))
                              image.save(os.path.join(os.path.dirname(os.path.realpath(__file__)),'data',output_file))
                          except Exception as e:
                              _logger.error('Error in capturing screenshot {} - {}'.format(url,e))
                          _end = time.time()
                          _logger.info('Time took for processing url {} - {}'.format(url,_end - _start))
                  
                      def wait_load(self,url,delay=1,retry_count=60):
                          # process app events until the page has loaded or the retries run out
                          while not self._loaded and retry_count:
                              _logger.info('wait_load for url {} retry_count {}'.format(url,retry_count))
                              self.app.processEvents()
                              time.sleep(delay)
                              retry_count -= 1
                          # only report expiry when the page never finished loading
                          if not self._loaded:
                              _logger.warning('wait_load for url {} expired'.format(url))
                          self._loaded = False
                  
                      def _loadFinished(self, result):
                          self._loaded = True
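
The scheme check in `capture` can also be done by parsing the URL rather than comparing string prefixes. Below is a minimal sketch in Python 3, using only the standard library; the helper name `ensure_scheme` and its `default_scheme` parameter are illustrative assumptions, not part of the original code.

```python
from urllib.parse import urlsplit


def ensure_scheme(url, default_scheme="http"):
    """Return url unchanged if it already uses http or https, else prepend a scheme."""
    url = url.strip()
    # urlsplit reports an empty scheme for bare host names such as "example.com"
    scheme = urlsplit(url).scheme.lower()
    if scheme in ("http", "https"):
        return url
    return "{}://{}".format(default_scheme, url)
```

For example, `ensure_scheme("example.com")` gives `http://example.com`, while a host that merely starts with the letters "http", such as `httpfoo.com`, still gets a scheme prepended.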
                  

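                  Issue faced:

                  In use, all 3 of these methods get stuck on one error or another; one such issue is asked about in "Error Question on Stackoverflow". So, of these 3 methods, which is the most effective way to take a screenshot of a web page in Python, and which will hold up in a large-scale deployment?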

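                  Recommended answer

                  Taken from https://gist.github.com/fabtho/13e4a2e7cfbfde671b8fa81bbe9359fb and rewritten for Python 3.

                  This method technically works, but the result does not look good: many websites show a cookie-consent popup, and it ends up in every slice of the screenshot. Depending on the sites you capture, you may want to remove these popups with Selenium before starting the screenshot process.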
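
One possible way to do that popup removal is to delete the overlay elements with a short piece of JavaScript before taking any screenshot. The sketch below is only an illustration: the helper name `remove_overlays` is hypothetical, and the CSS selectors have to be found by inspecting each target site. It relies on Selenium's `execute_script`, which passes extra arguments to the script as `arguments[0]`, `arguments[1]`, and so on.

```python
# JavaScript run inside the page: remove every element matching any of the
# CSS selectors passed in as arguments[0], and return how many were removed.
REMOVE_JS = """
var selectors = arguments[0], removed = 0;
selectors.forEach(function (sel) {
    document.querySelectorAll(sel).forEach(function (el) {
        el.remove();
        removed += 1;
    });
});
return removed;
"""


def remove_overlays(driver, selectors):
    """Delete overlay elements from the current page and return the count removed."""
    return driver.execute_script(REMOVE_JS, list(selectors))
```

A call might look like `remove_overlays(browser, ['#cookie-banner', '.consent-modal'])`, where both selectors are made-up examples. Some sites also lock scrolling while the popup is open, so deleting the element alone may not be enough everywhere.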

                  from selenium import webdriver
                  from PIL import Image
                  from io import BytesIO
                  
                  verbose = 1
                  
                  browser = webdriver.Chrome(executable_path='C:/yourpath/chromedriver.exe')
                  browser.get('http://stackoverflow.com/questions/37906704/taking-a-whole-page-screenshot-with-selenium-marionette-in-python')
                  
                  # from here http://stackoverflow.com/questions/1145850/how-to-get-height-of-entire-document-with-javascript
                  js = 'return Math.max( document.body.scrollHeight, document.body.offsetHeight,  document.documentElement.clientHeight,  document.documentElement.scrollHeight,  document.documentElement.offsetHeight);'
                  
                  scrollheight = browser.execute_script(js)
                  
                  if verbose > 0: 
                      print(scrollheight)
                  
                  slices = []
                  offset = 0
                  while offset < scrollheight:
                      if verbose > 0: 
                          print(offset)
                  
                      browser.execute_script("window.scrollTo(0, %s);" % offset)
                      img = Image.open(BytesIO(browser.get_screenshot_as_png()))
                      offset += img.size[1]
                      slices.append(img)
                  
                      if verbose > 0:
                          browser.get_screenshot_as_file('%s/screen_%s.png' % ('/tmp', offset))
                          print(scrollheight)
                  
                  
                  screenshot = Image.new('RGB', (slices[0].size[0], offset))
                  offset = 0
                  for img in slices:
                      screenshot.paste(img, (0, offset))
                      offset += img.size[1]
                  
                  screenshot.save('screenshot.png')
                  browser.quit()
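
One thing to watch in the loop above: a browser cannot scroll past the bottom of the page, so when the page height is not a multiple of the viewport height, the last screenshot repeats rows that the previous slice already captured. A hedged sketch of how the scroll positions could be planned to avoid that follows. The function name `plan_slices` is hypothetical, and it assumes screenshot pixels match CSS pixels (device pixel ratio of 1).

```python
def plan_slices(total_height, viewport_height):
    """Return (scroll_y, crop_top) pairs that cover the page without overlap.

    scroll_y is where to scroll before taking a screenshot; crop_top is how
    many rows to cut from the top of that screenshot because the previous
    slice already captured them.
    """
    if total_height <= 0 or viewport_height <= 0:
        raise ValueError("heights must be positive")
    # The browser clamps scrolling to this value
    max_scroll = max(total_height - viewport_height, 0)
    plan = []
    covered = 0  # rows of the page captured so far
    while covered < total_height:
        scroll_y = min(covered, max_scroll)
        crop_top = covered - scroll_y
        plan.append((scroll_y, crop_top))
        covered = scroll_y + viewport_height
    return plan
```

For a 2500-row page and a 1000-row viewport, the last entry of the plan is `(1500, 500)`: scroll to 1500 and drop the first 500 rows of that screenshot. With Pillow, that crop could be done with `img.crop((0, crop_top, img.size[0], img.size[1]))` before pasting the slice.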
                  
