
Latest One Piece Manga Images _ One Piece Manga 52pk

Source: reposted from the internet. Date: 2023-12-10 16:54:43

Building the Tool Modules

1. A User-Agent module that hides the client's identity, so the target server cannot tell who is sending the requests.

```python
import random

# Pool of real browser User-Agent strings; one is picked at random per request.
user_agent_data = [
    {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/65.0.3314.0 Safari/537.36 SE 2.X MetaSr 1.0"},
    {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.102 Safari/537.36 Edge/18.18362"},
    {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/78.0.3904.108 Safari/537.36 QIHU 360SE"},
    {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.162 Safari/537.36"},
    {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.106 Safari/537.36"},
    {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.163 Safari/537.36"},
    {"User-Agent": "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.71 Safari/537.36"},
    {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.25 Safari/537.36 Core/1.70.3722.400 QQBrowser/10.5.3751.400"},
    {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.102 Safari/537.36 Edge/18.18363"},
    {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/78.0.3904.70 Safari/537.36"},
    {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.25 Safari/537.36 Core/1.70.3765.400 QQBrowser/10.6.4153.400"},
    {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.25 Safari/537.36 Core/1.70.3765.400 QQBrowser/10.6.4153.400"},
    {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.106 Safari/537.36"},
    {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/64.0.3282.204 Safari/537.36"},
    {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.102 Safari/537.36 Edge/18.18362"},
    {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/81.0.4044.138 Safari/537.36"},
    {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.102 Safari/537.36 Edge/18.18362"},
    {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.132 Safari/537.36"},
    {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.102 Safari/537.36 Edge/18.18362"},
    {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64; ServiceUI 14) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.102 Safari/537.36 Edge/18.18362"},
    {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.106 Safari/537.36"},
    {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; …) Gecko/20100101 Firefox/77.0"},
]

def get_headers():
    """Pick a random request header."""
    index = random.randint(0, len(user_agent_data) - 1)
    return user_agent_data[index]

if __name__ == '__main__':
    headers = get_headers()
    print("Random UA:", headers)
```
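The randint indexing above can also be written with random.choice, which picks a list element directly; a minimal sketch with two stand-in entries (any of the User-Agent dicts above would work the same way):

```python
import random

# Stand-in entries; placeholders, not real User-Agent strings.
user_agent_data = [
    {"User-Agent": "UA-example-1"},
    {"User-Agent": "UA-example-2"},
]

# Equivalent to user_agent_data[random.randint(0, len(user_agent_data) - 1)]
headers = random.choice(user_agent_data)
```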
          
2. A dynamic IP proxy pool, to keep the crawler's IP from being banned; working proxies are already stored in ippool.json.

```python
import json
import random

def get_proxies():
    """Pick a random proxy from the pool."""
    # Read the proxy pool file
    with open("./ipfile/ippool.json", "r", encoding="utf-8") as rfile:
        proxy_lists = json.load(rfile)
    index = random.randint(0, len(proxy_lists) - 1)
    return proxy_lists[index]

if __name__ == '__main__':
    proxies = get_proxies()
    print("Random proxy:", proxies)
```
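The post never shows what ippool.json contains. A plausible layout, assuming each entry is a requests-style proxy mapping (the addresses below are made-up documentation IPs, not real proxies):

```python
import json
import os
import random
import tempfile

# Hypothetical ippool.json contents: a list of requests-style proxy mappings.
sample_pool = [
    {"http": "http://203.0.113.10:8080", "https": "http://203.0.113.10:8080"},
    {"http": "http://198.51.100.7:3128", "https": "http://198.51.100.7:3128"},
]

# Write the sample pool to a temporary file, then read it back
# the same way get_proxies() does.
path = os.path.join(tempfile.mkdtemp(), "ippool.json")
with open(path, "w", encoding="utf-8") as wfile:
    json.dump(sample_pool, wfile)

with open(path, "r", encoding="utf-8") as rfile:
    pool = json.load(rfile)
proxy = random.choice(pool)
```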
          

Scraping the Manga Index Page

1. Target: http://kanbook.net/328
2. Fields scraped: title, page count, and href suffix, saved to a JSON file.

```python
import requests
import useragenttool
import proxytool
from lxml import etree
import json
import os

class OnePieceSpider(object):
    def __init__(self):
        self.url = "http://kanbook.net/328"
        self.html_data = None
        self.one_piece_data_list = []

    def get_url_html(self):
        """Fetch and decode the page source."""
        headers = useragenttool.get_headers()
        # Extra headers to disguise the request
        headers["Accept-Encoding"] = "deflate, sdch, br"
        headers["Content-Type"] = "text/html; charset=UTF-8"
        headers["Referer"] = "https://kanbook.net/328/3/1/1"  # referer page
        response = requests.get(url=self.url,
                                headers=headers,
                                proxies=proxytool.get_proxies())
        self.html_data = response.content.decode("utf-8")

    def catch_html_data(self):
        """Extract the fields from the page source."""
        data_parse = etree.HTML(self.html_data)
        li_list = data_parse.xpath("//p[@aria-labelledby='3-tab']/ol/li")
        # Iterate in reverse so volumes come out in ascending order
        for li_element in li_list[::-1]:
            h_name = li_element.xpath("./a/@href")[0]                # link suffix
            title = li_element.xpath("./a/@title")[0]                # title
            page = int(li_element.xpath("./a/span/text()")[0][1:4])  # page count
            one_piece_item = {
                "title": title,
                "postfix": h_name,
                "page": page
            }
            self.one_piece_data_list.append(one_piece_item)
            print("Item added!")

    def save_data_file(self):
        """Save the scraped data."""
        path = "./image_url"
        if not os.path.exists(path):
            os.mkdir(path)
        with open(path + "/one_piece_data.json", "w", encoding="utf-8") as file:
            json.dump(self.one_piece_data_list, file, ensure_ascii=False, indent=2)
        print("Data saved!")

    def run(self):
        # Start the spider
        self.get_url_html()
        self.catch_html_data()
        self.save_data_file()

def main():
    spider = OnePieceSpider()
    spider.run()

if __name__ == '__main__':
    main()
```
          

Downloading All the Full-Color One Piece Manga Images

- Note: the request headers must include a Referer pointing back to a page on the manga site itself.
- The while True loops exist so that every volume's images eventually download: a successful fetch breaks out of the loop, and any exception triggers a retry.

```python
import requests
import useragenttool
import proxytool
import time
import random
import json
import os
import re
import urllib3

urllib3.disable_warnings()

class OnePieceImageSpider(object):
    def __init__(self):
        self.url = ""

    def set_url(self, out_url):
        """Set the base URL."""
        self.url = out_url

    def get_url_list(self, num):
        """Build the URL for each of the num pages."""
        url_list = []
        for page in range(1, num + 1):
            url_list.append(self.url.format(page))
        return url_list

    def get_url_html(self, inner_url):
        """Fetch the raw page content."""
        headers = useragenttool.get_headers()
        headers["Accept-Encoding"] = "deflate, sdch, br"
        headers["Content-Type"] = "text/html; charset=UTF-8"
        headers["Referer"] = "https://kanbook.net/328/3/6"  # referer page
        response = requests.get(url=inner_url,
                                headers=headers,
                                proxies=proxytool.get_proxies(),
                                timeout=30,
                                verify=False)
        # Random delay to throttle the crawl
        time.sleep(random.randint(1, 6))
        return response.content

    def __download_image(self, image_url, name, index):
        """Download one image.

        :param image_url: image address
        :param name: folder name
        :param index: image number
        """
        while True:
            try:
                if len(image_url) == 0:
                    break
                content = self.get_url_html(image_url)
                path = "./onepieceimage/%s" % name
                if not os.path.exists(path):
                    os.mkdir(path)
                with open(path + "/%d.jpg" % index, "wb") as wfile:
                    wfile.write(content)
                break
            except Exception as msg:
                print("Exception, error:", msg)

    def run(self, url_list, title):
        # Fetch each page, pull out the image URL, then download it
        index = 2
        for url in url_list:
            while True:
                try:
                    data = self.get_url_html(url).decode("utf-8")
                    # The image list is embedded in the page as a JS array
                    regex = r"""var img_list=(\[.+])"""
                    result = re.findall(regex, data)
                    lists = json.loads(result[0])
                    img_url = lists[0]
                    print(img_url)
                    break
                except Exception as msg:
                    print("Error:", msg)
            self.__download_image(img_url, title, index)
            print("Image %d downloaded" % index)
            index += 1
        print("All images downloaded")

def main():
    # Load the volume list scraped earlier
    read_file = open("./image_url/one_piece_data.json", "r", encoding="utf-8")
    one_piece_data = json.load(read_file)
    read_file.close()
    for element in one_piece_data:
        # Volume address suffix, page count, title
        href_name = element["postfix"]
        number = element["page"]
        name = element["title"]
        # Build the per-page URL template
        http_url = "http://kanbook.net" + href_name + "/{}"
        onepieceimgspider = OnePieceImageSpider()
        onepieceimgspider.set_url(http_url)
        print("%s download started!" % name)
        url_list = onepieceimgspider.get_url_list(number)
        onepieceimgspider.run(url_list, name)

if __name__ == '__main__':
    main()
```
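The var img_list extraction at the heart of run() can be exercised in isolation. The HTML fragment below is a hypothetical stand-in for a real kanbook.net page, which embeds the image list as a JS array literal:

```python
import json
import re

# Hypothetical page fragment; placeholder URLs, not real scraped data.
sample_html = 'var img_list=["https://example.com/img/1.jpg","https://example.com/img/2.jpg"];'

# Same pattern as in run(): capture the bracketed array literal
regex = r"var img_list=(\[.+])"
result = re.findall(regex, sample_html)
img_list = json.loads(result[0])  # the JS array literal is also valid JSON
```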
          

The saved format:
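An illustrative example of what image_url/one_piece_data.json might look like after a run; the title, postfix, and page values below are hypothetical, not real scraped data:

```json
[
  {
    "title": "ONE PIECE vol. 1",
    "postfix": "/328/3/1",
    "page": 52
  },
  {
    "title": "ONE PIECE vol. 2",
    "postfix": "/328/3/2",
    "page": 48
  }
]
```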
