Methods for improving crawler efficiency
2022-04-23 18:02:00 【Round programmer】
Coroutine: a function defined with the async keyword is a special function. Calling it does not execute its body immediately; the call returns a coroutine object instead.
Task object: a task object is a higher-level coroutine object (a further encapsulation of it), so it still represents a call to the special function. A task object must be registered with the event loop object. A callback can be bound to a task object; in a crawler, the callback is where the data is parsed.
Event loop: a container that holds task objects. When the event loop object is started, the task objects stored in it are executed asynchronously.
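The three definitions above can be sketched with the standard library alone. This is a minimal illustration, not the article's crawler: fetch and on_done are hypothetical names, and asyncio.sleep stands in for a network request.

```python
import asyncio

async def fetch(name):
    # Hypothetical stand-in for a network request
    await asyncio.sleep(0.1)
    return f"page for {name}"

results = []

def on_done(task):
    # A callback receives the finished task; result() is the coroutine's return value
    results.append(task.result())

coro = fetch("demo")            # calling the special function runs nothing yet
print(type(coro).__name__)      # prints: coroutine

loop = asyncio.new_event_loop()
task = loop.create_task(coro)   # wrap the coroutine in a task registered with the loop
task.add_done_callback(on_done)
loop.run_until_complete(task)   # starting the loop actually executes the task
loop.close()
print(results)
```

The body of fetch only runs once the event loop is started, which is why the callback, not the call site, is the place to handle the result.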
First, a Flask service to test against
from flask import Flask
import time

app = Flask(__name__)

# Each route sleeps for 2 seconds to simulate a slow response.
@app.route('/ Zhang San ')
def index_bobo():
    time.sleep(2)
    return 'hello Zhang San !'

@app.route('/ Li Si ')
def index_jay():
    time.sleep(2)
    return 'hello Li Si !'

@app.route('/ Wang Wu ')
def index_tom():
    time.sleep(2)
    return 'hello Wang Wu !'

if __name__ == '__main__':
    # threaded=True lets the development server handle requests concurrently.
    app.run(threaded=True)
One, aiohttp module + single-threaded multitask asynchronous coroutines
import asyncio
import time

import aiohttp

start = time.time()

async def get_page(url):
    # A blocking call would stall the event loop, so requests is not used here:
    # page_text = requests.get(url=url).text
    # print(page_text)
    # return page_text
    async with aiohttp.ClientSession() as s:  # Create a session object
        async with s.get(url) as response:
            page_text = await response.text()
            print(page_text)
            return page_text

urls = [
    'http://127.0.0.1:5000/ Zhang San ',
    'http://127.0.0.1:5000/ Li Si ',
    'http://127.0.0.1:5000/ Wang Wu ',
]

# Create the event loop first so the tasks can be registered with it.
# (asyncio.get_event_loop() without a running loop is deprecated in recent Python versions.)
loop = asyncio.new_event_loop()
tasks = []
for url in urls:
    c = get_page(url)           # Coroutine object; nothing has run yet
    task = loop.create_task(c)  # Wrap the coroutine in a task object
    tasks.append(task)

loop.run_until_complete(asyncio.wait(tasks))
loop.close()

end = time.time()
print(end - start)
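Why a blocking call such as requests.get does not belong inside a coroutine can be shown without any network access. The sketch below is an assumption-laden illustration: blocking_fetch, async_fetch, and timed are hypothetical helpers, with time.sleep standing in for a blocking request and asyncio.sleep for a non-blocking one.

```python
import asyncio
import time

async def blocking_fetch(delay):
    time.sleep(delay)            # blocks the whole thread; no other task can run
    return delay

async def async_fetch(delay):
    await asyncio.sleep(delay)   # hands control back to the event loop while waiting
    return delay

async def timed(worker, n=3, delay=0.2):
    # Run n copies of the worker together and measure the wall-clock time
    start = time.perf_counter()
    await asyncio.gather(*(worker(delay) for _ in range(n)))
    return time.perf_counter() - start

blocking_elapsed = asyncio.run(timed(blocking_fetch))
async_elapsed = asyncio.run(timed(async_fetch))
print(f"blocking: {blocking_elapsed:.2f}s, non-blocking: {async_elapsed:.2f}s")
```

With the blocking version the three waits add up, while the non-blocking version overlaps them, which is the effect the aiohttp-based crawler relies on.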
Two, aiohttp module: single thread + multitask asynchronous coroutines, with a callback for parsing
import asyncio
import time

import aiohttp
from lxml import etree

start = time.time()

# Special function: sends the request and captures the data.
# Note the async, with, and await keywords.
async def get_request(url):
    async with aiohttp.ClientSession() as s:
        async with s.get(url) as response:
            page_text = await response.text()
            return page_text  # Return the page source

# Callback function: parses the data.
def parse(task):
    page_text = task.result()  # Return value of get_request
    tree = etree.HTML(page_text)
    msg = "".join(tree.xpath('//text()'))
    print(msg)

urls = [
    'http://127.0.0.1:5000/ Zhang San ',
    'http://127.0.0.1:5000/ Li Si ',
    'http://127.0.0.1:5000/ Wang Wu ',
]

# Create the event loop first so the tasks can be registered with it.
loop = asyncio.new_event_loop()
tasks = []
for url in urls:
    c = get_request(url)
    task = loop.create_task(c)
    task.add_done_callback(parse)  # Bind the callback function
    tasks.append(task)

loop.run_until_complete(asyncio.wait(tasks))
loop.close()

end = time.time()
print(end - start)
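Callbacks are one way to get at each page once its request finishes. As an alternative, and not the approach used in this article, asyncio.gather can collect the return values in the same order as the input URLs and parsing can happen afterwards. In this sketch fake_request and the regex-based parse are hypothetical stand-ins for the aiohttp request and the lxml parsing.

```python
import asyncio
import re

async def fake_request(url):
    # Hypothetical stand-in for the aiohttp request
    await asyncio.sleep(0.1)
    return f"<html><body>hello from {url}</body></html>"

def parse(page_text):
    # Simplified parsing step: strip the tags and keep the text
    return re.sub(r"<[^>]+>", "", page_text)

async def main(urls):
    # gather runs the coroutines concurrently and returns results in input order
    pages = await asyncio.gather(*(fake_request(u) for u in urls))
    return [parse(p) for p in pages]

messages = asyncio.run(main(["/a", "/b", "/c"]))
print(messages)
```

Callbacks handle each page as soon as it arrives; gather waits for all pages but keeps them in a predictable order.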
Three, requests module + thread pool
import time
import requests
from multiprocessing.dummy import Pool  # Thread-based pool with the multiprocessing API

start = time.time()

urls = [
    'http://127.0.0.1:5000/ Zhang San ',
    'http://127.0.0.1:5000/ Li Si ',
    'http://127.0.0.1:5000/ Wang Wu ',
]

def get_request(url):
    page_text = requests.get(url=url).text
    print(page_text)
    return page_text

pool = Pool(3)  # Three worker threads, one per URL
pool.map(get_request, urls)  # Blocks until every URL has been fetched
pool.close()
pool.join()

end = time.time()
print('Total time:', end - start)
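The same pattern can be written with concurrent.futures from the standard library. This is a sketch under stated assumptions rather than the article's code: fake_request is a hypothetical function in which time.sleep replaces requests.get, so the example runs without the Flask service.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def fake_request(url):
    time.sleep(0.2)  # hypothetical stand-in for requests.get(url).text
    return f"hello from {url}"

urls = ["/a", "/b", "/c"]

start = time.perf_counter()
with ThreadPoolExecutor(max_workers=3) as executor:
    # map runs the calls in worker threads and yields results in input order
    pages = list(executor.map(fake_request, urls))
elapsed = time.perf_counter() - start

print(pages)
print(f"Total time: {elapsed:.2f}s")
```

Because each blocking call occupies its own thread, the waits overlap even though the request function itself is ordinary synchronous code.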
Copyright notice
This article was written by [Round programmer]. Please include the original link when reposting. Thank you.
https://yzsam.com/2022/04/202204230545315261.html