Part 4: Combining Concurrency and Parallelism
In this final section, we'll discuss how to leverage both concurrency and parallelism to optimize Python applications.
1. Concurrency vs. Parallelism: When to Use Which?
Concurrency is suitable for I/O-bound tasks that involve waiting, such as network requests. Parallelism is ideal for CPU-bound tasks that can be divided into smaller units of work. In some cases, combining both approaches can provide the best performance.
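To make the distinction concrete, here is a minimal sketch contrasting the two approaches (the URL, the work sizes, and the helper names download and crunch are placeholders for illustration, not part of any particular API): a thread pool overlaps the waiting in I/O-bound downloads, while a process pool spreads CPU-bound computation across cores.

```python
import concurrent.futures
import urllib.request

URLS = ["https://example.com"] * 5  # placeholder I/O-bound work

def download(url):
    # I/O-bound: the thread spends most of its time waiting on the network.
    with urllib.request.urlopen(url) as resp:
        return len(resp.read())

def crunch(n):
    # CPU-bound: pure computation, which threads cannot parallelize
    # under the GIL, but separate processes can.
    return sum(i * i for i in range(n))

if __name__ == "__main__":
    # Threads suit I/O-bound tasks: the waits overlap.
    with concurrent.futures.ThreadPoolExecutor() as ex:
        sizes = list(ex.map(download, URLS))

    # Processes suit CPU-bound tasks: true parallelism across cores.
    with concurrent.futures.ProcessPoolExecutor() as ex:
        totals = list(ex.map(crunch, [10**6] * 4))
```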
2. Real-world Application: Web Scraping with Concurrency and Parallelism
As a practical example, let's consider web scraping. We can use asynchronous programming (asyncio) to handle multiple requests concurrently, and then use parallelism (multiprocessing) to process the scraped data in parallel. This combination can significantly speed up the entire scraping process.
```python
import asyncio
import multiprocessing

import aiohttp

async def fetch_url(session, url):
    # Concurrency: each coroutine yields while it waits on the network,
    # so many downloads overlap on a single thread.
    async with session.get(url) as response:
        return await response.text()

def process_data(data):
    # Placeholder for CPU-bound processing of one page; here we
    # simply return the document length.
    return len(data)

async def main():
    urls = ["url1", "url2", "url3"]  # placeholder URLs
    async with aiohttp.ClientSession() as session:
        tasks = [fetch_url(session, url) for url in urls]
        responses = await asyncio.gather(*tasks)
    # Parallelism: hand the downloaded pages to a process pool so the
    # CPU-bound processing runs on multiple cores.
    with multiprocessing.Pool() as pool:
        processed_data = pool.map(process_data, responses)
    print("Processed data:", processed_data)

if __name__ == "__main__":
    # The guard is required with multiprocessing on platforms that
    # spawn child processes (Windows, macOS).
    asyncio.run(main())
```
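Note one design choice in this sketch: the process pool only starts once every page has been downloaded. If you want processing to begin as pages arrive, one common variant is to submit each page to a concurrent.futures.ProcessPoolExecutor from inside the event loop via loop.run_in_executor, so downloads and processing interleave.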