## Part 4: Combining Concurrency and Parallelism
In this final section, we'll discuss how to leverage both concurrency and parallelism to optimize Python applications.
### 1. Concurrency vs. Parallelism: When to Use Which?
Concurrency (threads or `asyncio`) is suitable for I/O-bound tasks that spend most of their time waiting, such as network requests or disk reads. Parallelism (`multiprocessing`) is ideal for CPU-bound tasks that can be split into independent units of work, because separate processes sidestep the GIL and actually run on multiple cores. Many real workloads contain both kinds of work, and combining the two approaches often gives the best performance.
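
To make the distinction concrete, here is a minimal sketch (the `fetch` and `crunch` functions and the example URLs are made up for illustration): the I/O-bound coroutines overlap their waiting on a single thread, while the CPU-bound function only gets faster when separate processes run it on separate cores.

```python
import asyncio
import math
import multiprocessing


async def fetch(url):
    # I/O-bound: this task mostly waits, so asyncio can overlap many of
    # them on a single thread.
    await asyncio.sleep(1)  # stand-in for a real network call
    return url


def crunch(n):
    # CPU-bound: pure computation; threads won't help because of the GIL,
    # but extra processes on extra cores will.
    return sum(math.sqrt(i) for i in range(n))


async def fetch_all():
    urls = [f"https://example.com/{i}" for i in range(10)]
    return await asyncio.gather(*(fetch(u) for u in urls))


if __name__ == "__main__":
    pages = asyncio.run(fetch_all())          # concurrency for the waiting
    with multiprocessing.Pool() as pool:      # parallelism for the computing
        results = pool.map(crunch, [1_000_000] * 4)
    print(len(pages), results)
```

The ten simulated fetches finish in roughly one second total because they wait together, whereas the four `crunch` calls finish faster only because they run on different cores.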
### 2. Real-world Application: Web Scraping with Concurrency and Parallelism
As a practical example, consider web scraping. We can use asynchronous programming (`asyncio` with `aiohttp`) to issue many requests concurrently, and then use `multiprocessing` to process the scraped pages in parallel across CPU cores. This combination can significantly speed up the end-to-end scraping pipeline.
```python
import asyncio
import multiprocessing

import aiohttp


async def fetch_url(session, url):
    # Concurrency: each request awaits the network without blocking the others.
    async with session.get(url) as response:
        return await response.text()


def process_data(data):
    # Parallelism: CPU-bound processing of one scraped page, run in a worker
    # process. Placeholder implementation: just report the page size.
    return len(data)


async def main():
    urls = ["https://example.com/", "https://www.python.org/"]  # pages to scrape

    # Fetch all pages concurrently on the event loop.
    async with aiohttp.ClientSession() as session:
        tasks = [fetch_url(session, url) for url in urls]
        responses = await asyncio.gather(*tasks)

    # Process the scraped pages in parallel across CPU cores.
    with multiprocessing.Pool() as pool:
        processed_data = pool.map(process_data, responses)

    print("Processed data:", processed_data)


if __name__ == "__main__":
    # The guard matters here: multiprocessing may re-import this module in
    # its worker processes, and they must not re-run main().
    asyncio.run(main())
```
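
One design choice worth noting: in the version above, the event loop finishes before the process pool starts. If you would rather keep the loop running (for example, to schedule further downloads while earlier pages are being processed), `asyncio` can hand CPU-bound work to a process pool via `loop.run_in_executor`. The sketch below is one possible variation on the same example, reusing the hypothetical `process_data` placeholder.

```python
import asyncio
from concurrent.futures import ProcessPoolExecutor

import aiohttp


async def fetch_url(session, url):
    async with session.get(url) as response:
        return await response.text()


def process_data(data):
    return len(data)  # placeholder for real CPU-bound processing


async def main():
    urls = ["https://example.com/", "https://www.python.org/"]

    async with aiohttp.ClientSession() as session:
        responses = await asyncio.gather(*(fetch_url(session, u) for u in urls))

    # Run the CPU-bound step in worker processes without blocking the event loop.
    loop = asyncio.get_running_loop()
    with ProcessPoolExecutor() as pool:
        processed = await asyncio.gather(
            *(loop.run_in_executor(pool, process_data, r) for r in responses)
        )
    print("Processed data:", processed)


if __name__ == "__main__":
    asyncio.run(main())
```

Either structure works; the right choice depends on whether downloading and processing need to overlap in your application.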