In today's world of computing, it's essential to be able to execute multiple tasks concurrently to achieve good performance. One way to achieve this in Python is by using a thread pool executor. In this article, we'll explore what a thread pool executor is, how it works, and how to use it in Python to run many tasks concurrently — especially I/O-bound tasks such as network requests, where threads shine despite Python's global interpreter lock.

What is a thread pool executor?

A thread pool executor is a way to manage a group of threads, which can be used to execute multiple tasks simultaneously. The basic idea behind a thread pool is to have a group of threads available for use, instead of creating a new thread for each task. The threads in the pool can be reused to execute multiple tasks, which is more efficient than creating a new thread for each task.

A thread pool executor provides an easy-to-use interface for managing a pool of threads. It abstracts away the complexities of managing threads, so you can focus on writing your code.

How does a thread pool executor work?

A thread pool executor works by creating a pool of worker threads. When a task is submitted to the thread pool executor, it assigns the task to one of the worker threads in the pool. If all the worker threads are busy, the task is placed in a queue, and it waits for a worker thread to become available.

When a worker thread completes a task, it becomes available to take on another task. The thread pool executor assigns the next task in the queue to the available worker thread.
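The reuse described above is easy to observe directly. The following minimal sketch (a made-up task that just sleeps and reports its thread name, not part of the download example) submits six tasks to a pool of two workers; all six end up handled by at most two distinct threads:

```python
import concurrent.futures
import threading
import time

def task(n):
    time.sleep(0.1)  # simulate a little work
    return threading.current_thread().name

with concurrent.futures.ThreadPoolExecutor(max_workers=2) as executor:
    names = list(executor.map(task, range(6)))

# All six tasks were run by at most two worker threads
print(len(set(names)))
```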

How to use a thread pool executor in Python

Python provides a concurrent.futures module that contains the ThreadPoolExecutor class, which can be used to create and manage a thread pool executor. Here's an example of how to use a thread pool executor to download multiple files simultaneously:

import concurrent.futures
import requests

# URLs to download
urls = [
    'https://example.com/file1',
    'https://example.com/file2',
    'https://example.com/file3',
    'https://example.com/file4',
    'https://example.com/file5'
]

# Function to download a file
def download_file(url):
    response = requests.get(url)
    filename = url.split('/')[-1]
    with open(filename, 'wb') as f:
        f.write(response.content)
    return filename

# Create a thread pool executor with 5 worker threads
with concurrent.futures.ThreadPoolExecutor(max_workers=5) as executor:
    result = executor.map(download_file, urls)

In this example, we first define a list of URLs to download. We then define a function, download_file, that takes a URL and downloads the file at that URL.

Next, we create a ThreadPoolExecutor with a maximum of 5 worker threads. We then submit the download_file function to the thread pool executor for each URL in the urls list using executor.map. The map function submits each value in the list to the function and returns an iterator over the results, which we can consume directly or convert to a list with list(result). It yields the filename returned by each call, in the same order as the input URLs:

['file1', 'file2', 'file3', 'file4', 'file5']
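To see that output without hitting the network, here is a self-contained sketch of the same pattern, with a hypothetical fake_download function standing in for the real download:

```python
import concurrent.futures

# Hypothetical stand-in for download_file, so the example runs offline
def fake_download(url):
    return url.split('/')[-1]

urls = [f'https://example.com/file{i}' for i in range(1, 6)]

with concurrent.futures.ThreadPoolExecutor(max_workers=5) as executor:
    # executor.map returns a lazy iterator; list() collects the results
    filenames = list(executor.map(fake_download, urls))

print(filenames)  # ['file1', 'file2', 'file3', 'file4', 'file5']
```

Note that map preserves the input order, even if the tasks finish in a different order.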

Progress of tasks

We can also track the progress of the tasks as they execute using tqdm or a similar library. You can install tqdm with pip: pip install tqdm.

from tqdm import tqdm
# Shows completed/total task counts and the average time per task
with concurrent.futures.ThreadPoolExecutor(max_workers=5) as executor:
    resp = list(tqdm(executor.map(download_file, urls), total=len(urls)))

Get each task result

We can also submit each URL individually and then handle each result as soon as it is ready. Check this example for an implementation:

# Create a thread pool executor with 5 worker threads
with concurrent.futures.ThreadPoolExecutor(max_workers=5) as executor:
    # Submit the download_file function to the thread pool executor
    # for each URL in the urls list
    futures = [executor.submit(download_file, url) for url in urls]

    # Wait for all tasks to complete
    for future in concurrent.futures.as_completed(futures):
        filename = future.result()
        print(f'{filename} downloaded')

The submit function returns a Future object, which represents the result of the function call.
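A Future can be queried and waited on directly. As a minimal sketch (using a made-up square function rather than the download example), result() blocks until the task finishes and returns its value, after which done() reports True:

```python
import concurrent.futures
import time

def square(x):
    time.sleep(0.05)  # simulate a little work
    return x * x

with concurrent.futures.ThreadPoolExecutor(max_workers=2) as executor:
    future = executor.submit(square, 4)
    # result() blocks until the task completes, then returns its value
    value = future.result()

print(value)          # 16
print(future.done())  # True
```

If the task raised an exception, result() would re-raise it in the calling thread, which is a convenient place to handle per-task errors.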

We then process the tasks as they finish by calling concurrent.futures.as_completed on the list of Future objects. This function returns an iterator that yields each Future as soon as it completes, regardless of submission order. We print a message indicating that each file has been downloaded.

Conclusion
In conclusion, a thread pool executor is an essential tool for executing multiple tasks concurrently in Python. It simplifies thread management and provides an easy-to-use interface for managing a pool of worker threads. For more details, check the official Python documentation:

https://docs.python.org/3/library/concurrent.futures.html