Commit 5357f55

Author: Benjamin Moody
dl_files, dl_database: avoid using multiple processes
The standard multiprocessing module distributes a task across multiple processes, which is useful for heavy computation given the limitations of CPython. However, making this work depends on the ability to fork processes, or else on a clumsy emulation of forking on systems that don't support it. In particular, it tends to cause problems on Windows unless you are very scrupulous about how you write your program.

Therefore, as a rule, the multiprocessing module shouldn't be used by general-purpose libraries, and should only be invoked by application programmers themselves (who are in a position to guarantee that imports have no side effects, that the main script uses 'if __name__ == "__main__"', etc.).

However, downloading a file is an I/O-bound task, not a CPU-bound one, so for this purpose parallel threads should work as well as, or even better than, parallel processes. The multiprocessing.dummy module provides the same API as the multiprocessing module but uses threads instead of processes, so it should be safe to use in a general-purpose library.
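
As a rough illustration of the last point (not code from this commit), the sketch below calls multiprocessing.dummy.Pool in the same way one would call multiprocessing.Pool and records which process and thread handled each item. The names describe_worker and run_in_thread_pool are hypothetical helpers introduced only for this example.

```python
import multiprocessing.dummy
import os
import threading


def describe_worker(item):
    # Return the input together with the process ID and thread name
    # that handled it.
    return item, os.getpid(), threading.current_thread().name


def run_in_thread_pool(items, workers=2):
    # Same call shape as multiprocessing.Pool, but backed by threads.
    pool = multiprocessing.dummy.Pool(processes=workers)
    try:
        # map() returns results in the same order as the inputs.
        return pool.map(describe_worker, items)
    finally:
        pool.close()
        pool.join()


results = run_in_thread_pool(["a", "b", "c", "d"])
```

Every result should carry the caller's own process ID, since no child process is started. That is also why this pattern does not need an 'if __name__ == "__main__"' guard in the calling script.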
1 parent 84cbefb commit 5357f55

2 files changed: 4 additions, 4 deletions

wfdb/io/download.py

Lines changed: 2 additions & 2 deletions
@@ -1,5 +1,5 @@
 import json
-import multiprocessing
+import multiprocessing.dummy
 import os
 import posixpath

@@ -566,7 +566,7 @@ def dl_files(db, dl_dir, files, keep_subdirs=True, overwrite=False):
     print("Downloading files...")
     # Create multiple processes to download files.
     # Limit to 2 connections to avoid overloading the server
-    pool = multiprocessing.Pool(processes=2)
+    pool = multiprocessing.dummy.Pool(processes=2)
     pool.map(dl_pn_file, dl_inputs)
     print("Finished downloading files")

wfdb/io/record.py

Lines changed: 2 additions & 2 deletions
@@ -1,5 +1,5 @@
 import datetime
-import multiprocessing
+import multiprocessing.dummy
 import posixpath
 import re

@@ -3090,7 +3090,7 @@ def dl_database(
     print("Downloading files...")
     # Create multiple processes to download files.
     # Limit to 2 connections to avoid overloading the server
-    pool = multiprocessing.Pool(processes=2)
+    pool = multiprocessing.dummy.Pool(processes=2)
     pool.map(download.dl_pn_file, dl_inputs)
     print("Finished downloading files")
