Answers:
In Python 3.2+, the stdlib concurrent.futures module provides a higher-level API to threading, including passing return values or exceptions from a worker thread back to the main thread:

import concurrent.futures

def foo(bar):
    print('hello {}'.format(bar))
    return 'foo'

with concurrent.futures.ThreadPoolExecutor() as executor:
    future = executor.submit(foo, 'world!')
    return_value = future.result()
    print(return_value)
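The exception half of that claim is easy to overlook. A minimal sketch, assuming a hypothetical worker named risky that raises on negative input: future.result() re-raises the worker's exception in the calling thread, while future.exception() hands it back without raising.

```python
import concurrent.futures

def risky(value):
    # Hypothetical worker, used only for illustration.
    if value < 0:
        raise ValueError('negative input: {}'.format(value))
    return value * 2

with concurrent.futures.ThreadPoolExecutor(max_workers=2) as executor:
    ok_future = executor.submit(risky, 21)
    bad_future = executor.submit(risky, -1)

    ok_value = ok_future.result()       # the worker's return value

    caught = None
    try:
        bad_future.result()             # re-raises the worker's exception here
    except ValueError as exc:
        caught = exc

    stored = bad_future.exception()     # returns the exception instead of raising it
```

Both calls block until the worker has finished, so no separate join() is needed.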
FWIW, the multiprocessing module has a nice interface for this using the Pool class. And if you want to stick with threads rather than processes, you can just use the multiprocessing.pool.ThreadPool class as a drop-in replacement.

def foo(bar, baz):
    print('hello {0}'.format(bar))
    return 'foo' + baz

from multiprocessing.pool import ThreadPool
pool = ThreadPool(processes=1)

async_result = pool.apply_async(foo, ('world', 'foo'))  # tuple of args for foo

# do some other stuff in the main process

return_val = async_result.get()  # get the return value from your function.
… multiprocess, but they have nothing to do with processes.
Don't forget to set processes=1 to more than one if you have more threads.
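To illustrate the point about processes, here is a small sketch with four worker threads. The worker square is a made-up example; ThreadPool, its processes parameter, and map() are from the standard library.

```python
from multiprocessing.pool import ThreadPool

def square(n):
    # Hypothetical worker, used only for illustration.
    return n * n

# Despite the parameter name, processes=4 means four worker *threads* here.
with ThreadPool(processes=4) as pool:
    # map() blocks until every call has finished and keeps the input order.
    squares = pool.map(square, range(6))
```

Using the pool as a context manager requires Python 3.3 or later; on older versions call pool.close() and pool.join() by hand.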
One way I've seen is to pass a mutable object, such as a list or a dictionary, to the thread's constructor, along with an index or other identifier of some sort. The thread can then store its results in its dedicated slot in that object. For example:

def foo(bar, result, index):
    print('hello {0}'.format(bar))
    result[index] = "foo"

from threading import Thread

threads = [None] * 10
results = [None] * 10

for i in range(len(threads)):
    threads[i] = Thread(target=foo, args=('world!', results, i))
    threads[i].start()

# do some other stuff

for i in range(len(threads)):
    threads[i].join()

print(" ".join(results))  # what sound does a metasyntactic locomotive make?
If you really want join() to return the return value of the called function, you can do this with a Thread subclass like the following:

# Python 2 only: relies on the name-mangled private attributes of Thread
from threading import Thread

def foo(bar):
    print 'hello {0}'.format(bar)
    return "foo"

class ThreadWithReturnValue(Thread):
    def __init__(self, group=None, target=None, name=None,
                 args=(), kwargs={}, Verbose=None):
        Thread.__init__(self, group, target, name, args, kwargs, Verbose)
        self._return = None

    def run(self):
        if self._Thread__target is not None:
            self._return = self._Thread__target(*self._Thread__args,
                                                **self._Thread__kwargs)

    def join(self):
        Thread.join(self)
        return self._return

twrv = ThreadWithReturnValue(target=foo, args=('world!',))

twrv.start()
print twrv.join()   # prints foo
This gets a little hairy because of some name mangling, and it accesses "private" data structures that are specific to the Thread implementation... but it works.
For python3:

class ThreadWithReturnValue(Thread):
    def __init__(self, group=None, target=None, name=None,
                 args=(), kwargs={}, Verbose=None):
        Thread.__init__(self, group, target, name, args, kwargs)
        self._return = None

    def run(self):
        print(type(self._target))
        if self._target is not None:
            self._return = self._target(*self._args,
                                        **self._kwargs)

    def join(self, *args):
        Thread.join(self, *args)
        return self._return
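… threading, rather than a different library to try, plus the pool size limit introduces an additional potential problem, which happened in my case.
TypeError: __init__() takes from 1 to 6 positional arguments but 7 were given. Is there any way to fix that?
A warning to anyone doing the _Thread__target thing: you will make anyone trying to port your code to Python 3 hate you until they work out what you've done (because it uses undocumented features that changed between 2 and 3). Document your code well.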
Jake's answer is good, but if you don't want to use a threadpool (you don't know how many threads you'll need, but create them as needed), then a good way to transmit information between threads is the built-in Queue.Queue class, as it offers thread safety.
I created the following decorator to make it act in a similar fashion to the threadpool:

import threading
import Queue  # Python 2 module name; it is `queue` in Python 3

def threaded(f, daemon=False):

    def wrapped_f(q, *args, **kwargs):
        '''this function calls the decorated function and puts the
        result in a queue'''
        ret = f(*args, **kwargs)
        q.put(ret)

    def wrap(*args, **kwargs):
        '''this is the function returned from the decorator. It fires off
        wrapped_f in a new thread and returns the thread object with
        the result queue attached'''
        q = Queue.Queue()

        t = threading.Thread(target=wrapped_f, args=(q,)+args, kwargs=kwargs)
        t.daemon = daemon
        t.start()
        t.result_queue = q
        return t

    return wrap
Then, you just use it as:

@threaded
def long_task(x):
    import time
    x = x + 5
    time.sleep(5)
    return x

# does not block, returns Thread object
y = long_task(10)
print(y)

# this blocks, waiting for the result
result = y.result_queue.get()
print(result)
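The decorated function creates a new thread each time it's called and returns a Thread object that contains the queue that will receive the result.
UPDATE
It's been quite a while since I posted this answer, but it still gets views, so I thought I would update it to reflect the way I do this in newer versions of Python:
Python 3.2 added the concurrent.futures module, which provides a high-level interface for parallel tasks. It provides ThreadPoolExecutor and ProcessPoolExecutor, so you can use a thread or process pool with the same api.
One benefit of this api is that submitting a task to an Executor returns a Future object, which will complete with the return value of the callable you submit.
This makes attaching a queue object unnecessary, which simplifies the decorator quite a bit: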
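from concurrent.futures import ThreadPoolExecutor
from functools import wraps

_DEFAULT_POOL = ThreadPoolExecutor()

def threadpool(f, executor=None):
    @wraps(f)
    def wrap(*args, **kwargs):
        return (executor or _DEFAULT_POOL).submit(f, *args, **kwargs)

    return wrap

This will use a default module-level threadpool executor if one is not passed in.
The usage is very similar to before: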
@threadpool
def long_task(x):
    import time
    x = x + 5
    time.sleep(5)
    return x

# does not block, returns Future object
y = long_task(10)
print(y)

# this blocks, waiting for the result
result = y.result()
print(result)
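If you're using Python 3.4+, one really nice feature of using this method (and Future objects in general) is that the returned future can be wrapped to turn it into an asyncio.Future with asyncio.wrap_future. This makes it work easily with coroutines:

result = await asyncio.wrap_future(long_task(10))

If you don't need access to the underlying concurrent.futures.Future object, you can include the wrap in the decorator: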
_DEFAULT_POOL = ThreadPoolExecutor()

def threadpool(f, executor=None):
    @wraps(f)
    def wrap(*args, **kwargs):
        return asyncio.wrap_future((executor or _DEFAULT_POOL).submit(f, *args, **kwargs))

    return wrap
Then, whenever you need to push cpu-intensive or blocking code off the event loop thread, you can put it in a decorated function:

@threadpool
def some_long_calculation():
    ...

# this will suspend while the function is executed on a threadpool
result = await some_long_calculation()
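The await lines in this answer only work inside a coroutine. Here is a self-contained sketch of the same idea, assuming Python 3.7+ for asyncio.run; the names _POOL, blocking_add, and main are illustrative and not part of the original answer.

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor
from functools import wraps

_POOL = ThreadPoolExecutor(max_workers=2)   # illustrative module-level pool

def threadpool(f):
    @wraps(f)
    def wrap(*args, **kwargs):
        # wrap_future needs an event loop, so call the decorated function
        # from inside a running coroutine.
        return asyncio.wrap_future(_POOL.submit(f, *args, **kwargs))
    return wrap

@threadpool
def blocking_add(x, y):
    time.sleep(0.05)        # stands in for blocking or CPU-heavy work
    return x + y

async def main():
    # Both calls run on pool threads; the event loop stays free while waiting.
    return await asyncio.gather(blocking_add(1, 2), blocking_add(10, 20))

results = asyncio.run(main())
_POOL.shutdown()
```

asyncio.gather returns the results in the order the awaitables were passed, regardless of which thread finishes first.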
AttributeError: 'module' object has no attribute 'Lock'. This seems to be emanating from the line y = long_task(10)... thoughts?
Another solution that doesn't require changing your existing code:
import Queue  # Python 2 module name; it is `queue` in Python 3
from threading import Thread

def foo(bar):
    print('hello {0}'.format(bar))
    return 'foo'

que = Queue.Queue()

t = Thread(target=lambda q, arg1: q.put(foo(arg1)), args=(que, 'world!'))
t.start()
t.join()
result = que.get()
print(result)
It can also be easily adjusted to a multi-threaded environment:

import Queue  # Python 2 module name; it is `queue` in Python 3
from threading import Thread

def foo(bar):
    print('hello {0}'.format(bar))
    return 'foo'

que = Queue.Queue()
threads_list = list()

t = Thread(target=lambda q, arg1: q.put(foo(arg1)), args=(que, 'world!'))
t.start()
threads_list.append(t)

# Add more threads here
...
threads_list.append(t2)
...
threads_list.append(t3)
...

# Join all the threads
for t in threads_list:
    t.join()

# Check thread's return value
while not que.empty():
    result = que.get()
    print(result)
On Python 3, use from queue import Queue.
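For reference, a minimal Python 3 sketch of the same lambda-plus-queue pattern, using the lowercase queue module. The inputs in names and the modified return value of foo are made up so that the results can be told apart.

```python
import queue
from threading import Thread

def foo(bar):
    return 'foo ' + bar

que = queue.Queue()
names = ['a', 'b', 'c']          # illustrative inputs

threads = [
    Thread(target=lambda q, arg1: q.put(foo(arg1)), args=(que, name))
    for name in names
]
for t in threads:
    t.start()
for t in threads:
    t.join()

# One get() per thread; arrival order is not guaranteed, so sort for a stable view.
results = sorted(que.get() for _ in threads)
```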
Parris/kindall's join/return answer, ported to Python 3:
from threading import Thread

def foo(bar):
    print('hello {0}'.format(bar))
    return "foo"

class ThreadWithReturnValue(Thread):
    def __init__(self, group=None, target=None, name=None, args=(), kwargs=None, *, daemon=None):
        Thread.__init__(self, group, target, name, args, kwargs, daemon=daemon)

        self._return = None

    def run(self):
        if self._target is not None:
            self._return = self._target(*self._args, **self._kwargs)

    def join(self):
        Thread.join(self)
        return self._return

twrv = ThreadWithReturnValue(target=foo, args=('world!',))

twrv.start()
print(twrv.join())   # prints foo
Note that the Thread class is implemented differently in Python 3.
I stole kindall's answer and cleaned it up just a little bit.
The key part is adding *args and **kwargs to join() in order to handle the timeout.

# Python 2 only: relies on the name-mangled private attributes of Thread
from threading import Thread

class threadWithReturn(Thread):
    def __init__(self, *args, **kwargs):
        super(threadWithReturn, self).__init__(*args, **kwargs)

        self._return = None

    def run(self):
        if self._Thread__target is not None:
            self._return = self._Thread__target(*self._Thread__args, **self._Thread__kwargs)

    def join(self, *args, **kwargs):
        super(threadWithReturn, self).join(*args, **kwargs)

        return self._return
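UPDATED ANSWER BELOW
This is my most popularly upvoted answer, so I decided to update it with code that will run on both py2 and py3.
Additionally, I see many answers to this question that show a lack of comprehension regarding Thread.join(). Some completely fail to handle the timeout arg. But there is also a corner case to be aware of when (1) you have a target function that can return None and (2) you also pass the timeout arg to join(). Please see "TEST 4" to understand this corner case.
ThreadWithReturn class that works with py2 and py3: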
import sys
from threading import Thread
from builtins import super    # https://stackoverflow.com/a/30159479

if sys.version_info >= (3, 0):
    _thread_target_key = '_target'
    _thread_args_key = '_args'
    _thread_kwargs_key = '_kwargs'
else:
    _thread_target_key = '_Thread__target'
    _thread_args_key = '_Thread__args'
    _thread_kwargs_key = '_Thread__kwargs'

class ThreadWithReturn(Thread):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self._return = None

    def run(self):
        target = getattr(self, _thread_target_key)
        if target is not None:
            self._return = target(
                *getattr(self, _thread_args_key),
                **getattr(self, _thread_kwargs_key)
            )

    def join(self, *args, **kwargs):
        super().join(*args, **kwargs)
        return self._return
Some sample tests are shown below:

import time, random

# TEST TARGET FUNCTION
def giveMe(arg, seconds=None):
    if seconds is not None:
        time.sleep(seconds)
    return arg

# TEST 1
my_thread = ThreadWithReturn(target=giveMe, args=('stringy',))
my_thread.start()
returned = my_thread.join()
# (returned == 'stringy')

# TEST 2
my_thread = ThreadWithReturn(target=giveMe, args=(None,))
my_thread.start()
returned = my_thread.join()
# (returned is None)

# TEST 3
my_thread = ThreadWithReturn(target=giveMe, args=('stringy',), kwargs={'seconds': 5})
my_thread.start()
returned = my_thread.join(timeout=2)
# (returned is None) # because join() timed out before giveMe() finished

# TEST 4
my_thread = ThreadWithReturn(target=giveMe, args=(None,), kwargs={'seconds': 5})
my_thread.start()
returned = my_thread.join(timeout=random.randint(1, 10))
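Can you identify the corner case that we may possibly encounter with TEST 4?
The problem is that we expect giveMe() to return None (see TEST 2), but we also expect join() to return None if it times out.
returned is None means either:
(1) that's what giveMe() returned, or
(2) join() timed out
This example is trivial since we know that giveMe() will always return None. But in a real-world case (where the target may legitimately return None or something else) we'd want to explicitly check what happened.
Below is how to address this corner case: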
# TEST 4
my_thread = ThreadWithReturn(target=giveMe, args=(None,), kwargs={'seconds': 5})
my_thread.start()
returned = my_thread.join(timeout=random.randint(1, 10))

if my_thread.is_alive():  # isAlive() was removed in Python 3.9
    # returned is None because join() timed out
    # this also means that giveMe() is still running in the background
    pass
    # handle this based on your app's logic
else:
    # join() is finished, and so is giveMe()
    # BUT we could also be in a race condition, so we need to update returned, just in case
    returned = my_thread.join()
… target, args, and kwargs arguments initialized as member variables of your class.
Using Queue:
import threading, queue

def calc_square(num, out_queue1):
    l = []
    for x in num:
        l.append(x*x)
    out_queue1.put(l)

arr = [1,2,3,4,5,6,7,8,9,10]
out_queue1 = queue.Queue()
t1 = threading.Thread(target=calc_square, args=(arr, out_queue1))
t1.start()
t1.join()
print(out_queue1.get())
… out_queue1, you will need to loop over out_queue1.get() and catch the Queue.Empty exception: ret = [] ; try: ; while True; ret.append(out_queue1.get(block=False)) ; except Queue.Empty: ; pass. Semicolons simulate line breaks.
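Written out with real line breaks, the loop from that comment might look like the sketch below. It uses the Python 3 spelling queue.Empty, and a made-up worker put_squares that puts each result on the queue separately so that there is more than one item to drain.

```python
import queue
import threading

def put_squares(nums, out_queue):
    # Each result is put separately, so the queue ends up holding several items.
    for x in nums:
        out_queue.put(x * x)

out_queue = queue.Queue()
t = threading.Thread(target=put_squares, args=([1, 2, 3, 4], out_queue))
t.start()
t.join()                      # after join(), no more items will arrive

ret = []
try:
    while True:
        ret.append(out_queue.get(block=False))
except queue.Empty:           # raised once the queue has been drained
    pass
```

Draining this way is only safe after every producer thread has been joined; otherwise an empty queue may just mean a producer has not finished yet.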
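My solution to the problem is to wrap the function and thread in a class. It does not require using pools, queues, or c-type variable passing. It is also non-blocking: you check the status instead. See the example of how to use it at the end of the code.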
import threading

class ThreadWorker():
    '''
    The basic idea is given a function create an object.
    The object can then run the function in a thread.
    It provides a wrapper to start it, check its status, and get data out of the function.
    '''
    def __init__(self, func):
        self.thread = None
        self.data = None
        self.func = self.save_data(func)

    def save_data(self, func):
        '''modify function to save its returned data'''
        def new_func(*args, **kwargs):
            self.data = func(*args, **kwargs)

        return new_func

    def start(self, params):
        self.data = None
        if self.thread is not None:
            if self.thread.is_alive():  # isAlive() was removed in Python 3.9
                return 'running'  # could raise exception here

        # unless thread exists and is alive start or restart it
        self.thread = threading.Thread(target=self.func, args=params)
        self.thread.start()
        return 'started'

    def status(self):
        if self.thread is None:
            return 'not_started'
        else:
            if self.thread.is_alive():
                return 'running'
            else:
                return 'finished'

    def get_results(self):
        if self.thread is None:
            return 'not_started'  # could return exception
        else:
            if self.thread.is_alive():
                return 'running'
            else:
                return self.data

def add(x, y):
    return x + y

add_worker = ThreadWorker(add)
print(add_worker.start((1, 2,)))
print(add_worker.status())
print(add_worker.get_results())
Taking into consideration @iman's comment on @JakeBiesinger's answer, I have recomposed it to have a varying number of threads:

from multiprocessing.pool import ThreadPool

def foo(bar, baz):
    print('hello {0}'.format(bar))
    return 'foo' + baz

numOfThreads = 3
results = []

pool = ThreadPool(numOfThreads)

for i in range(0, numOfThreads):
    results.append(pool.apply_async(foo, ('world', 'foo')))  # tuple of args for foo

# do some other stuff in the main process
# ...
# ...

results = [r.get() for r in results]
print(results)

pool.close()
pool.join()
Cheers,
Guy.
You can define a mutable above the scope of the threaded function, and add the result to that. (I also modified the code to be python3 compatible.)
returns = {}

def foo(bar):
    print('hello {0}'.format(bar))
    returns[bar] = 'foo'

from threading import Thread

t = Thread(target=foo, args=('world!',))
t.start()
t.join()
print(returns)
This returns {'world!': 'foo'}
If you use the function input as the key to your results dict, every unique input is guaranteed to give an entry in the results.
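A small sketch of what "every unique input" means in practice, with made-up inputs: repeated inputs share a key, so three threads can leave only two entries. It also assumes the inputs are hashable, since they are used as dict keys.

```python
from threading import Thread

returns = {}

def foo(bar):
    returns[bar] = 'foo ' + bar     # each thread writes under its own input

inputs = ['a', 'b', 'a']            # illustrative; note the repeated 'a'
threads = [Thread(target=foo, args=(name,)) for name in inputs]
for t in threads:
    t.start()
for t in threads:
    t.join()

entry_count = len(returns)          # two entries, not three
```

If repeated inputs need separate results, keying by an index or thread name instead of the input avoids the collision.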
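I'm using this wrapper, which comfortably turns any function into one that runs in a Thread, taking care of its return value or exception. It doesn't add Queue overhead.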
import sys
import threading

def threading_func(f):
    """Decorator for running a function in a thread and handling its return
    value or exception"""
    def start(*args, **kw):
        def run():
            try:
                th.ret = f(*args, **kw)
            except:
                th.exc = sys.exc_info()
        def get(timeout=None):
            th.join(timeout)
            if th.exc:
                raise th.exc[1]  # py3
                ## raise th.exc[0], th.exc[1], th.exc[2]  # py2
            return th.ret
        th = threading.Thread(None, run)
        th.exc = None
        th.get = get
        th.start()
        return th
    return start

def f(x):
    return 2.5 * x

th = threading_func(f)(4)
print("still running?:", th.is_alive())
print("result:", th.get(timeout=1.0))

@threading_func
def th_mul(a, b):
    return a * b

th = th_mul("text", 2.5)

try:
    print(th.get())
except TypeError:
    print("exception thrown ok.")
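Notes on the threading module
Comfortable return value and exception handling of a threaded function is a frequent "Pythonic" need and should indeed already be offered by the threading module, possibly directly in the standard Thread class. ThreadPool has way too much overhead for simple tasks: 3 managing threads, lots of bureaucracy. Unfortunately Thread's layout was originally copied from Java, which you can see e.g. from the still useless 1st (!) constructor parameter group.
Define your target to
1) take an argument q
2) replace any statements return foo with q.put(foo); return
so a function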
def func(a):
    ans = a * a
    return ans

would become

def func(a, q):
    ans = a * a
    q.put(ans)
    return

then you would proceed as such
# Python 2; on Python 3 use `from queue import Queue` and range()
from Queue import Queue
from threading import Thread

ans_q = Queue()
arg_tups = [(i, ans_q) for i in xrange(10)]

threads = [Thread(target=func, args=arg_tup) for arg_tup in arg_tups]
_ = [t.start() for t in threads]
_ = [t.join() for t in threads]
results = [ans_q.get() for _ in xrange(len(threads))]  # read from ans_q, the queue defined above
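And you can use function decorators/wrappers so that you can use your existing functions as target without modifying them, while still following this basic scheme.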
results = [ans_q.get() for _ in xrange(len(threads))]
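The decorator/wrapper idea mentioned in this answer is not spelled out there. One possible sketch, in Python 3, uses a hypothetical helper named puts_result that treats the last positional argument as the queue, so func keeps its plain return.

```python
from functools import wraps
from queue import Queue
from threading import Thread

def puts_result(f):
    """Hypothetical helper: returns a version of f that takes a trailing queue argument."""
    @wraps(f)
    def wrapper(*args):
        *real_args, q = args        # the last positional argument is the queue
        q.put(f(*real_args))
    return wrapper

def func(a):                        # unmodified: still uses a plain return
    ans = a * a
    return ans

ans_q = Queue()
threads = [Thread(target=puts_result(func), args=(i, ans_q)) for i in range(10)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Arrival order is not guaranteed, so sort for a stable view.
results = sorted(ans_q.get() for _ in threads)
```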
As mentioned, the multiprocessing pool is much slower than basic threading. Using queues as proposed in some answers here is a very effective alternative. I have used it with dictionaries in order to be able to run a lot of small threads and recuperate multiple answers by combining them into a dictionary:

#!/usr/bin/env python3

import threading
# use Queue for python2
import queue
import random

LETTERS = 'abcdefghijklmnopqrstuvwxyz'
LETTERS = [ x for x in LETTERS ]

NUMBERS = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

def randoms(k, q):
    result = dict()
    result['letter'] = random.choice(LETTERS)
    result['number'] = random.choice(NUMBERS)
    q.put({k: result})

threads = list()
q = queue.Queue()
results = dict()

for name in ('alpha', 'oscar', 'yankee',):
    threads.append( threading.Thread(target=randoms, args=(name, q)) )
    threads[-1].start()
_ = [ t.join() for t in threads ]
while not q.empty():
    results.update(q.get())

print(results)
GuySoft's idea is great, but I think the object does not necessarily have to inherit from Thread, and start() could be removed from the interface:

from threading import Thread
import queue

class ThreadWithReturnValue(object):
    def __init__(self, target=None, args=(), **kwargs):
        self._que = queue.Queue()
        self._t = Thread(target=lambda q, arg1, kwargs1: q.put(target(*arg1, **kwargs1)),
                         args=(self._que, args, kwargs), )
        self._t.start()

    def join(self):
        self._t.join()
        return self._que.get()

def foo(bar):
    print('hello {0}'.format(bar))
    return "foo"

twrv = ThreadWithReturnValue(target=foo, args=('world!',))

print(twrv.join())   # prints foo
One usual solution is to wrap your function foo with a decorator like

result = queue.Queue()

def task_wrapper(*args):
    result.put(target(*args))

Then the whole code may look like this
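import queue
import threading

# The enclosing function and its name are illustrative; the original snippet
# ends in `return result`, so it is assumed to live inside a function.
def run_all(target, args_list, max_num):
    result = queue.Queue()

    def task_wrapper(*args):
        result.put(target(*args))

    threads = [threading.Thread(target=task_wrapper, args=args) for args in args_list]
    for t in threads:
        t.start()
        # busy-wait until fewer than max_num threads (main thread included) are alive
        while True:
            if len(threading.enumerate()) < max_num:
                break
    for t in threads:
        t.join()

    return result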
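One important issue is that the return values may be unordered. (In fact, the return value is not necessarily saved to the queue, since you can choose any thread-safe data structure.)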
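If the order matters, one common workaround (a sketch, not part of the answer above) is to store each value together with the index of the call that produced it and sort afterwards. The names target, args_list, and ordered are illustrative.

```python
import queue
import threading

def target(x):
    # Hypothetical worker, used only for illustration.
    return x * 10

args_list = [(3,), (1,), (2,)]
result = queue.Queue()

def task_wrapper(index, *args):
    # Store the submission index next to the value so the order can be restored.
    result.put((index, target(*args)))

threads = [
    threading.Thread(target=task_wrapper, args=(i,) + args)
    for i, args in enumerate(args_list)
]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Indices are unique, so sorting the (index, value) pairs restores submission order.
ordered = [value for _, value in sorted(result.get() for _ in threads)]
```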
Why not just use a global variable?
import threading

class myThread(threading.Thread):
    def __init__(self, ind, lock):
        threading.Thread.__init__(self)
        self.ind = ind
        self.lock = lock

    def run(self):
        global results
        with self.lock:
            results.append(self.ind)

results = []
lock = threading.Lock()
threads = [myThread(x, lock) for x in range(1, 4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(results)
from threading import Thread

class ThreadWithReturnValue(Thread):
    def __init__(self, group=None, target=None, name=None,
                 args=(), kwargs={}, *, daemon=None):
        # daemon is keyword-only in Thread.__init__, so pass it by name
        Thread.__init__(self, group, target, name, args, kwargs, daemon=daemon)
        self._return = None

    def run(self):
        try:
            if self._target:
                self._return = self._target(*self._args, **self._kwargs)
        finally:
            del self._target, self._args, self._kwargs

    def join(self, timeout=None):
        Thread.join(self, timeout)
        return self._return
If only True or False is to be validated from a function's call, a simpler solution I find is updating a global list.

import threading

# Real booleans, so the `is False` check at the end can match
lists = {"A": True, "B": True}

def myfunc(name: str, mylist):
    for i in mylist:
        if i == 31:
            lists[name] = False
            return False
        else:
            print("name {} : {}".format(name, i))

t1 = threading.Thread(target=myfunc, args=("A", [1, 2, 3, 4, 5, 6], ))
t2 = threading.Thread(target=myfunc, args=("B", [11, 21, 31, 41, 51, 61], ))
t1.start()
t2.start()
t1.join()
t2.join()

for value in lists.values():
    if value is False:
        # Something is suspicious
        # Take necessary action
        pass
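This helps more where you want to find out whether any one of the threads returned a false status, so that you can take the necessary action.
futures = [executor.submit(foo, param) for param in param_list]
The order will be maintained, and exiting the with block will allow the results to be collected: [f.result() for f in futures]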
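A runnable sketch of that last point, with a made-up worker foo and a made-up param_list: iterating over the futures list gives results in submission order, while concurrent.futures.as_completed yields them as they finish.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed

def foo(param):
    # Hypothetical worker: later params sleep less, so they tend to finish first.
    time.sleep(0.01 * (3 - param))
    return param * param

param_list = [0, 1, 2]

with ThreadPoolExecutor(max_workers=3) as executor:
    futures = [executor.submit(foo, param) for param in param_list]
    # as_completed yields futures in completion order, which depends on timing.
    completion_order = [f.result() for f in as_completed(futures)]

# Leaving the with block waits for all tasks; the list keeps submission order.
in_order = [f.result() for f in futures]
```

executor.map(foo, param_list) is a shorter way to get the same submission-ordered results when the futures themselves are not needed.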