Thread local storage in Python

Question 1

How do I use thread local storage in Python?

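Related

What is "thread local storage" in Python, and why do I need it? - This thread appears to focus more on when variables are shared.
Efficient way to determine whether a particular function is on the stack in Python - Alex Martelli gives a nice solution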

Question 2

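Thread local storage is useful, for instance, if you have a pool of worker threads and each thread needs access to its own resource, such as a network or database connection. Note that the threading module uses the regular concept of threads (which have access to the process-global data), but these are of limited use for CPU-bound work because of the global interpreter lock. The multiprocessing module, by contrast, creates a new subprocess for each worker, so any global variable is effectively local to that worker.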
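The worker-pool case described above can be sketched as follows. This is a minimal illustration, not a recommended production pattern: `FakeConnection`, `get_connection` and `task` are hypothetical names, and `FakeConnection` only stands in for a real network or database connection.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

# One thread-local namespace, created once at module level.
_local = threading.local()


class FakeConnection:
    """Hypothetical stand-in for a real network or database connection."""

    def __init__(self, owner):
        self.owner = owner  # name of the thread that opened the connection


def get_connection():
    # Lazily open one connection per worker thread and reuse it afterwards.
    conn = getattr(_local, "conn", None)
    if conn is None:
        conn = FakeConnection(threading.current_thread().name)
        _local.conn = conn
    return conn


def task(n):
    conn = get_connection()
    # Report which thread ran the task and which thread owns the connection.
    return threading.current_thread().name, conn.owner


with ThreadPoolExecutor(max_workers=3) as pool:
    results = list(pool.map(task, range(20)))

# Every task used the connection that its own thread opened.
print(all(thread_name == owner for thread_name, owner in results))
```

Each worker thread opens its connection on first use and then keeps reusing it, so at most three connections are ever created here, no matter how many tasks are submitted.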

Threading module

Here is a simple example:

import threading
from threading import current_thread

threadLocal = threading.local()

def hi():
    initialized = getattr(threadLocal, 'initialized', None)
    if initialized is None:
        print("Nice to meet you", current_thread().name)
        threadLocal.initialized = True
    else:
        print("Welcome back", current_thread().name)

hi(); hi()

This will print out:

Nice to meet you MainThread
Welcome back MainThread

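One important thing that is easily overlooked: a threading.local() object only needs to be created once, not once per thread and not once per function call. The global or class level are ideal locations.

Here is why: threading.local() actually creates a new instance each time it is called (just like any factory or class call would), so calling threading.local() multiple times constantly overwrites the original object, which in all likelihood is not what you want. When any thread accesses an existing threadLocal variable (or whatever it is called), it gets its own private view of that variable.

This won't work as intended: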

import threading
from threading import current_thread

def wont_work():
    threadLocal = threading.local()  # oops, this creates a new local object on every call!
    initialized = getattr(threadLocal, 'initialized', None)
    if initialized is None:
        print("First time for", current_thread().name)
        threadLocal.initialized = True
    else:
        print("Welcome back", current_thread().name)

wont_work(); wont_work()

It will result in this output:

First time for MainThread
First time for MainThread
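The examples above only call the function from the main thread. As a rough sketch of the "private view" behaviour with several threads, the following keeps a single module-level threading.local() and calls the function twice from each thread. The names `state`, `calls` and `worker` are made up for this illustration.

```python
import threading

state = threading.local()  # created once, at module level
calls = {}                 # thread name -> list of "was this the first call?"
lock = threading.Lock()    # protects the shared results dict


def hi():
    # Each thread sees its own 'initialized' attribute on the same object.
    first_time = not getattr(state, "initialized", False)
    state.initialized = True
    with lock:
        calls.setdefault(threading.current_thread().name, []).append(first_time)


def worker():
    hi()
    hi()


threads = [threading.Thread(target=worker, name="worker-%d" % i) for i in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()

for name in sorted(calls):
    print(name, calls[name])
```

Every thread reports a first call followed by a repeat call, even though all of them go through the same `state` object.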

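Multiprocessing module

Because the multiprocessing module creates a new process for each worker, all global variables are effectively local to that worker (strictly speaking they are process-local rather than thread-local).

Consider this example, where the processed counter behaves like thread local storage: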

from multiprocessing import Pool
from random import random
from time import sleep
import os

processed=0

def f(x):
    sleep(random())
    global processed
    processed += 1
    print("Processed by %s: %s" % (os.getpid(), processed))
    return x*x

if __name__ == '__main__':
    pool = Pool(processes=4)
    print(pool.map(f, range(10)))

It will output something like this:

Processed by 7636: 1
Processed by 9144: 1
Processed by 5252: 1
Processed by 7636: 2
Processed by 6248: 1
Processed by 5252: 2
Processed by 6248: 2
Processed by 9144: 2
Processed by 7636: 3
Processed by 5252: 3
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

...of course, the process IDs, the counts for each process, and the order of the lines will vary from run to run.

Question 3

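Thread local storage can simply be thought of as a namespace (with values accessed via attribute notation). The difference is that each thread transparently gets its own set of attributes/values, so that one thread doesn't see the values from another thread.

Just like an ordinary object, you can create multiple threading.local instances in your code. They can be local variables, class or instance members, or global variables. Each one is a separate namespace.

Here's a simple example: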

import threading

class Worker(threading.Thread):
    ns = threading.local()
    def run(self):
        self.ns.val = 0
        for i in range(5):
            self.ns.val += 1
            print("Thread:", self.name, "value:", self.ns.val)

w1 = Worker()
w2 = Worker()
w1.start()
w2.start()
w1.join()
w2.join()

Output:

Thread: Thread-1 value: 1
Thread: Thread-2 value: 1
Thread: Thread-1 value: 2
Thread: Thread-2 value: 2
Thread: Thread-1 value: 3
Thread: Thread-2 value: 3
Thread: Thread-1 value: 4
Thread: Thread-2 value: 4
Thread: Thread-1 value: 5
Thread: Thread-2 value: 5

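Note how each thread maintains its own counter, even though the ns attribute is a class member (and hence shared between the threads).

The same example could have used an instance variable or a local variable, but that wouldn't show much, as there's no sharing then (a dict would work just as well). There are cases where you'd need thread-local storage as instance variables or local variables, but they tend to be relatively rare (and pretty subtle).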
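The point that each threading.local instance is a separate namespace can be illustrated with a small sketch. The names `requests`, `cache` and `observed` are arbitrary and chosen only for this example.

```python
import threading

requests = threading.local()  # first namespace
cache = threading.local()     # second, completely independent namespace
observed = {}                 # plain dict used to collect what each side saw


def worker():
    requests.value = "from worker"
    # Setting an attribute on 'requests' does not create it on 'cache'.
    observed["cache_has_value"] = hasattr(cache, "value")
    observed["worker_sees"] = requests.value


requests.value = "from main"
t = threading.Thread(target=worker)
t.start()
t.join()

# The assignment made in the worker thread did not touch this thread's value.
observed["main_sees"] = requests.value
print(observed)
```

The worker's assignment is invisible to the thread that started it, and nothing written to `requests` ever shows up on `cache`.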

Question 4

As noted in the question, Alex Martelli gives a solution here. This function allows us to use a factory function to generate a default value for each thread.

#Code originally posted by Alex Martelli
#Modified to use standard Python variable name conventions
import threading
threadlocal = threading.local()    

def threadlocal_var(varname, factory, *args, **kwargs):
  v = getattr(threadlocal, varname, None)
  if v is None:
    v = factory(*args, **kwargs)
    setattr(threadlocal, varname, v)
  return v
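A possible way to use this helper is sketched below. The function is repeated so the snippet runs on its own; the `worker` function, the `results` dict and the "buffer" variable name are assumptions made for the illustration, not part of the original answer.

```python
import threading

threadlocal = threading.local()


def threadlocal_var(varname, factory, *args, **kwargs):
    # Same helper as above: create the value on first access in each thread.
    v = getattr(threadlocal, varname, None)
    if v is None:
        v = factory(*args, **kwargs)
        setattr(threadlocal, varname, v)
    return v


results = {}


def worker(tag):
    # The first call in this thread runs the factory (here: list).
    buf = threadlocal_var("buffer", list)
    buf.append(tag)
    # A later call in the same thread returns the very same object.
    same_object = threadlocal_var("buffer", list) is buf
    results[tag] = (list(buf), same_object)


threads = [threading.Thread(target=worker, args=(tag,)) for tag in ("a", "b")]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(results)
```

Each thread ends up with its own list containing only its own tag. One thing to keep in mind with this helper: because None is used as the "not set yet" marker, a factory that returns None would be called again on every access.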

Question 5

You can also write

import threading
mydata = threading.local()
mydata.x = 1

mydata.x will only exist in the current thread
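A quick way to check this is to look for the attribute from a second thread. The `worker` function and the `seen` list are only there for the demonstration.

```python
import threading

mydata = threading.local()
mydata.x = 1  # set in the current (starting) thread only

seen = []


def worker():
    # In this new thread, mydata has no attribute 'x' yet.
    seen.append(hasattr(mydata, "x"))


t = threading.Thread(target=worker)
t.start()
t.join()

print(seen)      # [False]
print(mydata.x)  # 1
```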

Question 6

This is my way of doing thread local storage across modules/files. The following has been tested in Python 3.5 -

# fileA.py
import threading
import fileB

def functionOne():
    thread = threading.Thread(target=fileB.functionTwo)
    thread.start()

# fileB.py
import threading
import fileC

def functionTwo():
    currentThread = threading.current_thread()
    dictionary = currentThread.__dict__
    dictionary["localVar1"] = "store here"   # Thread local storage
    fileC.function3()

# fileC.py
import threading

def function3():
    currentThread = threading.current_thread()
    dictionary = currentThread.__dict__
    print(dictionary["localVar1"])           # Access thread local storage

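In fileA, I start a thread whose target function lives in another module/file.

In fileB, I set the local variable I want on that thread.

In fileC, I access the thread-local variable of the current thread.

Additionally, just print the 'dictionary' variable to see the default values that are available, such as kwargs, args, etc.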
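For experimenting without three separate files, the same idea can be condensed into one script. This is only a sketch of the approach described above: `store`, `load`, `target` and `seen` are hypothetical names, and the key "localVar1" is taken from the example. Since the values live in the Thread object's own attribute dictionary, a key should be chosen so that it cannot clash with the attributes Thread already keeps there.

```python
import threading

seen = {}  # thread name -> value read back inside that thread


def store():
    # Plays the role of functionTwo: attach a value to the current Thread object.
    current = threading.current_thread()
    vars(current)["localVar1"] = "set by " + current.name


def load():
    # Plays the role of function3: read the value back from the current thread.
    return vars(threading.current_thread())["localVar1"]


def target():
    store()
    seen[threading.current_thread().name] = load()


threads = [threading.Thread(target=target, name="t%d" % i) for i in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(seen)
```

Each thread reads back only the value that was stored on its own Thread object.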