Local variables in nested functions

105

OK, bear with me on this. I know it is going to look horribly convoluted, but please help me understand what is happening.

from functools import partial

class Cage(object):
    def __init__(self, animal):
        self.animal = animal

def gotimes(do_the_petting):
    do_the_petting()

def get_petters():
    for animal in ['cow', 'dog', 'cat']:
        cage = Cage(animal)

        def pet_function():
            print "Mary pets the " + cage.animal + "."

        yield (animal, partial(gotimes, pet_function))

funs = list(get_petters())

for name, f in funs:
    print name + ":", 
    f()

Gives:

cow: Mary pets the cat.
dog: Mary pets the cat.
cat: Mary pets the cat.

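So basically, why am I not getting three different animals? Isn't the cage "packaged" into the local scope of the nested function? If not, how does a call to the nested function look up the local variables?

I know that running into these kinds of problems usually means one is "doing it wrong", but I would like to understand what happens.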

— io

1

Try for animal in ['cat', 'dog', 'cow']... I'm sure someone will come along and explain this though; it's one of those Python gotchas :)

— Jon Clements

114

The nested function looks up variables from the parent scope when it is executed, not when it is defined.

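The function body is compiled, and the "free" variables (those not defined in the function itself by assignment) are identified and then bound to the function as closure cells, with the code using an index to reference each cell. pet_function thus has one free variable (cage), which is referenced via a closure cell at index 0. The closure itself points to the local variable cage in the get_petters function.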
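The cell machinery described above can be inspected directly. The following is a minimal Python 3 sketch, not the code from the question: make_petter is a hypothetical helper, and a plain string stands in for the Cage object.

```python
def make_petter():
    cage = "cow"  # assumption: a string stands in for the Cage object

    def pet_function():
        return "Mary pets the " + cage + "."

    return pet_function

petter = make_petter()

# Names that the compiled body resolves through closure cells.
free_names = petter.__code__.co_freevars

# Cell at index 0 holds whatever `cage` is bound to right now.
cell_value = petter.__closure__[0].cell_contents

print(free_names, cell_value)
```

The function stores a cell, not a copy of the value, which is why a later rebinding of cage in the enclosing scope is visible to it.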

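When you actually call the function, that closure is used to look up the value of cage in the surrounding scope at the time of the call. Here lies the problem. By the time you call your functions, the get_petters function has already finished computing its results. During its execution, the cage local variable was bound in turn to a Cage for each of the 'cow', 'dog', and 'cat' strings, but at the end of the function cage holds the last value, the 'cat' one. Thus, when you call each of the dynamically returned functions, you get 'cat' printed.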

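The workaround is to not rely on closures. You can use a partial function instead, create a new function scope, or bind the variable as the default value of a keyword parameter.

Partial function example, using functools.partial():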

from functools import partial

def pet_function(cage=None):
    print "Mary pets the " + cage.animal + "."

yield (animal, partial(gotimes, partial(pet_function, cage=cage)))

Creating a new scope example:

def scoped_cage(cage=None):
    def pet_function():
        print "Mary pets the " + cage.animal + "."
    return pet_function

yield (animal, partial(gotimes, scoped_cage(cage)))

Binding the variable as the default value of a keyword parameter:

def pet_function(cage=cage):
    print "Mary pets the " + cage.animal + "."

yield (animal, partial(gotimes, pet_function))

There is no need to define the scoped_cage function inside the loop; compilation only takes place once, not on each iteration of the loop.
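For comparison, here is a self-contained Python 3 sketch that puts the late-binding behaviour next to two of the workarounds. The names describe, get_petters_late, get_petters_default and get_petters_partial are hypothetical, and the functions return strings instead of printing so the results can be compared.

```python
from functools import partial

class Cage:
    def __init__(self, animal):
        self.animal = animal

def describe(cage):
    return "Mary pets the " + cage.animal + "."

def get_petters_late():
    # Behaviour from the question: every closure shares one `cage` variable.
    for animal in ['cow', 'dog', 'cat']:
        cage = Cage(animal)
        def pet_function():
            return describe(cage)
        yield animal, pet_function

def get_petters_default():
    # The default value is evaluated when `def` runs, once per iteration.
    for animal in ['cow', 'dog', 'cat']:
        cage = Cage(animal)
        def pet_function(cage=cage):
            return describe(cage)
        yield animal, pet_function

def get_petters_partial():
    # partial() stores the current Cage object alongside the function.
    for animal in ['cow', 'dog', 'cat']:
        yield animal, partial(describe, Cage(animal))

# list() exhausts each generator before any function is called.
late = [f() for _, f in list(get_petters_late())]
fixed_default = [f() for _, f in list(get_petters_default())]
fixed_partial = [f() for _, f in list(get_petters_partial())]
print(late, fixed_default, fixed_partial, sep="\n")
```

Both workarounds capture the object at definition time, so the result no longer depends on when the functions are called.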

— Martijn Pieters

1

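I banged my head against the wall for three hours today over a script at work. Your last point is very important, and it is the main reason I ran into this problem. I have callbacks with closures all over my code, but trying the same technique in a loop is what got me.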

— DrEsperanto

12

My understanding is that cage is looked up in the parent function's namespace when the yielded pet_function is actually called, not before.

So when you do

funs = list(get_petters())

you generate 3 functions which will all find the last created cage.

If you replace your last loop with:

for name, f in get_petters():
    print name + ":", 
    f()

you will actually get:

cow: Mary pets the cow.
dog: Mary pets the dog.
cat: Mary pets the cat.

— Nicolas Barbey

6

This stems from the following:

for i in range(2): 
    pass

print(i)  # prints 1

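After the loop finishes, i still holds its final value.

Used as a generator, the function would work (i.e. print each value in turn), but converting it to a list runs the generator to completion first, hence every lookup of cage (cage.animal) returns 'cat'.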
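The same effect can be reproduced with the bare loop variable i. This is a minimal Python 3 sketch; gen is a hypothetical generator that yields closures over i.

```python
def gen():
    for i in range(3):
        # Each lambda closes over the variable `i`, not its current value.
        yield lambda: i

# Each function is called while the generator is still paused on that step.
lazy = [f() for f in gen()]

# list() exhausts the generator first, so `i` has reached its final value.
eager = [f() for f in list(gen())]

print(lazy, eager)
```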

— Andy Hayden

0

Let's simplify the question. Define:

def get_petters():
    for animal in ['cow', 'dog', 'cat']:
        def pet_function():
            return "Mary pets the " + animal + "."

        yield (animal, pet_function)

Then, just like in the question, we get:

>>> for name, f in list(get_petters()):
...     print(name + ":", f())

cow: Mary pets the cat.
dog: Mary pets the cat.
cat: Mary pets the cat.

But if we avoid creating a list() first:

>>> for name, f in get_petters():
...     print(name + ":", f())

cow: Mary pets the cow.
dog: Mary pets the dog.
cat: Mary pets the cat.

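What's going on? Why does this subtle difference completely change our results?

If we look at list(get_petters()), the changing memory addresses make it clear that we do indeed yield three different functions: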

>>> list(get_petters())

[('cow', <function get_petters.<locals>.pet_function at 0x7ff2b988d790>),
 ('dog', <function get_petters.<locals>.pet_function at 0x7ff2c18f51f0>),
 ('cat', <function get_petters.<locals>.pet_function at 0x7ff2c14a9f70>)]

However, take a look at the cells that these functions are bound to:

>>> for _, f in list(get_petters()):
...     print(f(), f.__closure__)

Mary pets the cat. (<cell at 0x7ff2c112a9d0: str object at 0x7ff2c3f437f0>,)
Mary pets the cat. (<cell at 0x7ff2c112a9d0: str object at 0x7ff2c3f437f0>,)
Mary pets the cat. (<cell at 0x7ff2c112a9d0: str object at 0x7ff2c3f437f0>,)

>>> for _, f in get_petters():
...     print(f(), f.__closure__)

Mary pets the cow. (<cell at 0x7ff2b86b5d00: str object at 0x7ff2c1a95670>,)
Mary pets the dog. (<cell at 0x7ff2b86b5d00: str object at 0x7ff2c1a952f0>,)
Mary pets the cat. (<cell at 0x7ff2b86b5d00: str object at 0x7ff2c3f437f0>,)

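In both loops, the cell object stays the same across iterations. However, as expected, the specific str it references varies in the second loop. The cell object refers to animal, which is created when get_petters() is called, and animal changes which str object it refers to as the generator function runs.

In the first loop, all the fs are created up front, and we only call them after the generator get_petters() has been completely exhausted and a list of functions has already been built.

In the second loop, we pause the get_petters() generator on each iteration and call f after each pause. Thus, we end up retrieving the value of animal at the moment the generator function is paused.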
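Stepping the generator by hand makes the shared cell visible. The following Python 3 sketch reuses the simplified get_petters() from this answer; the variable names before, after and same_cell are only for illustration.

```python
def get_petters():
    for animal in ['cow', 'dog', 'cat']:
        def pet_function():
            return "Mary pets the " + animal + "."
        yield (animal, pet_function)

gen = get_petters()

_, first = next(gen)                         # generator paused at 'cow'
before = first.__closure__[0].cell_contents  # what the cell holds now

_, second = next(gen)                        # generator advances to 'dog'
after = first.__closure__[0].cell_contents   # same cell, new contents

# Both functions reference one and the same cell object.
same_cell = first.__closure__[0] is second.__closure__[0]

print(before, after, same_cell, first())
```

Note that first was created during the 'cow' iteration, yet calling it after the generator has advanced reports the animal the generator is currently paused on.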

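As @Claudiu puts it in an answer to a similar question:

Three separate functions are created, but each of them closes over the environment it is defined in: in this case, the global environment (or the outer function's environment if the loop is placed inside another function). This is exactly the problem, though: in this environment, animal is mutated, and the closures all refer to the same animal.

[Editor's note: i has been changed to animal.]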

— Mateen Ulhaq