How do I pass a string into subprocess.Popen (using the stdin argument)?

280

If I do the following:

import subprocess
from cStringIO import StringIO
subprocess.Popen(['grep','f'],stdout=subprocess.PIPE,stdin=StringIO('one\ntwo\nthree\nfour\nfive\nsix\n')).communicate()[0]

I get:

Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "/build/toolchain/mac32/python-2.4.3/lib/python2.4/subprocess.py", line 533, in __init__
    (p2cread, p2cwrite,
  File "/build/toolchain/mac32/python-2.4.3/lib/python2.4/subprocess.py", line 830, in _get_handles
    p2cread = stdin.fileno()
AttributeError: 'cStringIO.StringI' object has no attribute 'fileno'

Apparently a cStringIO.StringIO object doesn't quack close enough to a file duck to suit subprocess.Popen. How do I work around this?

python subprocess stdin

— Daryl Spitzer

3

Rather than contest the deletion of my answer, I'm adding it as a comment... Recommended reading: Doug Hellmann's Python Module of the Week blog post on subprocess.

— Daryl Spitzer, 2013

3

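The blog post contains multiple errors; for example, the very first code example, call(['ls', '-1'], shell=True), is incorrect. I recommend reading the common questions in the subprocess tag description instead. In particular, "Why subprocess.Popen doesn't work when args is sequence?" explains why call(['ls', '-1'], shell=True) is wrong. I remember leaving comments under the blog post, but for some reason I can't see them now.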
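A minimal sketch of why that kind of call misbehaves, assuming a POSIX system with /bin/sh. The echo commands are illustrative stand-ins, not taken from the blog post. With shell=True and a list, only the first item is treated as the command string; the remaining items become arguments to the shell itself rather than to the command.

```python
import subprocess

# With shell=True on POSIX, a list is run as:
#   /bin/sh -c <first item> <remaining items...>
# so "hello" becomes the shell's $0 and echo receives no arguments.
wrong = subprocess.run(["echo", "hello"], shell=True, stdout=subprocess.PIPE)

# Without shell=True, every list item is passed to the program unchanged.
right = subprocess.run(["echo", "hello"], stdout=subprocess.PIPE)

print(repr(wrong.stdout))
print(repr(right.stdout))
```

The same reasoning suggests that call(['ls', '-1'], shell=True) runs a bare ls and silently drops the -1.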

— jfs

For newer versions, see subprocess.run: stackoverflow.com/questions/48752152/…

— Boris

326

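Popen.communicate() documentation:

Note that if you want to send data to the process's stdin, you need to create the Popen object with stdin=PIPE. Similarly, to get anything other than None in the result tuple, you need to give stdout=PIPE and/or stderr=PIPE too.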
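To make the quoted note concrete, here is a small sketch (assuming grep is on the PATH; the variable names are illustrative) showing what communicate() returns with and without stdout=PIPE.

```python
from subprocess import Popen, PIPE, DEVNULL

data = b"one\ntwo\nthree\nfour\nfive\nsix\n"

# stdout is not a pipe here, so communicate() has nothing to collect.
p1 = Popen(["grep", "f"], stdin=PIPE, stdout=DEVNULL)
out1, err1 = p1.communicate(input=data)

# stdout=PIPE makes the first tuple element the captured bytes;
# stderr was not redirected, so the second element stays None.
p2 = Popen(["grep", "f"], stdin=PIPE, stdout=PIPE)
out2, err2 = p2.communicate(input=data)

print(out1, err1)
print(out2, err2)
```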

Replacing os.popen*

    pipe = os.popen(cmd, 'w', bufsize)
    # ==>
    pipe = Popen(cmd, shell=True, bufsize=bufsize, stdin=PIPE).stdin

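Warning: Use communicate() rather than stdin.write(), stdout.read() or stderr.read() to avoid deadlocks due to any of the other OS pipe buffers filling up and blocking the child process.

So your example could be written as follows: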

from subprocess import Popen, PIPE, STDOUT

p = Popen(['grep', 'f'], stdout=PIPE, stdin=PIPE, stderr=STDOUT)    
grep_stdout = p.communicate(input=b'one\ntwo\nthree\nfour\nfive\nsix\n')[0]
print(grep_stdout.decode())
# -> four
# -> five
# ->

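On current Python 3 versions you can use subprocess.run to pass input as a string to an external command and get back its exit status and its output as a string, all in one call: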

#!/usr/bin/env python3
from subprocess import run, PIPE

p = run(['grep', 'f'], stdout=PIPE,
        input='one\ntwo\nthree\nfour\nfive\nsix\n', encoding='ascii')
print(p.returncode)
# -> 0
print(p.stdout)
# -> four
# -> five
# ->
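The deadlock warning quoted above matters most when the input is bigger than the OS pipe buffer. The following sketch (assuming cat is available; the 1 MiB size is an arbitrary choice for illustration) pushes a large payload through a child that echoes it back, relying on communicate() to write and read concurrently.

```python
from subprocess import Popen, PIPE

# About 1 MiB of input, assumed to be far larger than a typical OS pipe buffer.
line = b"f" * 1023 + b"\n"
data = line * 1024

# `cat` copies stdin to stdout, so both pipes carry the full payload.
p = Popen(["cat"], stdin=PIPE, stdout=PIPE)
echoed, _ = p.communicate(input=data)

print(len(echoed) == len(data))
```

Note that communicate() holds the entire input and output in memory, so this is only a reasonable choice when the data fits comfortably in RAM.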

— f

3

I missed that warning. I'm glad I asked (even though I thought I had the answer).

— Daryl Spitzer

11

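This is NOT a good solution. In particular, you cannot asynchronously process p.stdout.readline output if you do this, since you have to wait for the entire stdout to arrive. It is also memory-inefficient.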

— OTZ

7

@OTZ What's a better solution?

— Nick T

11

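@Nick T: "better" depends on context. Newton's laws are good for the domain where they are applicable, but you need special relativity to design GPS. See "Non-blocking read on a subprocess.PIPE in python".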
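One possible context-dependent alternative, sketched under the assumption that grep is on the PATH and Python 3.7+ is in use: feed stdin from a helper thread while the main thread consumes stdout line by line. The helper names feed and grep_f are hypothetical, and this is only one of several approaches.

```python
import subprocess
import threading

def feed(pipe, lines):
    """Write all lines to the child's stdin, then close it to signal EOF."""
    with pipe:
        for line in lines:
            pipe.write(line)

def grep_f(lines):
    """Return the lines containing 'f', reading grep's output incrementally."""
    proc = subprocess.Popen(["grep", "f"], stdin=subprocess.PIPE,
                            stdout=subprocess.PIPE, text=True)
    writer = threading.Thread(target=feed, args=(proc.stdin, lines))
    writer.start()
    matches = []
    with proc.stdout:
        for line in proc.stdout:  # each line is handled as soon as grep emits it
            matches.append(line.rstrip("\n"))
    writer.join()
    proc.wait()
    return matches

result = grep_f(["one\n", "two\n", "three\n", "four\n", "five\n", "six\n"])
print(result)
```

Because writing and reading happen in different threads, neither pipe can fill up and stall the other side, and the full output never has to sit in memory at once.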

— jfs

9

But be aware of the note for communicate(): "do not use this method if the data size is large or unlimited".

— Owen

44

I figured out this workaround:

>>> p = subprocess.Popen(['grep','f'],stdout=subprocess.PIPE,stdin=subprocess.PIPE)
>>> p.stdin.write(b'one\ntwo\nthree\nfour\nfive\nsix\n') #expects a bytes type object
>>> p.communicate()[0]
b'four\nfive\n'
>>> p.stdin.close()

Is there a better one?

— Daryl Spitzer

25

@Moe: using stdin.write() is discouraged; p.communicate() should be used instead. See my answer.

— jfs

11

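Per the subprocess documentation: Warning - Use communicate() rather than .stdin.write, .stdout.read or .stderr.read to avoid deadlocks due to any of the other OS pipe buffers filling up and blocking the child process.

— Jason Mock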

1

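I think this is a good way to do it if you're confident that your stdout/stderr won't ever fill up (for instance, it's going to a file, or another thread is consuming it) and you have an unbounded amount of data to send to stdin.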
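A sketch of the scenario described here, assuming grep is on the PATH: stdout goes straight to a file, so it cannot fill a pipe, and stdin is written piece by piece. The function name grep_to_file and the output location are hypothetical.

```python
import os
import subprocess
import tempfile

def grep_to_file(chunks, out_path):
    """Stream byte chunks into `grep f`, sending its output to out_path."""
    with open(out_path, "wb") as out:
        proc = subprocess.Popen(["grep", "f"], stdin=subprocess.PIPE, stdout=out)
        for chunk in chunks:
            proc.stdin.write(chunk)
        proc.stdin.close()  # signal EOF so grep can finish
        return proc.wait()

out_path = os.path.join(tempfile.mkdtemp(), "matches.txt")  # hypothetical location
words = [b"one", b"four", b"five", b"six"]
status = grep_to_file((w + b"\n" for w in words), out_path)

with open(out_path, "rb") as f:
    written = f.read()
print(status, written)
```

The input is consumed from a generator, so it never has to be held in memory as a whole.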

— Lucretiel

1

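In particular, doing it this way still ensures that stdin is closed, so if the subprocess is one that consumes input forever, communicate() will close the pipe and allow the process to end gracefully.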

— Lucretiel

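@Lucretiel, if the process consumes stdin forever, then presumably it can also write to stdout forever, so we'd need completely different techniques all round (we can't read() from it, as communicate() does even with no arguments).

— Charles Duffy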

25

I'm surprised nobody has suggested creating a pipe, which is the simplest way to pass a string to the stdin of a subprocess:

import os
import subprocess

read, write = os.pipe()
os.write(write, b"stdin input here")  # os.write() takes bytes on Python 3
os.close(write)

subprocess.check_call(['your-command'], stdin=read)

— Graham Christensen

2

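The os and subprocess documentation both agree that you should prefer the latter over the former. This is a legacy solution which has a (slightly less concise) standard replacement; the accepted answer quotes the pertinent documentation.

— tripleee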

1

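I'm not sure that's correct, tripleee. The quoted documentation explains why it is hard to use the pipes created by the process, but this solution creates a pipe itself and passes it in. I believe it avoids the potential deadlock problems of managing the pipes after the process has already started.

— Graham Christensen

os.popen is deprecated in favour of subprocess.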

— hd1

2

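-1: it leads to deadlock and may lose data. This functionality is already provided by the subprocess module. Use it instead of reimplementing it poorly (try writing a value that is larger than the OS pipe buffer).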
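The objection is that os.write() blocks once the pipe buffer is full, and nothing is reading yet because the child has not been started. One hedged way to keep the os.pipe() idea while avoiding that stall is to write from a helper thread, as sketched below. The name check_output_via_pipe is hypothetical, cat is assumed to be available, and passing input= to subprocess remains the simpler route.

```python
import os
import subprocess
import threading

def check_output_via_pipe(cmd, data):
    """Run cmd with `data` on its stdin via a manually created pipe."""
    read_fd, write_fd = os.pipe()

    def writer():
        # Runs concurrently with the child, so a full pipe buffer only
        # pauses this thread until the child reads more.
        with os.fdopen(write_fd, "wb") as w:
            w.write(data)

    t = threading.Thread(target=writer)
    t.start()
    try:
        return subprocess.check_output(cmd, stdin=read_fd)
    finally:
        os.close(read_fd)
        t.join()

# 400,000 bytes: assumed to exceed a typical OS pipe buffer.
data = b"f\n" * 200_000
out = check_output_via_pipe(["cat"], data)
print(len(out))
```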

— jfs

You deserve the best, good man. Thank you for the simplest and cleverest solution.

— Felipe Buccioni

21

If you're using Python 3.4 or later, there is a nice solution: use the input argument instead of the stdin argument. It accepts a bytes argument:

output = subprocess.check_output(
    ["sed", "s/foo/bar/"],
    input=b"foo",
)

This works for check_output and run, but for some reason not for call or check_call.
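Since check_call has no input argument, a rough equivalent can be sketched with run(..., check=True), which raises CalledProcessError on a non-zero exit status just as check_call does. The wrapper name check_call_with_input is hypothetical; sed and grep are assumed to be on the PATH.

```python
import subprocess

def check_call_with_input(cmd, data):
    """Like check_call, but feeds `data` (bytes) to the command's stdin."""
    subprocess.run(cmd, input=data, check=True)
    return 0

status = check_call_with_input(["sed", "s/foo/bar/"], b"foo")

# grep exits with status 1 when nothing matches, which check=True turns
# into an exception.
try:
    check_call_with_input(["grep", "f"], b"one\n")
    failed = False
except subprocess.CalledProcessError as exc:
    failed = exc.returncode == 1

print(status, failed)
```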

— Flimm

5

@vidstige You're right, that's weird. I would consider filing this as a Python bug; I don't see any good reason why check_output should have an input argument but call should not.

— Flimm

2

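This is the best answer for Python 3.4+ (I'm using it on Python 3.6). It indeed does not work with check_call, but it works with run. According to the documentation, it also works with input=string as long as you pass an encoding argument too.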

— Nikolaos Georgiou

13

I'm using Python 3 and found that you need to encode your string before you can pass it to stdin:

from subprocess import Popen, PIPE

p = Popen(['grep', 'f'], stdout=PIPE, stdin=PIPE, stderr=PIPE)
out, err = p.communicate(input='one\ntwo\nthree\nfour\nfive\nsix\n'.encode())
print(out)

— ed

5

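You don't specifically need to encode the input; it just needs to be a bytes-like object (e.g. b'something'). err and out will both be returned as bytes as well. If you want to avoid this, you can pass universal_newlines=True to Popen. It will then accept the input as str and return err/out as str too.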
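A short sketch of the text-mode variant this comment describes, assuming grep is on the PATH:

```python
from subprocess import Popen, PIPE

# universal_newlines=True opens the pipes in text mode, so str goes in
# and str comes out, with no manual encode()/decode().
p = Popen(["grep", "f"], stdin=PIPE, stdout=PIPE, stderr=PIPE,
          universal_newlines=True)
out, err = p.communicate(input="one\ntwo\nthree\nfour\nfive\nsix\n")

print(type(out).__name__)
print(out, end="")
```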

— Six

2

But beware: universal_newlines=True will also convert your newlines to match your system.

— Nacht - Reinstate Monica

1

If you're using Python 3, see my answer for an even more convenient solution.

— Flimm

12

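Apparently a cStringIO.StringIO object doesn't quack close enough to a file duck to suit subprocess.Popen

I'm afraid not. The pipe is a low-level OS concept, so it absolutely requires a file object that is represented by an OS-level file descriptor. Your workaround is the right one.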
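The point about file descriptors can be checked directly. This sketch uses Python 3's io.StringIO, where fileno() raises io.UnsupportedOperation rather than the AttributeError seen with Python 2's cStringIO in the question. The helper name has_fileno is hypothetical.

```python
import io
import tempfile

def has_fileno(obj):
    """Return True if obj is backed by an OS-level file descriptor."""
    try:
        obj.fileno()
    except (AttributeError, io.UnsupportedOperation):
        return False
    return True

# An in-memory buffer has no descriptor the OS could hand to a child process.
string_io_ok = has_fileno(io.StringIO("one\ntwo\n"))

# A real temporary file does.
with tempfile.TemporaryFile() as tmp:
    temp_file_ok = has_fileno(tmp)

print(string_io_ok, temp_file_ok)
```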

— Dan Lenski

7

from subprocess import Popen, PIPE
from tempfile import SpooledTemporaryFile as tempfile
f = tempfile()
f.write(b'one\ntwo\nthree\nfour\nfive\nsix\n')  # default mode is binary, so write bytes
f.seek(0)
print(Popen(['/bin/grep','f'],stdout=PIPE,stdin=f).stdout.read())
f.close()

— Michael Waddell

3

FYI, tempfile.SpooledTemporaryFile.__doc__ says: Temporary file wrapper, specialized to switch from StringIO to a real file when it exceeds a certain size or when a fileno is needed.
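If the spooling behaviour is not needed, a plain tempfile.TemporaryFile is backed by a real file from the start. The sketch below is a variation on the answer above, not the answer's own code; it assumes grep is on the PATH and passes the file as stdin to subprocess.run.

```python
import subprocess
import tempfile

with tempfile.TemporaryFile() as f:
    f.write(b"one\ntwo\nthree\nfour\nfive\nsix\n")
    f.seek(0)  # rewind so the child reads from the beginning
    result = subprocess.run(["grep", "f"], stdin=f, stdout=subprocess.PIPE)

print(result.stdout)
```

Because the child reads from a regular file rather than a pipe, the parent never has to write to stdin while the child is running.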

— Doug F

5

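Beware that Popen.communicate(input=s) may give you trouble if s is too big, because apparently the parent process will buffer it before forking the child subprocess, meaning it needs "twice as much" used memory at that point (at least according to the "under the hood" explanation and the linked documentation found here). In my particular case, s was a generator that was first fully expanded and only then written to stdin, so the parent process was huge right before the child was spawned, and no memory was left to fork it: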

File "/opt/local/stow/python-2.7.2/lib/python2.7/subprocess.py", line 1130, in _execute_child
    self.pid = os.fork()
OSError: [Errno 12] Cannot allocate memory

— Lord Henry Wotton

5

"""
Ex: Dialog (2-way) with a Popen()
"""
import subprocess

p = subprocess.Popen('Your Command Here',
                 stdout=subprocess.PIPE,
                 stderr=subprocess.STDOUT,
                 stdin=subprocess.PIPE,
                 shell=True,
                 bufsize=0)
p.stdin.write(b'START\n')  # pipes are binary by default, so use bytes
out = p.stdout.readline()
while out:
  line = out
  line = line.rstrip(b"\n")

  if b"WHATEVER1" in line:
      pr = 1
      p.stdin.write(b'DO 1\n')
      out = p.stdout.readline()
      continue

  if b"WHATEVER2" in line:
      pr = 2
      p.stdin.write(b'DO 2\n')
      out = p.stdout.readline()
      continue

  # ..........

  # no keyword matched: read the next line so the loop can advance
  out = p.stdout.readline()

p.wait()

— Lucien Hercaud

4

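Because shell=True is so commonly used for no good reason, and this is a popular question, let me point out that there are many situations where Popen(['cmd', 'with', 'args']) is decidedly better than Popen('cmd with args', shell=True). In the latter, the shell breaks the command and arguments into tokens but otherwise provides nothing useful, while adding a significant amount of complexity and thus also attack surface.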
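A small sketch of the difference, using a Python one-liner as a stand-in child so that no particular external command is assumed. The argument value is made up for illustration. Passed as a list item without shell=True, the string reaches the child as a single argument and its shell metacharacters are never interpreted.

```python
import subprocess
import sys

# Illustrative value containing a space and a shell metacharacter.
tricky = "a b; echo injected"

# The child simply prints its first argument.
cmd = [sys.executable, "-c", "import sys; print(sys.argv[1])", tricky]
result = subprocess.run(cmd, stdout=subprocess.PIPE, text=True)

print(repr(result.stdout))
```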

— 2014

2

import time
from subprocess import Popen, PIPE, STDOUT

p = Popen(['grep', 'f'], stdout=PIPE, stdin=PIPE, stderr=STDOUT)
p.stdin.write(b'one\n')  # bytes, since the pipes are opened in binary mode
time.sleep(0.5)
p.stdin.write(b'two\n')
time.sleep(0.5)
p.stdin.write(b'three\n')
time.sleep(0.5)
testresult = p.communicate()[0]
time.sleep(0.5)
print(testresult)

— Dusan

1

On Python 3.7+, do this:

my_data = "whatever you want\nshould match this f"
subprocess.run(["grep", "f"], text=True, input=my_data)

You'll probably also want to add capture_output=True to get the output of the command as a string.

On older versions of Python, replace text=True with universal_newlines=True:

subprocess.run(["grep", "f"], universal_newlines=True, input=my_data)
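For completeness, a sketch of the capture_output=True variant mentioned above (Python 3.7+, grep assumed to be on the PATH; the input string here is illustrative):

```python
import subprocess

my_data = "one\ntwo\nthree\nfour\nfive\nsix\n"

# capture_output=True is shorthand for stdout=PIPE and stderr=PIPE;
# text=True makes input and captured output str instead of bytes.
result = subprocess.run(["grep", "f"], text=True, input=my_data,
                        capture_output=True)

print(result.returncode)
print(result.stdout, end="")
```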

— Boris