Retrieve list of tasks in a queue in Celery

147

How can I retrieve a list of tasks in a queue that are yet to be processed?

python celery

— bradley.ayers

1

RabbitMQ, but I want to retrieve this list from within Python.

— bradley.ayers, 2011

174

EDIT: See the other answers for getting a list of tasks in the queue.

You should look here: Celery Guide - Inspecting Workers

Basically this:

from celery.app.control import Inspect

# Inspect all nodes.
i = Inspect()

# Show the items that have an ETA or are scheduled for later processing
i.scheduled()

# Show tasks that are currently active.
i.active()

# Show tasks that have been claimed by workers
i.reserved()

Use whichever of these matches what you want.
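Each of the inspect calls above replies with a mapping of worker name to a list of task descriptions, or None when no worker answers. A minimal sketch of flattening such a reply, assuming each entry carries "id" and "name" keys as the entries from active() and reserved() usually do. flatten_inspect_reply is a hypothetical helper, not part of Celery, and the sample data is hand-written:

```python
def flatten_inspect_reply(reply):
    """Flatten {worker: [task, ...]} into a list of (worker, task_id, task_name)."""
    rows = []
    # An inspect call gives None when no worker answered in time.
    for worker, tasks in (reply or {}).items():
        for task in tasks:
            rows.append((worker, task.get("id"), task.get("name")))
    return rows


# Hand-written sample in the assumed shape, not real worker output.
sample = {
    "celery@host1": [{"id": "a1", "name": "tasks.add"}],
    "celery@host2": [],
}
print(flatten_inspect_reply(sample))
```

Passing the result of i.reserved() or i.active() in place of sample should work the same way, as long as the reply has this shape.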

— Semarj

9

I've tried that, but it's really slow (about 1 second). I'm using it synchronously in a Tornado app to monitor progress, so it has to be fast.

— JulienFr

41

This will not return a list of tasks in the queue that are yet to be processed.

— Ed J

9

Use i.reserved() to get a list of queued tasks.

— Banana, 2014

4

Has anyone experienced i.reserved() not returning an accurate list of active tasks? I have tasks running that don't show up in the list. I'm on django-celery==3.1.10.

— Seperman, 2014

6

When specifying the worker, I had to pass a list as the argument: inspect(['celery@Flatty']). This was a huge speed improvement over inspect().

— Adversus

42

If you are using RabbitMQ, run this in a terminal:

sudo rabbitmqctl list_queues

It prints a list of queues with the number of pending tasks in each. For example:

Listing queues ...
0b27d8c59fba4974893ec22d478a7093    0
0e0a2da9828a48bc86fe993b210d984f    0
10@torob2.celery.pidbox 0
11926b79e30a4f0a9d95df61b6f402f7    0
15c036ad25884b82839495fb29bd6395    1
celerey_mail_worker@torob2.celery.pidbox    0
celery  166
celeryev.795ec5bb-a919-46a8-80c6-5d91d2fcf2aa   0
celeryev.faa4da32-a225-4f6c-be3b-d8814856d1b6   0

The number in the right column is the number of tasks in the queue. In the output above, the celery queue has 166 pending tasks.

— Ali
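If the count is needed inside Python, the text output can be parsed directly. A minimal sketch, assuming the output has one queue per line with the message count as the last whitespace-separated field, as in the listing above. parse_list_queues is a hypothetical helper and the sample text is copied from that listing:

```python
def parse_list_queues(output):
    """Map queue name -> message count from `rabbitmqctl list_queues` text."""
    counts = {}
    for line in output.splitlines():
        parts = line.rsplit(None, 1)  # split off the last whitespace-separated field
        # Skip banner lines such as "Listing queues ...", whose last field is not a number.
        if len(parts) == 2 and parts[1].isdigit():
            counts[parts[0].strip()] = int(parts[1])
    return counts


sample_output = """Listing queues ...
15c036ad25884b82839495fb29bd6395    1
celery  166
celeryev.795ec5bb-a919-46a8-80c6-5d91d2fcf2aa   0
"""
queues = parse_list_queues(sample_output)
print(queues["celery"])
```

The text to parse would come from running the command, for example through the subprocess module.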

1

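I'm familiar with this when I have sudo privileges, but I want an unprivileged system user to be able to check. Any suggestions?

— Sage

In addition, you can pipe this through grep -e "^celery\s" | cut -f2 to extract that 166 if you want to process the number later, say for stats.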

— jamesc

21

If you don't use prioritized tasks, this is actually pretty simple with Redis. To get the task count:

redis-cli -h HOST -p PORT -n DATABASE_NUMBER llen QUEUE_NAME

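Prioritized tasks, however, use different keys in Redis, so the full picture is slightly more complicated: you need to query Redis once for every task priority. In Python (and taken from the Flower project), this looks like: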

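import redis

# `settings` below is the project's settings object (for example django.conf.settings).
PRIORITY_SEP = '\x06\x16'
DEFAULT_PRIORITY_STEPS = [0, 3, 6, 9]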


def make_queue_name_for_pri(queue, pri):
    """Make a queue name for redis

    Celery uses PRIORITY_SEP to separate different priorities of tasks into
    different queues in Redis. Each queue-priority combination becomes a key in
    redis with names like:

     - batch1\x06\x163 <-- P3 queue named batch1

    There's more information about this in Github, but it doesn't look like it 
    will change any time soon:

      - https://github.com/celery/kombu/issues/422

    In that ticket the code below, from the Flower project, is referenced:

      - https://github.com/mher/flower/blob/master/flower/utils/broker.py#L135

    :param queue: The name of the queue to make a name for.
    :param pri: The priority to make a name with.
    :return: A name for the queue-priority pair.
    """
    if pri not in DEFAULT_PRIORITY_STEPS:
        raise ValueError('Priority not in priority steps')
    return '{0}{1}{2}'.format(*((queue, PRIORITY_SEP, pri) if pri else
                                (queue, '', '')))


def get_queue_length(queue_name='celery'):
    """Get the number of tasks in a celery queue.

    :param queue_name: The name of the queue you want to inspect.
    :return: the number of items in the queue.
    """
    priority_names = [make_queue_name_for_pri(queue_name, pri) for pri in
                      DEFAULT_PRIORITY_STEPS]
    r = redis.StrictRedis(
        host=settings.REDIS_HOST,
        port=settings.REDIS_PORT,
        db=settings.REDIS_DATABASES['CELERY'],
    )
    return sum([r.llen(x) for x in priority_names])

If you want to get an actual task, you can use something like:

redis-cli -h HOST -p PORT -n DATABASE_NUMBER lrange QUEUE_NAME 0 -1

From there, you'll have to deserialize the returned list. In my case I was able to do this with something like:

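import base64
import json
import pickle
import redis

r = redis.StrictRedis(
    host=settings.REDIS_HOST,
    port=settings.REDIS_PORT,
    db=settings.REDIS_DATABASES['CELERY'],
)
l = r.lrange('celery', 0, -1)
# base64.decodestring was removed in Python 3.9; b64decode is the equivalent call.
pickle.loads(base64.b64decode(json.loads(l[0])['body']))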

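Just be warned that deserialization can take a moment, and you'll need to adjust the commands above to work with the various priorities.

— mlissner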
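One way to make that adjustment is to loop over the key for each priority step. A minimal sketch, assuming the same separator and priority steps as the code above and any client object with an lrange(key, start, stop) method like the one on a redis client. priority_keys, fetch_all_priorities and FakeRedis are hypothetical names used only for illustration:

```python
PRIORITY_SEP = "\x06\x16"
DEFAULT_PRIORITY_STEPS = [0, 3, 6, 9]


def priority_keys(queue, steps=DEFAULT_PRIORITY_STEPS):
    """Return the Redis key for each priority step; priority 0 uses the bare name."""
    return [f"{queue}{PRIORITY_SEP}{pri}" if pri else queue for pri in steps]


def fetch_all_priorities(client, queue="celery"):
    """Collect the raw messages from every priority key of one queue."""
    messages = []
    for key in priority_keys(queue):
        messages.extend(client.lrange(key, 0, -1))
    return messages


class FakeRedis:
    """Stand-in for a redis client, for illustration only."""

    def __init__(self, data):
        self.data = data

    def lrange(self, key, start, stop):
        # Ignores start/stop, which is enough for the 0..-1 range used here.
        return list(self.data.get(key, []))


fake = FakeRedis({"celery": [b"m1"], "celery\x06\x163": [b"m2", b"m3"]})
print(fetch_all_priorities(fake))
```

With a real connection, the StrictRedis instance from the snippet above would take the place of fake.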

After using this in production, I've learned that it fails if you use prioritized tasks, due to the design of Celery.

— mlissner

1

I've updated the above to handle prioritized tasks. Progress!

— mlissner

1

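Just to spell things out: the DATABASE_NUMBER used by default is 0 and the QUEUE_NAME is celery, so redis-cli -n 0 llen celery will return the number of queued messages.

— Vineet Bansal

For my Celery, the queue name format is '{{{0}}}{1}{2}' instead of '{0}{1}{2}'. Other than that, this works perfectly!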

— zupo

12

To retrieve tasks from the backend, use this:

from amqplib import client_0_8 as amqp
conn = amqp.Connection(host="localhost:5672", userid="guest",
                       password="guest", virtual_host="/", insist=False)
chan = conn.channel()
name, jobs, consumers = chan.queue_declare(queue="queue_name", passive=True)

— ashish

2

But 'jobs' only gives the number of tasks in the queue.

— bitnik

For a related answer that gives you the names of the tasks, see stackoverflow.com/a/57807913/9843399.

— Caleb Syring, 2019

10

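If you are using Celery + Django, the simplest way to inspect tasks is to run the commands directly from a terminal in your virtual environment, or to use the full path to celery:

Doc: http://docs.celeryproject.org/en/latest/userguide/workers.html?highlight=revoke#inspecting-workers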

$ celery inspect reserved
$ celery inspect active
$ celery inspect registered
$ celery inspect scheduled

Also, if you are using Celery + RabbitMQ, you can inspect the list of queues with the following command:

More info: https://linux.die.net/man/1/rabbitmqctl

$ sudo rabbitmqctl list_queues

— Alexandr S.

4

If you have a defined project, you can use celery -A my_proj inspect reserved

— sashaboulouds

6

A copy-paste solution for Redis with JSON serialization:

def get_celery_queue_items(queue_name):
    import base64
    import json  

    # Get a configured instance of a celery app:
    from yourproject.celery import app as celery_app

    with celery_app.pool.acquire(block=True) as conn:
        tasks = conn.default_channel.client.lrange(queue_name, 0, -1)

    decoded_tasks = []
    for task in tasks:
        j = json.loads(task)
        body = json.loads(base64.b64decode(j['body']))
        decoded_tasks.append(body)

    return decoded_tasks

It works with Django. Just don't forget to change yourproject.celery.

— Max Malysh
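The decoded body alone does not carry the task name. A minimal sketch of also reading the name and id, assuming Celery's message protocol 2 with the JSON serializer, where the envelope's "headers" hold "task" and "id" and the base64 body decodes to [args, kwargs, embed]. summarize_message is a hypothetical helper and the envelope below is hand-built, not a real queue entry:

```python
import base64
import json


def summarize_message(raw):
    """Return id, task name, args and kwargs from one raw queue entry."""
    envelope = json.loads(raw)
    headers = envelope.get("headers", {})
    # Assumed protocol 2 layout: the body decodes to [args, kwargs, embed].
    args, kwargs, _embed = json.loads(base64.b64decode(envelope["body"]))
    return {
        "id": headers.get("id"),
        "task": headers.get("task"),
        "args": args,
        "kwargs": kwargs,
    }


# Hand-built envelope in the assumed shape.
body = base64.b64encode(json.dumps([[4, 4], {}, {}]).encode()).decode()
raw = json.dumps({"body": body, "headers": {"id": "abc", "task": "tasks.add"}})
print(summarize_message(raw))
```

If your messages use a different protocol version or serializer, the unpacking line is the part that would need to change.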

1

If you're using the pickle serializer, you can change the body = line to body = pickle.loads(base64.b64decode(j['body'])).

— Jim Hunziker

4

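The Celery inspect module appears to be aware of tasks only from the workers' perspective. If you want to view the messages that are in the queue (not yet pulled by a worker), I suggest using pyrabbit, which can interface with the RabbitMQ HTTP API to retrieve all kinds of information from the queue.

An example can be found here: Retrieve queue length with Celery (RabbitMQ, Django)

— Paul in 't Hout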

3

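I think the only way to get the tasks that are waiting is to keep a list of the tasks you started and let each task remove itself from the list when it starts.

With rabbitmqctl and list_queues you can get an overview of how many tasks are waiting, but not the tasks themselves: http://www.rabbitmq.com/man/rabbitmqctl.1.man.html

If what you want includes tasks that are being processed but are not finished yet, you can keep a list of your tasks and check their states: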

from tasks import add
result = add.delay(4, 4)

result.ready() # True if finished

Or you let Celery store the results with CELERY_RESULT_BACKEND and check which of your tasks are not in there.

— Sebastian Blask
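Keeping a list of results and checking their states could look roughly like this. A minimal sketch that relies only on each tracked object having a ready() method, as the result of add.delay() does. unfinished is a hypothetical helper and FakeResult is a stand-in used only so the example runs without a broker:

```python
def unfinished(results):
    """Return the tracked results whose ready() is still False."""
    return [r for r in results if not r.ready()]


class FakeResult:
    """Stand-in for a Celery result object, for illustration only."""

    def __init__(self, task_id, done):
        self.id = task_id
        self._done = done

    def ready(self):
        return self._done


tracked = [FakeResult("t1", True), FakeResult("t2", False)]
print([r.id for r in unfinished(tracked)])
```

Note that this cannot tell a task that is still waiting in the queue from one that is currently running; both simply report as not ready.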

3

This worked for me in my application:

def get_celery_queue_active_jobs(queue_name):
    connection = <CELERY_APP_INSTANCE>.connection()

    try:
        channel = connection.channel()
        name, jobs, consumers = channel.queue_declare(queue=queue_name, passive=True)
        active_jobs = []

        def dump_message(message):
            active_jobs.append(message.properties['application_headers']['task'])

        channel.basic_consume(queue=queue_name, callback=dump_message)

        for job in range(jobs):
            connection.drain_events()

        return active_jobs
    finally:
        connection.close()

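active_jobs will be a list of strings that correspond to the tasks in the queue.

Don't forget to swap out CELERY_APP_INSTANCE for your own.

Thanks to @ashish for pointing me in the right direction with his answer here: https://stackoverflow.com/a/19465670/9843399

— Caleb Syring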

In my case jobs is always zero... any idea?

— daveoncode

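@daveoncode I don't think that's enough information for me to respond helpfully. You could open your own question. If you specify that you want to retrieve the information in Python, I don't think it would be a duplicate of this one. I'd go back to stackoverflow.com/a/19465670/9843399, which is the answer I based mine on, and make sure that works first.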

— Caleb Syring

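@CalebSyring This is the first approach that really shows me the queued tasks. Very nice. The only problem for me is that the list append doesn't seem to work. Any ideas how I can make the callback function write to the list?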

— Varlor

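@Varlor I'm sorry, someone made an incorrect edit to my answer. You can look in the edit history for the original answer, which will most likely work for you. I'm working on getting this fixed. (EDIT: I just went in and rejected the edit, which had an obvious Python error. Let me know whether this fixed your problem.)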

— Caleb Syring

@CalebSyring I have now used your code in a class, and having the list as a class attribute works!

— Varlor

2

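As far as I know, Celery does not provide an API for examining tasks that are waiting in the queue. This is broker-specific. If you use Redis as the broker, for example, then examining the tasks waiting in the celery (default) queue is as simple as:

Connect to the broker database
List the items in the celery list (with the LRANGE command, for example)

Keep in mind that these are tasks WAITING to be picked up by available workers. Your cluster may also have tasks that are running; those will not be in this list, because they have already been picked up.

— Dejan Lekic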

1

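I've come to the conclusion that the best way to get the number of jobs on a queue is to use rabbitmqctl, as has been suggested several times here. To allow any chosen user to run the command with sudo, I followed the instructions here (I skipped the part about editing the profile, as I don't mind typing sudo before the command).

I also grabbed jamesc's grep and cut snippet and wrapped it in subprocess calls.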

from subprocess import Popen, PIPE
p1 = Popen(["sudo", "rabbitmqctl", "list_queues", "-p", "[name of your virtual host]"], stdout=PIPE)
p2 = Popen(["grep", "-e", r"^celery\s"], stdin=p1.stdout, stdout=PIPE)
p3 = Popen(["cut", "-f2"], stdin=p2.stdout, stdout=PIPE)
p1.stdout.close()
p2.stdout.close()
print("number of jobs on queue: %i" % int(p3.communicate()[0]))

— Peter Shannon

1

from celery.task.control import inspect
def key_in_list(k, l):
    return bool([True for i in l if k in i.values()])

def check_task(task_id):
    task_value_dict = inspect().active().values()
    for task_list in task_value_dict:
        # key_in_list is a module-level function, so it is called without self.
        if key_in_list(task_id, task_list):
            return True
    return False

— 张朝龙

0

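If you control the code of the tasks, you can work around the problem by letting a task trigger a trivial retry the first time it executes, and then checking inspect().reserved(). The retry registers the task with the result backend, and Celery can see that. The task must accept self or context as its first parameter so that we can access the retry count.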

@task(bind=True)
def mytask(self):
    if self.request.retries == 0:
        raise self.retry(exc=MyTrivialError(), countdown=1)
    ...

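This solution is broker-agnostic, i.e. you don't have to worry about whether you are using RabbitMQ or Redis to store the tasks.

EDIT: After testing, I've found this to be only a partial solution. The size of reserved is limited to the prefetch setting of the worker.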

— Hedleyroos

0

With subprocess.run:

import subprocess
import re

def count_active_tasks():
    active_process_txt = subprocess.run(['celery', '-A', 'my_proj', 'inspect', 'active'],
                                        stdout=subprocess.PIPE).stdout.decode('utf-8')
    return len(re.findall(r'worker_pid', active_process_txt))

Be careful to change my_proj to the name of your own project.

— sashaboulouds