I’m excited to announce the first release of QueueMe (qme), a task management tool for Python that is both reproducible and easy to use. The base idea was imagined by my colleagues Yarik, and I loved it so much that I took it and ran with it! In the last week or so I’ve had immense fun developing qme, and want to share some of the things I’m excited about. For those that don’t want to read, you can jump right into the documentation at vsoch.github.io/qme or the codebase at vsoch/qme.

docs/assets/img/logo/logo-small.png

The beautiful logo above was a collaboration with nsoch, and is also beautifully shown on the pages for the Python docs.

What is QueueMe (qme)?

QueueMe (on the command line, qme) is a jobs queue and dashboard generation tool that can be used to specify executors (entities that run jobs) and actions for them. You can use qme only on the command line in a headless environment, or if desired, via an interactive web dashboard. The dashboard exposes basic operations for tasks, along with a RESTful application programming interface (API). In both cases, you can customize using (or not using) a database, along with setting executor-specific arguments that might be available.

How do I use it?

QueueMe works as a decorator for any kind of command that you would normally run on the terminal, and it also can be used within Python.

Terminal

After you install qme, you can run a task, for example, listing files in the present working directory.

$ qme run ls

you can then get the task via the command line:

$ qme get

or list all tasks for a particular executor (e.g., shell):

$ qme ls shell

or search across all metadata and task commands for a query of interest:


$ qme search moto
Database: sqlite
1  shell-de58f61b-81da-467c-981c-497f7ae8556b	2020-05-22 17:47:37	echo Hello Moto
2  shell-c231699a-4c3e-43f0-961f-2829d16d588c	2020-05-22 17:47:45	echo Hello Another Moto

or just list all tasks


$ qme ls
Database: sqlite
1  shell-38f2535a-e38f-4bc4-8667-9c43726b1e7e	ls
2  shell-de58f61b-81da-467c-981c-497f7ae8556b	echo Hello Moto
3  shell-c231699a-4c3e-43f0-961f-2829d16d588c	echo Hello Another Moto

and then get a specific task id (we will show the full output this time):


$ qme get shell-c231699a-4c3e-43f0-961f-2829d16d588c
Database: sqlite
{
    "executor": "shell",
    "uid": "shell-c231699a-4c3e-43f0-961f-2829d16d588c",
    "data": {
        "pwd": "/home/vanessa/Desktop/Code/qme",
        "user": "vanessa",
        "timestamp": "2020-05-22 11:47:45.282222",
        "output": [
            "Hello Another Moto\n"
        ],
        "error": [],
        "returncode": 0,
        "pid": 9533,
        "cmd": [
            "echo",
            "Hello",
            "Another",
            "Moto"
        ],
        "status": "complete"
    },
    "command": "echo Hello Another Moto"
}

You can also clear, meaning removing all tasks, tasks for a specific executor, or a task with a specific id:

# clear all tasks in the database
$ qme clear 

# clear tasks for the shell executor
$ qme clear shell

# delete a specific task based on taskid
$ qme clear shell-a561702d-404e-4fb2-be27-57496b32ac46

Each of the above asks for a confirmation from you first, whether you are issuing the command in the terminal or the interactive dashboard. You can also search, change your default configuration, or start an interactive interface:

$ qme search <query>
$ qme config --database sqlite
$ qme start

You can see a complete listing of commands here

Python

If you are developing or using a Python application, this means that you can interact with your same QueueMe, but from within Python! It works by creating your queue:


from qme.main import Queue
queue = Queue()

And then running a task

task = queue.run("sbatch --partition owners --time 00:00:10 run_job.sh")
[slurm-8e70abab-fe8b-43cb-b108-b1b1da725cac][returncode: 0]

Or get a previously run task (or without a taskid, the last run task):

task = queue.get()
<Task 'shell-c231699a-4c3e-43f0-961f-2829d16d588c'>

You can inspect metadata:


> task.load()
{'executor': 'slurm',
 'uid': 'slurm-8e70abab-fe8b-43cb-b108-b1b1da725cac',
 'data': {'pwd': '/home/users/vsochat',
  'user': 'vsochat',
  'timestamp': '2020-05-20 16:01:33.169770',
  'output': ['Submitted batch job 906448\n'],
  'error': [],
  'returncode': 0,
  'command': ['sbatch',
   '--partition',
   'owners',
   '--time',
   '00:00:10',
   'run_job.sh'],
  'status': 'complete',
  'pid': 127569},
 'command': 'sbatch --partition owners --time 00:00:10 run_job.sh'}

When you load a task, it includes higher level metadata like the taskid. If you just want the task-specific metadata, use task.export()


task.export()                                                                                                                         

{'pwd': '/home/vanessa/Desktop/Code/qme',
 'user': 'vanessa',
 'timestamp': '2020-05-22 11:47:45.282222',
 'output': ['Hello Another Moto\n'],
 'error': [],
 'returncode': 0,
 'pid': 9533,
 'cmd': ['echo', 'Hello', 'Another', 'Moto'],
 'status': 'complete'}

or use any of the same functions that are exposed to the command line client.

# delete all or a subset of tasks
queue.clear()          
queue.clear("shell")
queue.clear("shell-123456")

queue.list()
queue.list("shell")

queue.search("moto")

# re-run the last task, or a named taskid
queue.rerun() 
queue.rerun("shell-123456")

# get the last task, or a named taskid
queue.get() 
queue.get("shell-123456")

This means that you can integrate saving tasks (and important metadata) into your scientific pipelines, even if you don’t need the full dashboard. Note that you can choose your database backend, including traditional relational, sqlite (recommended) and a flat filesystem.

What is an executor?

An executor is a specific parser for a command, which is determined based on regular expressions to match the command. For example, a “datalad” parser might match any command given to qme run that starts with “datalad” and a “slurm” parser might match anything that starts with sbatch The parser can then further parse the specific command. Along with parsing the command, the executor can then:

  • capture specific metadata important to know (e.g., pwd, output, error, return code, username)
  • further check the command for correctness, and tell the user how to improve or fix it if needed.
  • define custom actions to run for the command (e.g., a slurm executor exposes a status function to the user)
  • define a custom interface for displaying the actions and metadata parsed.

Executors can be created for general command line tools, serving as a wrapper:

$ qme run qsub myscript.sh
$ qme run ls

or even created for custom use cases that don’t require a command line executable at all! For example, we might define a MadLibs executor that takes an input file with a list of words, and generates a random MadLib for the user.

$ qme run madlib mywords.txt

That’s the cool part about qme - there is huge freedom in defining what an executor is, what an executor can do, and what user interface is exposed for the results.

What do I get with the dashboard?

The interactive session means opening up a web interface that shows an interactive table that updates automatically with changed or new tasks via web sockets (cool!) This is a quick shot of the dashboard:

and a view for a specific task, in this case, a shell command that we ran.

The dashboard also exposes a complete set of API endpoints.

Example Use Cases?

QueueMe is intended to help you organize your many command line tasks, which means:

  • remembering the commands that you ran
  • being able to request actions (e.g., ping for a status)
  • being able to easily search or get metadata about a particular command

More specifically, qme provides a layer of reproducibility to your terminal usage, because instead of spitting out commands that you do not remember or doing a grep to search your linux history, you instead store the commands in a database. The commands are parsed to matched executors of interest (e.g. slurm would match srun and expose commands to interact with your submission) and if no executor is matched, it’s treated as a standard Shell command (shell capture standard output, error, and return codes).

A good example is the Docker Tutorial, which shows how to package QueueMe alongside some scientific code in a container, and then be able to save metadata for every task run, and display the tasks in your interactive dashboard. In fact, if you extend this to using a Singularity container such as with the Singularity example, you can have a completely seamless environment between your QueueMe dashboard and running tasks in containers or on the host.

Next Steps

I’ve written up a guide for How to create an executor and I’m looking for command-line or Python based applications that could warrant having a nice interface. For example, Yarik and I talked about Datalad and ReproNim as good contenders. I suspect other Python libraries with some kind of tasks to be monitoried or have metadata collected for would be useful too! Do you have an idea for an executor? Please let me know!

Where do I go from here?

A good place to start is the getting started page, which has links for getting started with writing tests, running tests, and many examples. Otherwise, check out: