watchme

The psutils watcher is one of @vsoch’s favorites. It helps you monitor general system resources (memory, networks, sensors, cpu, users) along with the basic Python environment. If your Python installation doesn’t have the psutil module, install it as follows:

pip install watchme[watcher-psutils]

Next, create a watcher for your tasks to live under:

$ watchme create system

Now let’s walk through each of the tasks. If you aren’t familiar with how to add a task and otherwise manage them, see the getting started docs. Here are the functions exposed by psutils:

Add a Task

Remember that we are going to be adding tasks to our watcher, “system” above. The general format to add a task looks like this:

$ watchme add-task <watcher> <task-name> key1@value1 key2@value2

The key and value pairs are going to vary based on the watcher task.
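To make the key@value convention concrete, here is a minimal Python sketch of how such pairs could be split into a parameter dictionary. The helper name parse_params is hypothetical and is not part of watchme; it only illustrates the format of the arguments.

```python
def parse_params(pairs):
    """Split "key@value" strings into a dict (hypothetical helper, not watchme's API)."""
    params = {}
    for pair in pairs:
        # Split on the first "@" only, so a value may itself contain "@"
        key, sep, value = pair.partition("@")
        if not sep or not key:
            raise ValueError("expected key@value, got %r" % pair)
        params[key] = value
    return params


params = parse_params(["func@cpu_task", "skip@cpu_freq,cpu_percent"])
print(params)
```

Splitting on the first "@" only is a design choice made for this sketch, so that values such as email addresses would survive intact.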

Task Parameters

The psutil tasks don’t have any required arguments, but for each you can define a comma separated list of entries to skip.

Name | Required | Default | Example | Notes
skip | No | undefined | skip@ | varies by task (see below)

Also note that for all of the tasks below, you need to select the type of task with --type psutils.

Task Environment

For any task, if you have a variable exported to the environment that starts with WATCHMEENV_, it will be found and added to the result. For example, WATCHMEENV_result_id=123 will be saved to the results dictionary with key “result_id” set to “123.”
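The prefix rule can be sketched in a few lines of Python. This is an illustration of the behavior described above, not watchme’s actual code; the function name collect_env is an assumption made for the example.

```python
import os

PREFIX = "WATCHMEENV_"


def collect_env(environ=None, prefix=PREFIX):
    """Return {name_without_prefix: value} for variables that start with prefix."""
    environ = os.environ if environ is None else environ
    return {
        key[len(prefix):]: value
        for key, value in environ.items()
        # Ignore a bare prefix with no name after it
        if key.startswith(prefix) and len(key) > len(prefix)
    }


# A plain dict stands in for os.environ so the example is reproducible
extras = collect_env({"WATCHMEENV_result_id": "123", "HOME": "/home/user"})
print(extras)
```

Calling collect_env() with no arguments reads the real environment instead of the sample dictionary.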

Return Values

All of the return values here will be dictionaries, meaning they are appropriate to export to json. You shouldn’t need to define a file_name parameter, as it will be set automatically to your host plus username. However, you are free to set this parameter to change this value.

1. The Monitor Pid Task

This task is useful for monitoring a specific process. You can run it as a task (discussed first) or as a decorator on a function to enable continuous monitoring. Either way, you will want to create your watcher first:

$ watchme create system

Run on the Fly, Python

It’s most likely you want to run and monitor a command on the fly, either from within Python or the command line. First, let’s take a look at running from within Python. We are going to use a TerminalRunner to run the command:

from watchme.tasks.decorators import TerminalRunner
runner = TerminalRunner('sleep 5')
runner.run()
timepoints = runner.wait()

You’ll get a list of timepoints, collected at intervals of 3 seconds! Here is the top of the first:

[{'SECONDS': '3',
  'cmdline': ['sleep', '5'],
  'connections': [],
  'cpu_affinity': [0, 1, 2, 3],
  'cpu_num': 1,
  'cpu_percent': 0.0,
  'cpu_times': {'children_system': 0.0,
...
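The idea of collecting timepoints at a fixed interval can be sketched with the standard library alone. This is not how TerminalRunner is implemented; the function run_and_sample is hypothetical, and it records only the pid and command rather than the psutil metrics shown in the output above.

```python
import subprocess
import sys


def run_and_sample(cmd, seconds=3):
    """Run cmd and record one sample per interval until it exits.

    Hypothetical stand-in: a real monitor would collect process metrics
    here instead of just the pid and command line.
    """
    proc = subprocess.Popen(cmd)
    timepoints = []
    while True:
        timepoints.append(
            {"SECONDS": str(seconds), "pid": proc.pid, "cmdline": list(cmd)}
        )
        try:
            # Wait up to one interval; returns as soon as the process exits
            proc.wait(timeout=seconds)
            break
        except subprocess.TimeoutExpired:
            continue  # still running, so take another sample
    return timepoints


# A short-lived child process, sampled every 0.1 seconds
cmd = [sys.executable, "-c", "import time; time.sleep(0.3)"]
timepoints = run_and_sample(cmd, seconds=0.1)
print(len(timepoints), "timepoints collected")
```

The number of timepoints depends on how long the command runs relative to the interval, which is why a short command sampled every 3 seconds yields only one or two entries.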

Do you need to add data to the structure? Just export it with prefix WATCHMEENV_*

import os
os.environ['WATCHMEENV_AVOCADO'] = '5'

and it will appear in the result!

 {'AVOCADO': '5',
  'SECONDS': '3',
  'cmdline': ['sleep', '5'],
  'connections': [],
  'cpu_affinity': [0, 1, 2, 3],
  'cpu_num': 3,
...

If you want to skip fields, include fields, or use “only” a subset of the keys returned in the default result, you can pass lists for only, include, and skip to the TerminalRunner directly (optional).

Run on the Fly, Command Line

If you choose, the same function can be run via the watchme command line client. If you provide no additional arguments, it will print the data structure to the screen:

$ watchme monitor sleep 2
[{"cpu_times": {"user": 0.0, "system": 0.0, "children_user": 0.0, "children_system": 0.0}, "ppid": 14676, "open_files": 0, "terminal": "/dev/pts/1", "cpu_percent": 0.0, "ionice": {"ioclass": "IOPRIO_CLASS_NONE", "value": 4}, "uids": {"real": 1000, "effetive": 1000, "saved": 1000}, "cmdline": ["sleep", "2"], "memory_full_info": {"rss": 872448, "vms": 7651328, "shared": 802816, "text": 28672, "lib": 0, "data": 319488, "dirty": 0, "uss": 118784, "pss": 135168, "swap": 0}, "gids": {"real": 1000, "effetive": 1000, "saved": 1000}, "connections": [], "pid": 14680, "num_fds": 3, "io_counters": {"read_count": 6, "write_count": 0, "read_bytes": 0, "write_bytes": 0, "read_chars": 1948, "write_chars": 0}, "exe": "/bin/sleep", "nice": 0, "status": "sleeping", "create_time": 1559848190.51, "name": "sleep", "cwd": "/home/vanessa/Desktop/watchme", "username": "vanessa", "cpu_num": 2, "num_threads": 1, "num_ctx_switches": {"voluntary": 1, "involuntary": 0}, "cpu_affinity": [0, 1, 2, 3], "memory_percent": 0.004172553297022325, "SECONDS": "3"}]

Want to add an environment variable? The same applies - you can export WATCHMEENV_* and they will be added to results.

$ export WATCHMEENV_AVOCADOS=3
$ watchme monitor sleep 2
[{"cpu_times": {"user": 0.0, "system": 0.0, "children_user": 0.0, "children_system": 0.0}, "ppid": 14676, "open_files": 0, "terminal": "/dev/pts/1", "cpu_percent": 0.0, "ionice": {"ioclass": "IOPRIO_CLASS_NONE", "value": 4}, "uids": {"real": 1000, "effetive": 1000, "saved": 1000}, "cmdline": ["sleep", "2"], "memory_full_info": {"rss": 872448, "vms": 7651328, "shared": 802816, "text": 28672, "lib": 0, "data": 319488, "dirty": 0, "uss": 118784, "pss": 135168, "swap": 0}, "gids": {"real": 1000, "effetive": 1000, "saved": 1000}, "connections": [], "pid": 14680, "num_fds": 3, "io_counters": {"read_count": 6, "write_count": 0, "read_bytes": 0, "write_bytes": 0, "read_chars": 1948, "write_chars": 0}, "exe": "/bin/sleep", "nice": 0, "status": "sleeping", "create_time": 1559848190.51, "name": "sleep", "cwd": "/home/vanessa/Desktop/watchme", "username": "vanessa", "cpu_num": 2, "num_threads": 1, "num_ctx_switches": {"voluntary": 1, "involuntary": 0}, "cpu_affinity": [0, 1, 2, 3], "memory_percent": 0.004172553297022325, "avocados": "3", "SECONDS": "3"}]

If you want to save to a watcher, then provide the watcher name as the first argument. For example, here we run a task on the fly, and save the result to the watcher “decorator.” Since we don’t provide a --name argument, the name defaults to a derivation of the command run.

$ watchme monitor decorator sleep 2
$ WATCHMEENV_AVOCADOS=3 watchme monitor decorator sleep 2

List the folders in the watcher named “decorator” to see the newly added result:

$ watchme list decorator
watcher: /home/vanessa/.watchme/decorator
  task-monitor-slack
  decorator-psutils-sleep-2
  .git
  decorator-psutils-noenv
  decorator-psutils-myfunc
  watchme.cfg

And then use export to export the data!

$ watchme export decorator decorator-psutils-sleep-2 result.json --json
git log --all --oneline --pretty=tformat:"%H" --grep "ADD results" 7a7cb5535c96e06433af9c47485ba253137e580f..b8f155c66819a646405cd710eca150396118fe7c -- decorator-psutils-sleep-2/result.json
{
    "commits": [
        "b8f155c66819a646405cd710eca150396118fe7c"
    ],
    "dates": [
        "2019-05-17 16:47:19 -0400"
    ],
    "content": [
        {
            "memory_full_info": {
                "rss": 688128,
                "vms": 7467008,
                "shared": 614400,
                "text": 28672,
                "lib": 0,
                "data": 323584,
...

For both of the commands above, you can define --name to give a custom name, or --seconds to set the interval at which to collect metrics (default is 3).

$ watchme monitor sleep 2 --seconds 1

As with the interactive Python version, you can optionally specify a comma separated string of keys to include, skip, or only use. Here we skip two fields:

$ watchme monitor sleep 2 --skip memory_full_info,cmdline

And here we only care about the command line:

$ watchme monitor sleep 2 --only cmdline
[{'cmdline': ['sleep', '2'], 'SECONDS': '3'}]

Notice that your custom environment variables are not skipped; skip and only apply just to the keys in the default result.
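The filtering behavior can be pictured with a small Python sketch. The function filter_result and its protected argument are hypothetical names chosen for illustration, and the sketch assumes that the SECONDS key and custom environment keys always survive filtering, as the output above suggests.

```python
def filter_result(result, skip=None, only=None, protected=("SECONDS",)):
    """Filter metric keys; keys listed in protected are always kept."""
    skip = set(skip or [])
    only = set(only or [])
    filtered = {}
    for key, value in result.items():
        if key in protected:
            filtered[key] = value  # never filtered out
        elif only:
            # When "only" is given, keep just the requested keys
            if key in only:
                filtered[key] = value
        elif key not in skip:
            filtered[key] = value
    return filtered


# Illustrative result with one custom environment key
result = {"cmdline": ["sleep", "2"], "nice": 0, "AVOCADOS": "3", "SECONDS": "3"}
keep = ("SECONDS", "AVOCADOS")
print(filter_result(result, only=["cmdline"], protected=keep))
print(filter_result(result, skip=["cmdline"], protected=keep))
```

In this sketch only takes precedence over skip when both are given; that ordering is an assumption, not documented watchme behavior.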

And there you have it! With these methods to monitor any process on the fly at a particular interval, you are good to go!

Run as a Task

To run as a task, you will want to provide func@monitor_pid_task when you create the task. You are also required to add a pid, which can be a number or the name of the process. Either way, you would likely be running this for a long running process, and if you provide an integer, you would need to update the pid of the process when you restart your computer. Here is an example adding a process name I know will be running when my computer starts:

$ watchme add-task system task-monitor-slack --type psutils func@monitor_pid_task pid@slack
[task-monitor-slack]
func  = monitor_pid_task
pid  = slack
active  = true
type  = psutils

If you choose a process name, remember that different processes can have the same name (for example, think about the case of having multiple Python interpreters open!). This means that watchme will select the first in the list that it finds. If you have a preference for a specific one, then it’s recommended that you provide the process id directly.
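The “first match wins” rule for process names can be illustrated with a short Python sketch. The function find_pid and the sample process table are hypothetical; this is not watchme’s lookup code.

```python
def find_pid(processes, pid_or_name):
    """Resolve a pid (int or digit string) or a process name to a pid.

    processes is an iterable of {"pid": int, "name": str} records.
    For a name, the first matching record wins.
    """
    text = str(pid_or_name)
    if text.isdigit():
        return int(text)
    for proc in processes:
        if proc["name"] == text:
            return proc["pid"]
    return None


# Illustrative process table with two interpreters sharing a name
processes = [
    {"pid": 101, "name": "python"},
    {"pid": 202, "name": "slack"},
    {"pid": 303, "name": "python"},
]
print(find_pid(processes, "python"))
print(find_pid(processes, "303"))
```

With psutil installed, similar records could be built from psutil.process_iter(["pid", "name"]) by reading the info dictionary of each process.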

Customize

You can also add a parameter called “skip”, including one or more (comma separated) keys in the results to skip.

$ watchme add-task system task-monitor-slack --type psutils func@monitor_pid_task pid@slack skip@cpu_freq,cpu_percent

By default, you’ll get a lot of data! You should skip the fields you don’t need.

We purposefully don’t include the environment (environ) because it tends to not change, and more importantly, we want to not expose sensitive information. If you want to add this back in, you can do that:

$ watchme add-task system task-monitor-slack --type psutils func@monitor_pid_task pid@slack include@environ

We don’t show threads because it would make the data too big, but we do include num_threads.

To test out the task, you can do the following:

$ watchme run system task-monitor-slack --test

You’ll see the results.json print to the screen! When it’s time to use the watcher, you can activate and schedule it.

Use as a Decorator

Although the parameters are not stored in the watchme.cfg, if you use the decorator to run the same task, the .git repository is still used as a database, and you can collect data to monitor your different Python functions on the fly. Specifically, the decorator is going to use multiprocessing to run your function, and then watch it via the process id (pid). You get to choose how often (in seconds) you want to record metrics like memory, io counters, cpu, and threads. See here for an example of default output for one timepoint. This decorator uses the same task function that we discussed first, but with a different invocation.

from watchme.tasks.decorators import monitor_resources
from time import sleep

@monitor_resources('system', seconds=3)
def myfunc():
    long_list = []
    for i in range(100):
        long_list = long_list + (i*10)*['pancakes']
        print("i is %s, sleeping 10 seconds" % i)
        sleep(10)

The first argument is the name of the watcher (e.g., system). You are also allowed to specify optional arguments, such as seconds (shown above) and name (described below).

Why don’t you need to specify a pid? Your running function will produce the process id, so you don’t need to provide it. Let’s run this script. You can get the full thing from the gist here. You’ll notice in the gist example we demonstrate “myfunc” taking arguments too.

$ python test_psutils_decorator.py
Calling myfunc with 2 iters
Generating a long list, pause is 2 and iters is 2
i is 0, sleeping 2 seconds
i is 1, sleeping 2 seconds
Result list has length 10

Great! So it ran for the watcher we created called system, but where are the results? Let’s take a look in our watcher folder:

~/.watchme/system$ tree
.
├── decorator-psutils-myfunc
│   ├── result.json
│   └── TIMESTAMP
├── task-monitor-slack
│   ├── result.json
│   └── TIMESTAMP
└── watchme.cfg

2 directories, 5 files

In addition to the task that we ran, “task-monitor-slack,” we also have results in a new “decorator-psutils-myfunc” folder. You’ve guessed it right - the decorator namespace creates folders of the format decorator-psutils-<name>, where name is the name of the function, or a “name” parameter you provide to the decorator.

What is a result?

Remember that we are monitoring our function every 3 seconds, so for a function that lasts about 10 seconds, we will record process metrics three times. How would we export that data? Like this:

$ watchme export system decorator-psutils-myfunc result.json --json

We ask for --json because we know the result is provided in json. For the above export, we will find three commits, each with a commit id, timestamp, and full result:

git log --all --oneline --pretty=tformat:"%H" --grep "ADD results" 7a7cb5535c96e06433af9c47485ba253137e580f..03b793dfe708c310f32526041775ec38449ccd54 -- decorator-psutils-myfunc/result.json
{
    "commits": [
        "03b793dfe708c310f32526041775ec38449ccd54",
        "71012b7f2b5d247318b2dcf187ee2c823ad7ef63",
        "e1d06f86eac18cc6d54d3c8a62aeede7f8b85bac"
    ],
    "dates": [
        "2019-05-11 12:31:49 -0400",
        "2019-05-11 12:31:20 -0400",
        "2019-05-11 12:28:45 -0400"
    ],
    "content": [
        {
            "cpu_percent": 0.0,
            "cpu_num": 3,
...

And each entry coincides with one collection of data during the task run. You can plot different metrics over time to see the change in the process resource usage. If you are interested in what a (default) output will look like, see the gist here.
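As a starting point for plotting, here is a minimal sketch that turns an export of this shape into a time series for one metric. The function metric_series is a hypothetical helper, and the metric values in the sample are made up for illustration; only the commits/dates/content layout is taken from the export above.

```python
from datetime import datetime


def metric_series(export, metric):
    """Return (datetime, value) pairs for one top-level metric, oldest first."""
    series = []
    for date, entry in zip(export["dates"], export["content"]):
        if metric in entry:
            when = datetime.strptime(date, "%Y-%m-%d %H:%M:%S %z")
            series.append((when, entry[metric]))
    return sorted(series, key=lambda pair: pair[0])


# Same layout as the export above; metric values are illustrative only
export = {
    "commits": ["c3", "c2", "c1"],
    "dates": [
        "2019-05-11 12:31:49 -0400",
        "2019-05-11 12:31:20 -0400",
        "2019-05-11 12:28:45 -0400",
    ],
    "content": [{"num_threads": 3}, {"num_threads": 2}, {"num_threads": 1}],
}
for when, value in metric_series(export, "num_threads"):
    print(when.isoformat(), value)
```

The export lists the newest commit first, so the sketch sorts by date before returning; the resulting pairs can be handed to any plotting library.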

2. The CPU Task

This task will report basic cpu parameters. You add it by selecting the parameter func@cpu_task:

$ watchme add-task system task-cpu --type psutils func@cpu_task
[watcher|system]
[task-cpu]
func  = cpu_task
active  = true
type  = psutils

In the above, we added the task “task-cpu” to the watcher called “system.” If you don’t define file_name, the result will be saved as a json file in the task folder, named according to your user and host. You can remove any of the retrieved attributes with the comma separated list “skip.”

For example, to skip cpu_freq and cpu_percent, you would add the task like this:

$ watchme add-task system task-cpu --type psutils func@cpu_task skip@cpu_freq,cpu_percent

For more information on the psutil functions for cpu, see here.

3. The Memory Task

This task will report stats on virtual memory. You add it by selecting the parameter func@memory_task:

$ watchme add-task system task-memory --type psutils func@memory_task
[task-memory]
func  = memory_task
active  = true
type  = psutils

You don’t need to use “skip” for this task because there is only one entry returned in the dictionary (“virtual_memory”), so if you don’t want this information, just don’t run the task.

For more information on the psutil functions for memory, see here.

4. The Networking Task

This task will report a bunch of network statistics. This is one that you might want to be careful with, since it exports a lot of your host networking information that might be sensitive if freely available. I would read about the different parameters here first. You add it by selecting the parameter func@net_task. In the example below, I’m going to skip “net_if_address” and “net_connections”:

$ watchme add-task system task-network --type psutils func@net_task skip@net_connections,net_if_address
[task-network]
func  = net_task
skip  = net_connections,net_if_address
active  = true
type  = psutils

All the possible attributes you can get are:

If you want to see what is exported, just try importing the task into a python terminal and running it:

from watchme.watchers.psutils.tasks import net_task
net_task()

For more information on the psutil functions for networking, see here.

5. The Python Task

This task will report information about install location, modules, and version of the Python running the task. To set up this task:

$ watchme add-task system task-python --type psutils func@python_task
[task-python]
func  = python_task
active  = true
type  = psutils

All the possible attributes you can get (and exclude with the skip parameter) are:

Most of these functions fall under psutil.os and psutil.os.sys. There are quite a few, so if you think something is missing please open an issue.

6. The Sensors Task

This is one of the coolest! You can actually use psutil to get information on your system fans, temperature, and even battery. Set up the task like this:

$ watchme add-task system task-sensors --type psutils func@sensors_task
[task-sensors]
func  = sensors_task
active  = true
type  = psutils

All the possible attributes you can get (and exclude with the skip parameter) are:

See the sensors documentation for details on what is included.

7. The System Task

The system task will provide (basic) system information. This is another task that if you see something missing (that you think should be there) you should open an issue. Here is how to add the task:

$ watchme add-task system task-system --type psutils func@system_task
[task-system]
func  = system_task
active  = true
type  = psutils

All the possible attributes you can get (and exclude with the skip parameter) are:

See the system documentation for details on what is included. A lot of these params are taken from psutil.sys.

8. The Users Task

The last is the users task, which will export active users on the system. It’s likely that this task result won’t change over time.

$ watchme add-task system task-users --type psutils func@users_task
[task-users]
func  = users_task
active  = true
type  = psutils

There is only one entry in the result (users) so if you don’t want this, just don’t run the task. users is a function under the system documentation.

Verify the Addition

Once you add one or more tasks, you can inspect your watcher configuration file at $HOME/.watchme/system/watchme.cfg:

$ cat $HOME/.watchme/system/watchme.cfg

You can also use inspect:

$ watchme inspect system
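If you would rather check the configuration programmatically, the following sketch reads task sections with Python’s configparser. It assumes the file on disk uses the same INI layout that add-task prints; the sample text is assembled from the outputs shown on this page, and list_tasks is a hypothetical helper.

```python
import configparser

# Sample text assembled from the add-task output shown earlier on this page
SAMPLE_CFG = """
[task-cpu]
func  = cpu_task
active  = true
type  = psutils

[task-network]
func  = net_task
skip  = net_connections,net_if_address
active  = true
type  = psutils
"""


def list_tasks(cfg_text):
    """Map each task-* section name to its settings dictionary."""
    parser = configparser.ConfigParser()
    parser.read_string(cfg_text)
    return {
        name: dict(parser[name])
        for name in parser.sections()
        if name.startswith("task-")
    }


tasks = list_tasks(SAMPLE_CFG)
for name, settings in tasks.items():
    print(name, settings["func"], settings.get("skip", ""))
```

To read a real watcher, you could pass the contents of $HOME/.watchme/system/watchme.cfg to list_tasks instead of the sample string.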

At this point, you can test running the watcher with the --test flag for run:

$ watchme run system --test

and when you are ready, activate the entire watcher to run:

$ watchme activate system
$ watchme run system

And don’t forget to set a schedule to automate it:

$ watchme schedule system @hourly
$ crontab -l
@hourly /home/vanessa/anaconda3/bin/watchme run system # watchme-system

Every time your task runs, new files will be added (or old files will be updated), and you can choose to push the entire watcher folder to GitHub to have reproducible monitoring! Since the result files are named based on your host and username, others could fork the repo, run on their system, and pull request to combine data.

When you are ready, read more about exporting data.