watchme.watchers package¶
Subpackages¶
Submodules¶
watchme.watchers.data module¶
Copyright (C) 2019 Vanessa Sochat.
This Source Code Form is subject to the terms of the Mozilla Public License, v. 2.0. If a copy of the MPL was not distributed with this file, You can obtain one at http://mozilla.org/MPL/2.0/.
-
watchme.watchers.data.
export_dict
(self, task, filename, name=None, export_json=False, from_commit=None, to_commit=None, base=None)[source]¶ Export a data frame of changes for a filename over time.
- Parameters
task (the task folder for the watcher to look in)
name (the name of the watcher, defaults to the client’s)
base (the base of watchme to look for the task folder)
from_commit (the commit to start at)
to_commit (the commit to go to)
grep (the expression to match (not used if None))
filename (the filename to filter to. Includes all files if not specified.)
watchme.watchers.schedule module¶
-
watchme.watchers.schedule.
clear_schedule
(self)[source]¶ clear all cron jobs associated with the watcher. To remove jobs associated with a single watcher, use remove_schedule
-
watchme.watchers.schedule.
get_crontab
(self)[source]¶ get an instance of the user’s crontab. We use the running user.
-
watchme.watchers.schedule.
get_job
(self, must_exist=True)[source]¶ return the job to the user, or None
-
watchme.watchers.schedule.
has_schedule
(self, must_exist=False)[source]¶ determine if a watcher already has a schedule, as a warning to the user.
-
watchme.watchers.schedule.
remove_schedule
(self, name=None, quiet=False)[source]¶ remove a scheduled item from crontab, based on the watcher name. By default, we use the watcher instance name; however, you can specify a custom name if desired.
-
watchme.watchers.schedule.
schedule
(self, minute=12, hour=0, month='*', day='*', weekday='*', job=None, force=False)[source]¶ schedule the watcher to run at some frequency to update its record of pages. By default, the task will run at 12 minutes past midnight, daily. You can change the variables to change the frequency. See https://crontab.guru/ to get a setting that works for you.
Hourly: 0 * * * *
Daily: 0 0 * * * (midnight) default
Weekly: 0 0 * * 0
Monthly: 0 0 1 * *
Yearly: 0 0 1 1 *
- Parameters
minute (must be within 1 and 60, or set to “*” for every minute)
hour (must be within 0 through 23 or set to *)
month (must be within 1 and 12, or *)
day (must be between 1 and 31, or *)
weekday (must be between 0 and 6 or *)
job (if provided, assumes we are updating an existing entry.)
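The parameters above correspond to the five fields of a crontab entry. The following is a minimal sketch of that mapping, assuming the usual cron field order (minute, hour, day of month, month, day of week). The helper name `cron_expression` is hypothetical and is not part of watchme, and its range checks follow standard cron (minute 0 to 59) rather than the bounds quoted above.

```python
def cron_expression(minute=12, hour=0, month="*", day="*", weekday="*"):
    """Assemble a five-field cron expression (hypothetical helper, not watchme's)."""
    # Allowed range per field in standard cron; "*" always means "every".
    limits = {
        "minute": (minute, 0, 59),
        "hour": (hour, 0, 23),
        "day": (day, 1, 31),
        "month": (month, 1, 12),
        "weekday": (weekday, 0, 6),
    }
    for field, (value, low, high) in limits.items():
        if value == "*":
            continue
        if not low <= int(value) <= high:
            raise ValueError(f"{field} must be within {low} and {high}, or '*'")
    # Cron field order: minute, hour, day of month, month, day of week.
    return f"{minute} {hour} {day} {month} {weekday}"


print(cron_expression())                     # 12 0 * * *
print(cron_expression(minute=0, hour="*"))   # 0 * * * *
print(cron_expression(minute=0, weekday=0))  # 0 0 * * 0
```

With the documented defaults the sketch yields `12 0 * * *`, which matches the "12 minutes past midnight, daily" behavior described for schedule.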
-
watchme.watchers.schedule.
update_schedule
(self, minute=12, hour='*', month='*', day='*')[source]¶ update a scheduled item from the crontab, with a new entry. This first looks for the entry (and removes it) and then calls the new_schedule function to write a new one. This function is intended to be used by a client from within Python, and isn’t exposed from the command line.
watchme.watchers.settings module¶
-
watchme.watchers.settings.
get_section
(self, name)[source]¶ get a section from the config, if it exists
-
watchme.watchers.settings.
get_setting
(self, section, name, default=None)[source]¶ return a setting from the config, if defined. Otherwise return default (None or set by user)
- Parameters
section (the section in the config, defaults to self.name)
name (the key (index) of the setting to look up)
default ((optional) if not found, return default instead.)
-
watchme.watchers.settings.
has_section
(self, section)[source]¶ return a boolean if a config has a section (e.g., a task or exporter)
- Parameters
section (the section in the config)
-
watchme.watchers.settings.
has_setting
(self, section, name)[source]¶ return a boolean if a config has a setting (or not)
- Parameters
section (the section in the config, defaults to self.name)
name (the key (index) of the setting to look up)
-
watchme.watchers.settings.
print_add_task
(self, task)[source]¶ assemble a task section into a command that can create/add it.
- Parameters
task (the name of the task to inspect)
-
watchme.watchers.settings.
print_section
(self, section)[source]¶ print a section (usually a task) from a configuration file, if it exists.
- Parameters
section (the name of the section (task))
-
watchme.watchers.settings.
remove_section
(self, section, save=True)[source]¶ remove a section from the configuration file
- Parameters
section (the name of the section (task))
save (save the configuration file (default is True))
-
watchme.watchers.settings.
remove_setting
(self, section, name, save=True)[source]¶ remove a setting from the configuration file
- Parameters
section (the name of the section (task))
name (the name of the variable to remove)
save (save the configuration file (default is True))
-
watchme.watchers.settings.
set_setting
(self, section, key, value)[source]¶ set a key value pair in a section, if the section exists. Returns a boolean (True or False) to indicate if added.
- Parameters
section (the section in the config, defaults to self.name)
key (the key (index) of the setting to set)
value (the value to set.)
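The lookup-with-default and set-if-section-exists behavior described for the settings helpers can be sketched as follows. This assumes an INI-style configuration handled by Python's `configparser`, which the text above does not confirm. The standalone functions here take the parser as an argument and are illustrative stand-ins, not watchme's own implementations.

```python
import configparser


def get_setting(config, section, name, default=None):
    """Return the setting if both section and key exist, else the default."""
    if config.has_section(section) and config.has_option(section, name):
        return config.get(section, name)
    return default


def set_setting(config, section, key, value):
    """Set key=value only if the section exists; report whether it was set."""
    if not config.has_section(section):
        return False
    config.set(section, key, str(value))
    return True


# Illustrative configuration; the section and key names are assumptions.
config = configparser.ConfigParser()
config.read_string("""
[watcher]
active = false

[task-reddit-hpc]
url = https://www.reddit.com/r/hpc
type = urls
""")

print(get_setting(config, "watcher", "active"))          # false
print(get_setting(config, "watcher", "missing", "n/a"))  # n/a
print(set_setting(config, "task-unknown", "url", "x"))   # False
```

Returning a boolean from the setter, rather than raising, mirrors the documented contract that set_setting reports whether the value was added.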
Module contents¶
-
class
watchme.watchers.
Watcher
(name=None, base=None, create=False, **kwargs)[source]¶ Bases:
object
-
add_task
(task, task_type, params, force=False, active='true')[source]¶ add a task, after checking that the type is valid and that the parameters are valid for the task.
- Parameters
task (the Task object to add; should have a name and params and be a child of watchme.tasks.TaskBase)
task_type (must be in WATCHME_TASK_TYPES, meaning a client exists)
params (list of parameters to be validated (key@value))
force (if task already exists, overwrite)
active (add the task as active (default “true”))
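The params argument is described as a list of key@value strings. A minimal sketch of how such a list could be split into a dictionary is shown below. The function name `parse_params` and the `owner` parameter are hypothetical, and splitting on the first "@" only is an assumption made so that values may contain "@" themselves.

```python
def parse_params(params):
    """Split a list of "key@value" strings into a dictionary."""
    parsed = {}
    for param in params:
        if "@" not in param:
            raise ValueError(f"Expected key@value, got {param!r}")
        # Split on the first "@" only, so values may themselves contain "@".
        key, value = param.split("@", 1)
        parsed[key] = value
    return parsed


print(parse_params(["url@https://www.reddit.com/r/hpc", "owner@admin@example.org"]))
# {'url': 'https://www.reddit.com/r/hpc', 'owner': 'admin@example.org'}
```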
-
clear_schedule
()¶ clear all cron jobs associated with the watcher. To remove jobs associated with a single watcher, use remove_schedule
-
configfile
= None¶
-
deactivate
(task=None)[source]¶ turn the active status of a watcher to false. If a task is provided, update the config value for the task to be false.
-
edit_task
(name, action, key, value=None)[source]¶ edit a task, meaning an addition (add), update (update), or removal (remove) of a parameter. All actions other than remove require a value.
- Parameters
name (the name of the task to update)
action (the action to take (update, add, remove) a parameter)
key (the key to update)
value (the value to update)
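The three actions can be sketched over a plain dictionary of task parameters. This is an illustration only: the function name `edit_params` is hypothetical, and the choices that add refuses an existing key and update requires one are assumptions, not documented behavior. The rule that every action except remove needs a value comes from the description above.

```python
def edit_params(params, action, key, value=None):
    """Return a copy of params with one add, update, or remove applied."""
    if action not in ("add", "update", "remove"):
        raise ValueError(f"Unknown action: {action}")
    if action != "remove" and value is None:
        raise ValueError(f"The {action} action requires a value")
    updated = dict(params)  # leave the caller's dictionary untouched
    if action == "remove":
        updated.pop(key, None)
    elif action == "add" and key in updated:
        raise KeyError(f"{key} already exists; use update instead")
    elif action == "update" and key not in updated:
        raise KeyError(f"{key} does not exist; use add instead")
    else:
        updated[key] = value
    return updated


params = {"url": "https://www.reddit.com/r/hpc"}
print(edit_params(params, "add", "active", "true"))
print(edit_params(params, "update", "url", "https://example.org"))
print(edit_params(params, "remove", "url"))
```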
-
export_dict
(task, filename, name=None, export_json=False, from_commit=None, to_commit=None, base=None)¶ Export a data frame of changes for a filename over time.
- Parameters
task (the task folder for the watcher to look in)
name (the name of the watcher, defaults to the client’s)
base (the base of watchme to look for the task folder)
from_commit (the commit to start at)
to_commit (the commit to go to)
grep (the expression to match (not used if None))
filename (the filename to filter to. Includes all files if not specified.)
-
finish_runs
(results)[source]¶ finish runs should take a dictionary of results, with keys as the folder name, and for each, depending on the result type, write the result to file (or update file) and then commit to git.
- Parameters
results (a dictionary of tasks, with keys as the task name and values as the result)
-
freeze
()[source]¶ freeze a watcher, meaning that it along with its tasks cannot be deleted. This does not prevent the user from manual editing.
-
get_crontab
()¶ get an instance of the user’s crontab. We use the running user.
-
get_decorator
(name)[source]¶ instantiate a task object for a decorator. Decorators must start with “decorator-” and since they are run on the fly, we don’t find them in the config.
- Parameters
name (the name of the task to load)
-
get_job
(must_exist=True)¶ return the job to the user, or None
-
get_section
(name)¶ get a section from the config, if it exists
-
get_setting
(section, name, default=None)¶ return a setting from the config, if defined. Otherwise return default (None or set by user)
- Parameters
section (the section in the config, defaults to self.name)
name (the key (index) of the setting to look up)
default ((optional) if not found, return default instead.)
-
get_task
(name)[source]¶ get a particular task, based on the name. This is where each type of class should check the “type” parameter from the config, and import the correct Task class.
- Parameters
name (the name of the task to load)
-
get_tasks
(regexp=None, quiet=False, active=True)[source]¶ get the tasks for a watcher, possibly matching a regular expression. A list of dictionaries is returned, each holding the parameters for a task. “uri” will hold the task (folder) name, active
- Parameters
regexp (if supplied, the user wants to run only tasks that match a particular pattern)
quiet (If quiet, don’t print the number of tasks found)
active (only return active tasks (default True))
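The filtering described above can be sketched over a list of task dictionaries. The function name `filter_tasks` and the `task-archive` entry are hypothetical. Two assumptions are made here: the pattern is matched with `re.search` against the "uri" value, and a task counts as active when its "active" value is the string "true".

```python
import re


def filter_tasks(tasks, regexp=None, active=True):
    """Select task dictionaries by name pattern and, optionally, active status."""
    selected = []
    for task in tasks:
        # Skip tasks whose name does not match the pattern, if one is given.
        if regexp is not None and not re.search(regexp, task["uri"]):
            continue
        # When active is True, keep only tasks marked active.
        if active and task.get("active", "true") != "true":
            continue
        selected.append(task)
    return selected


tasks = [
    {"uri": "task-reddit-hpc", "url": "https://www.reddit.com/r/hpc",
     "active": "true", "type": "urls"},
    {"uri": "task-archive", "active": "false", "type": "urls"},
]

print([t["uri"] for t in filter_tasks(tasks, regexp="^task-reddit")])
# ['task-reddit-hpc']
print([t["uri"] for t in filter_tasks(tasks, active=False)])
# ['task-reddit-hpc', 'task-archive']
```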
-
has_schedule
(must_exist=False)¶ determine if a watcher already has a schedule, as a warning to the user.
-
has_section
(section)¶ return a boolean if a config has a section (e.g., a task or exporter)
- Parameters
section (the section in the config)
-
has_setting
(section, name)¶ return a boolean if a config has a setting (or not)
- Parameters
section (the section in the config, defaults to self.name)
name (the key (index) of the setting to look up)
-
inspect
(tasks=None, create_command=False)[source]¶ inspect a watcher, or one or more tasks belonging to it. This means printing the configuration for the entire watcher (if tasks is None) or just for one or more tasks.
- Parameters
tasks (one or more tasks to inspect (None will show entire file))
create_command (if True, given one or more tasks, print the command to create them)
-
is_active
(task=None)[source]¶ determine if the watcher is active by reading from the config directly. If a task name is provided, check the active status of the task instead.
-
is_frozen
()[source]¶ return a boolean to indicate if the watcher is frozen. protected indicates no delete to the watcher, but allowed delete to tasks, frozen indicates no change of anything.
-
is_protected
()[source]¶ return a boolean to indicate if the watcher is protected or frozen. protected indicates no delete to the watcher, but allowed delete to tasks, frozen indicates no change of anything.
-
load_config
()[source]¶ load a configuration file, and set the active setting for the watcher. If the file doesn’t exist, the function will exit and prompt the user to create the watcher first. If the watcher section isn’t yet defined, it will be written with a default active status set to false.
-
print_add_task
(task)¶ assemble a task section into a command that can create/add it.
- Parameters
task (the name of the task to inspect)
-
print_section
(section)¶ print a section (usually a task) from a configuration file, if it exists.
- Parameters
section (the name of the section (task))
-
protect
(status='on')[source]¶ protect a watcher, meaning that it cannot be deleted. This does not influence removing a task. To freeze the entire watcher, use the freeze() function.
-
remove_schedule
(name=None, quiet=False)¶ remove a scheduled item from crontab, based on the watcher name. By default, we use the watcher instance name; however, you can specify a custom name if desired.
-
remove_section
(section, save=True)¶ remove a section from the configuration file
- Parameters
section (the name of the section (task))
save (save the configuration file (default is True))
-
remove_setting
(section, name, save=True)¶ remove a setting from the configuration file
- Parameters
section (the name of the section (task))
name (the name of the variable to remove)
save (save the configuration file (default is True))
-
remove_task
(task)[source]¶ remove a task from the watcher repo, if it exists, and the watcher is not frozen.
- Parameters
task (the name of the task to remove)
-
repo
= None¶
-
run
(regexp=None, parallel=True, test=False, show_progress=True)[source]¶ run the watcher, which should be done via the crontab, including:
- checks: the instantiation of the client already ensures that the watcher folder exists, has a configuration, and that it loads.
- parse: parse the tasks to be run
- start: run the tasks that are defined for the watcher.
- finish: after completion, commit to the repository changed files
- Parameters
regexp (if supplied, the user wants to run only tasks that match a particular pattern)
parallel (if True, use multiprocessing to run tasks (default True); each watcher should have this setup ready to go)
test (run in test mode (no saving of results))
show_progress (if True, show progress bar instead of task information (defaults to True))
-
run_tasks
(queue, parallel=True, show_progress=True)[source]¶ this run_tasks function takes a list of Task objects, each potentially a different kind of task, and extracts the parameters with task.export_params(), and the running function with task.export_func(), and hands these over to the multiprocessing worker. It’s up to the Task to return some correct function from its set of task functions that corresponds with the variables.
Examples
funcs {'task-reddit-hpc': <function watchme.watchers.urls.tasks.get_task>}
tasks {'task-reddit-hpc': [('url', 'https://www.reddit.com/r/hpc'), ('active', 'true'), ('type', 'urls')]}
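The pairing of the two dictionaries in the example can be sketched without multiprocessing. Each task name selects a function from funcs and a parameter list from tasks. This is a serial illustration only: `get_page_length` is a hypothetical stand-in for a real task function, passing the parameters as keyword arguments is an assumption, and the parallel worker is omitted.

```python
def get_page_length(url, **kwargs):
    """Hypothetical task function; returns a value without any network access."""
    return len(url)


# Same shape as the example above: one function and one parameter list per task.
funcs = {"task-reddit-hpc": get_page_length}
tasks = {
    "task-reddit-hpc": [
        ("url", "https://www.reddit.com/r/hpc"),
        ("active", "true"),
        ("type", "urls"),
    ]
}


def run_serially(funcs, tasks):
    """Call each task's function with its parameters; collect results by task name."""
    results = {}
    for name, params in tasks.items():
        # Parameters arrive as (key, value) tuples, so dict() turns them into kwargs.
        results[name] = funcs[name](**dict(params))
    return results


results = run_serially(funcs, tasks)
print(results)
```

The returned dictionary is keyed by task name, which is the shape that finish_runs is documented to accept.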
-
schedule
(minute=12, hour=0, month='*', day='*', weekday='*', job=None, force=False)¶ schedule the watcher to run at some frequency to update its record of pages. By default, the task will run at 12 minutes past midnight, daily. You can change the variables to change the frequency. See https://crontab.guru/ to get a setting that works for you.
Hourly: 0 * * * *
Daily: 0 0 * * * (midnight) default
Weekly: 0 0 * * 0
Monthly: 0 0 1 * *
Yearly: 0 0 1 1 *
- Parameters
minute (must be within 1 and 60, or set to “*” for every minute)
hour (must be within 0 through 23 or set to *)
month (must be within 1 and 12, or *)
day (must be between 1 and 31, or *)
weekday (must be between 0 and 6 or *)
job (if provided, assumes we are updating an existing entry.)
-
set_setting
(section, key, value)¶ set a key value pair in a section, if the section exists. Returns a boolean (True or False) to indicate if added.
- Parameters
section (the section in the config, defaults to self.name)
key (the key (index) of the setting to set)
value (the value to set.)
-
unfreeze
()[source]¶ unfreeze a watcher, reversing freeze(), so that it and its tasks can be deleted again.
-
update_schedule
(minute=12, hour='*', month='*', day='*')¶ update a scheduled item from the crontab, with a new entry. This first looks for the entry (and removes it) and then calls the new_schedule function to write a new one. This function is intended to be used by a client from within Python, and isn’t exposed from the command line.
-