User Guide
Pipelib is a library for creating pipelines. You can parse, compare, and order iterables. With Pipelib you can:
Create a custom pipeline to parse and compare version strings.
Use a collection of provided sorting functions for custom sorts.
Assemble different processing blocks to pre-process inputs first.
If you have a question, find a bug, or want to request a feature, please let us know! This is an open source project and we are eager for your contributions. 🎉️
Concepts
Pipelib has a few concepts:
a pipeline is a collection of steps that take, as input, a list of items and return a parsed and filtered list
a step is some action in a pipeline. There are different kinds of steps:
- filter steps are boolean steps, meaning functions that return True/False to indicate if the item should be kept.
- transform steps take the initial input and return a different version. If the resulting item is empty or None, it is not included.
- sort steps perform some kind of specialized sort or ordering, usually expecting a list with something sortable.
- custom steps can perform any kind of operation (or more than one), for example a step to filter and sort container tags.
a wrapper is exactly that: an internal class that wraps an item. Wrappers are used inside steps and allow for things like sorting and comparison. You probably don’t need to worry about wrappers unless you want to develop for pipelib.
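To make the wrapper idea concrete, here is a minimal sketch of a class that wraps a version string so it can be compared and sorted. The name VersionWrapper and the dotted-integer format are assumptions for illustration; this is not pipelib's own wrapper class.

```python
from functools import total_ordering


@total_ordering
class VersionWrapper:
    """Hypothetical wrapper: keeps the original string plus a sortable key."""

    def __init__(self, item):
        self.item = item
        # Assumption: versions look like "1.2.10" (dotted integers only)
        self.key = tuple(int(part) for part in item.split("."))

    def __eq__(self, other):
        return self.key == other.key

    def __lt__(self, other):
        return self.key < other.key

    def __str__(self):
        return self.item


versions = ["1.10.0", "1.2.0", "1.9.3"]
# Sorting uses the numeric key, so 1.10.0 sorts after 1.9.3
ordered = [str(w) for w in sorted(VersionWrapper(v) for v in versions)]
print(ordered)  # ['1.2.0', '1.9.3', '1.10.0']
```

A plain string sort would place '1.10.0' first, which is why a step that orders versions benefits from wrapping each item.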
Pipelines are composable, meaning that you can insert an entire pipeline into another pipeline as a step.
Usage
Once you have pipelib installed (Installation), you can assemble steps into a pipeline and run it over a list of items.
A Simple Example
Here is a simple example to process and filter a list of strings:
import pipelib.steps as step
import pipelib.pipeline as pipeline
# A pipeline to process a list of strings
steps = (
    # convert everything to lowercase
    step.transform.ToLowercase(),
    # don't include anything with "two"
    ~step.filters.HasPatterns(filters=["two"])
)
# Strings to process
items = ['item-ONE', 'item-TWO', 'item-two-THREE']
p = pipeline.Pipeline(steps)
# The updated and transformed items
updated = p.run(items)
# ['item-one']
The example above uses the ~ symbol, which you can always use to reverse the functionality of a boolean step. For example, step.filters.HasMinLength() returns True when an item has at least the minimum length you've provided, and the item is kept for further processing in the pipeline. In contrast, ~step.filters.HasMinLength() does the opposite: it drops the items that meet the minimum length and keeps those that do not.
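One way an inversion like this can be implemented in Python is with the `__invert__` special method. The sketch below is a standalone illustration; the BooleanStep class and has_min_length helper here are hypothetical stand-ins and are not taken from pipelib's source.

```python
class BooleanStep:
    """Hypothetical stand-in for a boolean (filter) step."""

    def __init__(self, check):
        self.check = check  # callable: item -> bool

    def __call__(self, item):
        return bool(self.check(item))

    def __invert__(self):
        # ~step returns a new step that gives the opposite answer
        return BooleanStep(lambda item: not self.check(item))


def has_min_length(length):
    """Hypothetical helper: keep items with at least `length` characters."""
    return BooleanStep(lambda item: len(item) >= length)


items = ["abc", "abcdefgh"]
keep_long = has_min_length(5)
keep_short = ~has_min_length(5)

print([i for i in items if keep_long(i)])   # ['abcdefgh']
print([i for i in items if keep_short(i)])  # ['abc']
```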
Pipeline Logic
Steps are composable, meaning that you can chain them together into logical statements. As an example, let’s say part of my processing needs to determine if a string has a commit reference, where generally I want to check that:
the length is >= 10
it is not all letters
It wouldn’t work to check these separately (as their own steps) because I want them grouped together as one condition, e.g.,
> Keep if the length is >= 10 AND it is not all letters
We can thus compose steps into this logic as follows:
import pipelib.steps as step
import pipelib.pipeline as pipeline
# We want to keep those length >= 10 and not all letters
tags = [
    '0.9.24--ha87ae23_0',
    '0.9.19--1',
    '0.9.14--1',
    'ishouldberemoved',
    '0.9.10--hdbcaa40_3']
# A pipeline to process docker tags
steps = (
    # Example of chaining steps together
    step.filters.HasMinLength(length=10) & ~step.filters.HasAllLetters(),
)
p = pipeline.Pipeline(steps)
# The updated and transformed items
updated = p.run(tags)
# ['0.9.24--ha87ae23_0', '0.9.10--hdbcaa40_3']
As expected, the items returned above have length >= 10 and aren’t all letters! Technically, the pipeline above only has one step, which is generated with our custom logic. Note that for this to work, you need to chain together steps of the same type. All of the above are of class BooleanStep, so they return a True or False that can be combined (&), and an outcome that we can take the inverse of (~).
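To see why combining requires steps of the same boolean type, here is a small standalone sketch that overloads `&` and `~` on a hypothetical Check class (an assumption for illustration, not pipelib's BooleanStep). It applies the same condition to the tags from the example.

```python
class Check:
    """Hypothetical boolean step that supports & and ~."""

    def __init__(self, func):
        self.func = func  # callable: item -> bool

    def __call__(self, item):
        return self.func(item)

    def __and__(self, other):
        # The combined step keeps an item only if both checks pass
        return Check(lambda item: self(item) and other(item))

    def __invert__(self):
        return Check(lambda item: not self(item))


has_min_length = Check(lambda item: len(item) >= 10)
has_all_letters = Check(lambda item: item.isalpha())

# ~ binds more tightly than &, so this reads: long AND (NOT all letters)
combined = has_min_length & ~has_all_letters

tags = [
    "0.9.24--ha87ae23_0",
    "0.9.19--1",
    "0.9.14--1",
    "ishouldberemoved",
    "0.9.10--hdbcaa40_3",
]
kept = [tag for tag in tags if combined(tag)]
print(kept)  # ['0.9.24--ha87ae23_0', '0.9.10--hdbcaa40_3']
```

Because each operand returns True or False for a single item, the result of `&` is itself one boolean check, which is why the pipeline sees it as a single step.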
Combining Pipelines
It might be the case that you want to re-use the same pipeline, or even include it within another pipeline! You can do that by using the pipeline as a step. Continuing our previous example, let’s say we turn it into a check for a commit, because commits are never all letters and are usually >= 10 characters long. Maybe we want to run these preprocessing steps, split the tag by the -- to remove the remainder, and then turn it into a Version we can sort. The example below shows the same mechanics on a simpler list of fruit names.
import pipelib.steps as step
import pipelib.pipeline as pipeline
fruits = ["Orange", "Melon", "Watermelon", "Fruit23"]
preprocess = pipeline.Pipeline(
    steps = (
        # Example of chaining steps together
        step.filters.HasMaxLength(length=8) & step.filters.HasAllLetters(),
    )
)
# Add this preprocess step alongside other steps (make lowercase)
steps = (
    step.transform.ToLowercase(),
    preprocess,
)
# Create a new pipeline and run
p = pipeline.Pipeline(steps)
# We should expect orange and melon!
updated = p.run(fruits)
# ['orange', 'melon']
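Conceptually, a pipeline can act as a step because both take a list of items and return a new list. The following sketch shows that idea with a hypothetical MiniPipeline class and plain functions as steps; these names are assumptions, and pipelib's own Pipeline may be implemented differently.

```python
class MiniPipeline:
    """Hypothetical pipeline: each step maps a list of items to a new list."""

    def __init__(self, steps):
        self.steps = tuple(steps)

    def run(self, items):
        for step in self.steps:
            if isinstance(step, MiniPipeline):
                # A nested pipeline is run as if it were a single step
                items = step.run(items)
            else:
                items = step(items)
        return list(items)


def to_lowercase(items):
    return [item.lower() for item in items]


def short_and_alphabetic(items):
    # Keep items of at most 8 characters that contain only letters
    return [item for item in items if len(item) <= 8 and item.isalpha()]


preprocess = MiniPipeline([short_and_alphabetic])
outer = MiniPipeline([to_lowercase, preprocess])

result = outer.run(["Orange", "Melon", "Watermelon", "Fruit23"])
print(result)  # ['orange', 'melon']
```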
Steps
The following steps are available:
type | name | module | description |
---|---|---|---|
filter | HasMaxLength | HasMaxLength | Keep items (return true) given a maximum length |
filter | HasMinLength | HasMinLength | Keep items (return true) given a minimum length |
filter | HasAllLetters | HasAllLetters | Keep items with all letters (no numbers or special characters) |
filter | HasAllLowerLettersNumbers | HasAllLowerLettersNumbers | Keep the string if it is only lowercase letters and numbers. |
filter | HasPatterns | HasPatterns | Determine if items match a pattern of interest. |
filter | CleanCommit | CleanCommit | Given a container tag that has -- and _ separating some commit and version, |
transform | SplitAndJoinN | SplitAndJoinN | Split a string by one delimiter, join by another. |
transform | ToLowercase | ToLowercase | Convert the item to all lowercase. |
transform | ToString | ToString | Convert the item to a string (typically from a wrapper) |
container | ContainerTagSort | ContainerTagSort | Parse container tag versions and return a filtered and sorted set. |
sort | BasicSort | BasicSort | Sort the list of items |
You can easily look at the steps that are provided:
from pipelib.steps import all_steps
In [1]: all_steps
Out[1]:
{'filter': {'HasMaxLength': pipelib.steps.filters.numeric.HasMaxLength,
  'HasMinLength': pipelib.steps.filters.numeric.HasMinLength,
  'HasAllLetters': pipelib.steps.filters.strings.HasAllLetters,
  'HasAllLowerLettersNumbers': pipelib.steps.filters.strings.HasAllLowerLettersNumbers,
  'HasPatterns': pipelib.steps.filters.strings.HasPatterns,
  'CleanCommit': pipelib.steps.filters.git.CleanCommit},
 'transform': {'ToInteger': pipelib.steps.transform.numeric.ToInteger,
  'SplitAndJoinN': pipelib.steps.transform.strings.SplitAndJoinN,
  'ToLowercase': pipelib.steps.transform.strings.ToLowercase,
  'ToString': pipelib.steps.transform.strings.ToString},
 'container': {'ContainerTagSort': pipelib.steps.container.tags.ContainerTagSort},
 'sort': {'BasicSort': pipelib.steps.sort.basic.BasicSort},
 'release': {'MajorTagSort': pipelib.steps.release.tags.MajorTagSort}}
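Since the output above is a dictionary keyed by step type and then by step name, a small helper can look a step up by name. The find_step function below is a hypothetical helper, and the registry is a stand-in that uses strings in place of classes so the sketch runs without pipelib installed.

```python
def find_step(registry, name):
    """Return (step_type, step) for the first step registered under `name`."""
    for step_type, steps in registry.items():
        if name in steps:
            return step_type, steps[name]
    raise KeyError(f"No step named {name!r}")


# Stand-in for all_steps: same nesting, but dotted paths as plain strings
registry = {
    "filter": {"HasMinLength": "pipelib.steps.filters.numeric.HasMinLength"},
    "transform": {"ToLowercase": "pipelib.steps.transform.strings.ToLowercase"},
    "sort": {"BasicSort": "pipelib.steps.sort.basic.BasicSort"},
}

found = find_step(registry, "ToLowercase")
print(found)
# ('transform', 'pipelib.steps.transform.strings.ToLowercase')
```

Passing the imported all_steps in place of the stand-in registry should work the same way, assuming it keeps the nesting shown above.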
This library is under development and we will have more documentation coming soon!