How do you use a container with a scientific filesystem? Here we will use the container built by this repository to learn this! First, obtain the container. You can build or pull it.
docker pull vanessa/example.scif
docker build -t vanessa/example.scif .
For each of the following examples, we show commands with Docker and with a Singularity container called example.simg
.
Interact with SCIF client
If we want to interact with our filesystem, we can just run the container:
$ docker run vanessa/example.scif
$ ./rnatoy
Scientific Filesystem [v0.0.71]
usage: scif [-h] [--debug] [--quiet] [--writable]
{version,pyshell,shell,preview,help,install,inspect,run,apps,dump,exec}
...
scientific filesystem tools
optional arguments:
-h, --help show this help message and exit
--debug use verbose logging to debug.
--quiet suppress print output
--writable, -w for relevant commands, if writable SCIF is needed
actions:
actions for Scientific Filesystem
{version,pyshell,shell,preview,help,install,inspect,run,apps,dump,exec}
scif actions
version show software version
pyshell Interactive python shell to scientific filesystem
shell shell to interact with scientific filesystem
preview preview changes to a filesytem
help look at help for an app, if it exists.
install install a recipe on the filesystem
inspect inspect an attribute for a scif installation
run entrypoint to run a scientific filesystem
apps list apps installed
dump dump recipe
exec execute a command to a scientific filesystem
this works because the scif is the entrypoint to the container.
Inspecting Applications
The strength of SCIF is that it will always show you the applications installed in a container, and then provide predictable commands for inspecting, running, or otherwise interacting with them. For example, if I find the container, without any prior knowledge I can reveal the applications inside:
$ docker run vanessa/example.scif apps
$ ./example.simg apps
bowtie
cufflinks
tophat
samtools
We can look at an application in detail, including asking for help:
$ docker run vanessa/example.scif help samtools
$ ./example.simg help samtools
This app provides Samtools suite
and then inspecting
$ docker run vanessa/example.scif inspect samtools
$ ./example.simg inspect samtools
{
"samtools": {
"apprun": [
" exec /usr/bin/samtools \"$@\""
],
"apphelp": [
" This app provides Samtools suite"
],
"applabels": [
"VERSION 1.7",
"URL http://www.htslib.org/"
]
}
}
The creator of the container didn’t write any complicated scripts to have this happen - the help text is just a chunk of text in a block of the recipe. The labels that are parsed to json, are also just written easily on two lines. This means that the creator can spend less time worry about exposing this. If you can write a text file, you can make your applications programatically parseable.
Interacting with Applications
I can easily shell into the container in the context of an application, meaning that the environment is sourced, etc.
$ docker run -it vanessa/example.scif shell samtools
$ ./example.simg shell samtools
[samtools] executing /bin/bash
root@d002e338b88b:/scif/apps/samtools# env | grep PATH
LD_LIBRARY_PATH=/scif/apps/samtools/lib
PATH=/scif/apps/samtools/bin:/opt/conda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
Notice how I’m in the app’s context (in it’s application folder) and that it’s bin is added to the path? I can also shell in without a specific application context, but still have all the SCIF global variables available to me.
$ docker run -it vanessa/example.scif shell
$ ./example.simg shell
WARNING No app selected, will run default ['/bin/bash']
executing /bin/bash
root@055a34619d17:/scif# ls
apps
data
The same kind of functionality exists with the python shell, pyshell
, but you interact directly with the scif client:
$ docker run -it vanessa/example.scif pyshell
$ ./example.simg pyshell
Found configurations for 4 scif apps
cufflinks
samtools
bowtie
tophat
[scif] /scif cufflinks | samtools | bowtie | tophat
Python 3.6.2 |Anaconda, Inc.| (default, Sep 22 2017, 02:03:08)
[GCC 7.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
(InteractiveConsole)
client.apps()
['cufflinks', 'samtools', 'bowtie', 'tophat']
Running Applications
Before we get into creating a pipeline, look how easy it is to run an application. Without scif, we would have to have known that samtools is installed, and then executed the command to the container. But with the scientific filesystem, we discovered the app (shown above) and then we can just run it. The run
command maps to the entrypoint, as was defined by the creator:
$ docker run vanessa/example.scif run samtools
$ ./example.simg run samtools
Program: samtools (Tools for alignments in the SAM format)
Version: 0.1.18 (r982:295)
Usage: samtools <command> [options]
Command: view SAM<->BAM conversion
sort sort alignment file
mpileup multi-way pileup
depth compute the depth
faidx index/extract FASTA
tview text alignment viewer
index index alignment
idxstats BAM index stats (r595 or later)
fixmate fix mate information
flagstat simple stats
calmd recalculate MD/NM tags and '=' bases
merge merge sorted alignments
rmdup remove PCR duplicates
reheader replace BAM header
cat concatenate BAMs
targetcut cut fosmid regions (for fosmid pool only)
phase phase heterozygotes
[samtools] executing /bin/bash /scif/apps/samtools/scif/runscript
And executing any command in the context of the application is possible too:
$ docker run vanessa/example.scif exec samtools env | grep PATH
$ ./example.scif exec samtools env | grep PATH
LD_LIBRARY_PATH=/scif/apps/samtools/lib
PATH=/scif/apps/samtools/bin:/opt/conda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
Whether we are using Docker or Singularity, the actions going on internally with the scientific filesystem client are the same. Given a simple enough pipeline, we could stop here, and just issue a series of commands to run the different apps. But more likely you would then integrate these app entrypoints into some pipeline. If you are a developer, you may not even have a pipeline, but want to provide your software for others to use (and integrate into their pipelines!)
Now that you understand usage, take a look at the example provided in this repository.