Full Documentation
This single page repeats the full documentation for sh <https://github.com/amoffat/sh/>, making it easier to put into an LLM’s context window. Nothing on this page is missing from the rest of this site; it is just reorganized as a single page.
sh
sh is a full-fledged subprocess replacement for Python 3.8+ and PyPy that allows you to call any program as if it were a function:
from sh import git
print(git("status", "--short"))
Note that these aren’t Python functions; they run the binary commands
on your system by dynamically resolving your $PATH, much like Bash does, and
then wrapping the binary in a function. In this way, all the programs on your
system are easily available to you from within Python.
sh relies on various Unix system calls and only works on Unix-like operating systems - Linux, macOS, BSDs etc. Specifically, Windows is not supported.
Installation
pip install sh
Quick Reference
sh.ls("-l", "/tmp", color="never")
try:
    sh.ls("/doesnt/exist")
except sh.ErrorReturnCode_2:
    print("directory doesn't exist")
sh.ls(_out="/tmp/dir_contents")
with open("/tmp/dir_contents", "w") as h:
    sh.ls(_out=h)
from io import StringIO
buf = StringIO()
sh.ls(_out=buf)
my_ls = sh.ls.bake("-l")
# equivalent
my_ls("/tmp")
sh.ls("-l", "/tmp")
sh.wc("-l", _in=sh.ls("-1"))
# equivalent
sh.git("show", "HEAD")
sh.git.show("HEAD")
p = sh.find("-name", "sh.py", _bg=True)
# ... do other things ...
p.wait()
Entering an SSH password
Here we will attempt to SSH into a server and enter a password programmatically.
Note
It is recommended that you just ssh-copy-id to copy your public key to
the server so you don’t need to enter your password, but for the purposes of
this demonstration, we try to enter a password.
To interact with a process, we need to assign a callback to STDOUT. The
callback signature we’ll use will take a queue.Queue object for the
second argument, and we’ll use that to send STDIN back to the process.
Here’s our first attempt:
from sh import ssh
def ssh_interact(line, stdin):
    line = line.strip()
    print(line)
    if line.endswith("password:"):
        stdin.put("correcthorsebatterystaple")
ssh("10.10.10.100", _out=ssh_interact)
If you run this (substituting an IP that you can SSH to), you’ll notice that
nothing is printed from within the callback. The problem has to do with STDOUT
buffering. By default, sh line-buffers STDOUT, which means that
ssh_interact will only receive output when sh encounters a newline in the
output. This is a problem because the password prompt has no newline:
amoffat@10.10.10.100's password:
Because a newline is never encountered, nothing is sent to the ssh_interact
callback. So we need to change the STDOUT buffering. We do this with the
_out_bufsize special kwarg. We’ll set
it to 0 for unbuffered output:
from sh import ssh
def ssh_interact(line, stdin):
    line = line.strip()
    print(line)
    if line.endswith("password:"):
        stdin.put("correcthorsebatterystaple")
ssh("10.10.10.100", _out=ssh_interact, _out_bufsize=0)
If you run this updated version, you’ll notice a new problem. The output looks like this:
a
m
o
f
f
a
t
@
1
0
.
1
0
.
1
0
.
1
0
0
'
s
p
a
s
s
w
o
r
d
:
This is because the chunks of STDOUT our callback is receiving are unbuffered,
and are therefore individual characters, instead of entire lines. What we need
to do now is aggregate this character-by-character data into something more
meaningful for us to test if the pattern password: has been sent, signifying
that SSH is ready for input.
It would make sense to encapsulate the variable we’ll use for aggregating into some kind of closure or class, but to keep it simple, we’ll just use a global:
from sh import ssh
import sys
aggregated = ""
def ssh_interact(char, stdin):
    global aggregated
    # char is already a str, so write it directly
    sys.stdout.write(char)
    sys.stdout.flush()
    aggregated += char
    if aggregated.endswith("password: "):
        stdin.put("correcthorsebatterystaple")
ssh("10.10.10.100", _out=ssh_interact, _out_bufsize=0)
You’ll also notice that the example still doesn’t work. There are two problems: The first is that your password must end with a newline, as if you had typed it and hit the return key. This is because SSH has no idea how long your password is, and is line-buffering STDIN.
The second problem lies deeper in SSH. SSH needs a TTY attached to its STDIN in order to work properly. This tricks SSH into believing that it is interacting with a real user in a real terminal session. To enable TTY, we can add the _tty_in special kwarg. We also need to use the _unify_ttys special kwarg. This tells sh to make STDOUT and STDIN come from a single pseudo-terminal, which is a requirement of SSH:
from sh import ssh
import sys
aggregated = ""
def ssh_interact(char, stdin):
    global aggregated
    # char is already a str, so write it directly
    sys.stdout.write(char)
    sys.stdout.flush()
    aggregated += char
    if aggregated.endswith("password: "):
        stdin.put("correcthorsebatterystaple\n")
ssh("10.10.10.100", _out=ssh_interact, _out_bufsize=0, _tty_in=True, _unify_ttys=True)
And now our remote login script works!
amoffat@10.10.10.100's password:
Linux 10.10.10.100 testhost #1 SMP Tue Jun 21 10:29:24 EDT 2011 i686 GNU/Linux
Ubuntu 10.04.2 LTS
Welcome to Ubuntu!
* Documentation: https://help.ubuntu.com/
66 packages can be updated.
53 updates are security updates.
Ubuntu 10.04.2 LTS
Welcome to Ubuntu!
* Documentation: https://help.ubuntu.com/
You have new mail.
Last login: Thu Sep 13 03:53:00 2012 from some.ip.address
amoffat@10.10.10.100:~$
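As noted earlier, the aggregation state does not have to live in a global. The following is a minimal sketch of the closure approach. The name make_password_responder and its parameters are hypothetical, chosen for illustration, and are not part of sh. The sketch only builds the callback and feeds it simulated characters, so it can be tried without a real SSH server:

```python
import queue
import sys


def make_password_responder(password, prompt="password: "):
    """Build an _out callback that answers a password prompt once.

    Hypothetical helper for illustration; not part of sh.
    """
    state = {"seen": "", "answered": False}

    def interact(char, stdin):
        # Echo each unbuffered chunk so the session stays visible.
        sys.stdout.write(char)
        sys.stdout.flush()

        state["seen"] += char
        if not state["answered"] and state["seen"].endswith(prompt):
            # The trailing newline stands in for pressing the return key.
            stdin.put(password + "\n")
            state["answered"] = True

    return interact


# Simulate sh feeding the callback one character at a time.
fake_stdin = queue.Queue()
callback = make_password_responder("correcthorsebatterystaple")
for ch in "amoffat@10.10.10.100's password: ":
    callback(ch, fake_stdin)

sent = fake_stdin.get_nowait()
```

With sh installed, the callback would presumably be passed in place of ssh_interact in the working example above, with the same _out_bufsize, _tty_in and _unify_ttys kwargs.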
The above process can be simplified by using one of the Contrib Commands. The SSH contrib command does all the ugly kwarg setup for you, and provides a simple but powerful interface for doing SSH password logins. Please see the SSH contrib command for more details about the exact API:
import sys
from sh.contrib import ssh
def ssh_interact(content, stdin):
    sys.stdout.write(content.cur_char)
    sys.stdout.flush()
# automatically logs in with password and then presents subsequent content to
# the ssh_interact callback
ssh("10.10.10.100", password="correcthorsebatterystaple", interact=ssh_interact)
Many people want to learn how to enter an SSH password by script because they want to execute remote commands on a server. Instead of trying to log in through SSH and then sending terminal input of the command to run, let’s see how we can do it another way.
First, open a terminal and run ssh-copy-id yourservername. You’ll be asked
to enter your password for the server. After entering your password, you’ll be
able to SSH into the server without needing a password again. This simplifies
things greatly for sh.
The second thing we want to do is use SSH’s ability to pass a command to run
to the server you’re SSHing to. Here’s how you can run ifconfig on a server
without having to use that server’s shell directly:
ssh amoffat@10.10.10.100 ifconfig
Translating this to sh, it becomes:
import sh
print(sh.ssh("amoffat@10.10.10.100", "ifconfig"))
We can make this even nicer by taking advantage of sh’s Baking to bind our server username/ip to a command object:
import sh
my_server = sh.ssh.bake("amoffat@10.10.10.100")
print(my_server("ifconfig"))
print(my_server("whoami"))
Now we have a reusable command object that we can use to call remote commands. But there is room for one more improvement. We can also use sh’s Sub-commands feature which expands attribute access into command arguments:
import sh
my_server = sh.ssh.bake("amoffat@10.10.10.100")
print(my_server.ifconfig())
print(my_server.whoami())
Tailing a real-time log file
sh has the ability to respond to subprocesses in an event-driven fashion. A typical example of where this would be useful is tailing a log file for a specific pattern, then responding to that value immediately:
from sh import tail
for line in tail("-f", "info.log", _iter=True):
    if "ERROR" in line:
        send_an_email_to_support(line)
The _iter special kwarg takes a command that would normally block until completion, and turns its output into a real-time iterable.
Of course, you can do more than just tail log files. Any program that produces output can be iterated over. Say you wanted to send an email to a coworker if their C code emits a warning:
from sh import gcc, git
# gcc writes its diagnostics to STDERR, so iterate over that stream
for line in gcc("-o", "awesome_binary", "awesome_source.c", _iter="err"):
    if "warning" in line:
        # parse out the relevant info
        filename, lineno, char, message = line.split(":", 3)
        # find the commit using git
        commit = git("blame", "-e", filename, L="%s,%s" % (lineno, lineno))
        # send them an email
        email_address = parse_email_from_commit_line(commit)
        send_email(email_address, message)
Using _iter is a great way to respond to events from another program, but your script blocks while you’re looping, making you unable to do anything else. To be truly event-driven, sh provides callbacks:
from sh import tail
def process_log_line(line):
    if "ERROR" in line:
        send_an_email_to_support(line)
process = tail("-f", "info.log", _out=process_log_line, _bg=True)
# ... do other stuff here ...
process.wait()
The _out special kwarg lets you assign a callback to STDOUT.
This callback will receive each line of output from tail -f and allow you to
do the same processing that we did earlier.
FAQ
import sh
sh.bash("-c", "your_builtin")
Or
import sh
builtins = sh.bash.bake("-c")
builtins("your_builtin")
There are no plans to support Windows.
Use a file object opened in the mode you desire:
import sh
h = open("/tmp/output", "a")
sh.ls("/dir1", _out=h)
sh.ls("/dir2", _out=h)
Typically the reason for this is that your program detected that its STDOUT was connected to a TTY, and therefore decided to print color escape sequences in its output. The typical solution is to use _tty_out=False, which will force a pipe to be connected to STDOUT, and probably change the behavior of the program.
Git is one of the programs that makes extensive use of terminal colors (as well as pagers) in its output, so we added a contrib version for convenience.
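Programs usually make this decision with an isatty() check on their STDOUT file descriptor. The sketch below uses only the standard library, not sh, to show how a pipe and a pseudo-terminal answer that check differently on a Unix-like system. The variable names are illustrative:

```python
import os

# A pipe is what a program sees on STDOUT when _tty_out=False is used.
pipe_read, pipe_write = os.pipe()

# A pseudo-terminal pair resembles a TTY-backed STDOUT.
pty_master, pty_slave = os.openpty()

pipe_is_tty = os.isatty(pipe_write)
pty_is_tty = os.isatty(pty_slave)

# Release the file descriptors once the checks are done.
for fd in (pipe_read, pipe_write, pty_master, pty_slave):
    os.close(fd)

print("pipe looks like a terminal:", pipe_is_tty)
print("pty looks like a terminal:", pty_is_tty)
```

A program that colors its output only when this check succeeds will therefore stop doing so once it is connected to a pipe.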
This was a design decision made for two reasons:
1. To make programs behave in the same way as seen on the commandline.
2. To provide better buffering control than pipes allow.
For #1, we want sh to produce output that is identical to what the user sees from the commandline, because that’s typically the only output they ever see from their command. This makes the output easy to understand.
For #2, using a TTY for STDOUT allows us to precisely control the buffering of a command’s output to sh’s internal code.
Of course, there are some gotchas with TTY STDOUT. One of them is commands that use a pager, for example:
import sh
print(sh.git.log())
This will sometimes raise a SignalException_SIGPIPE. The reason is that
git log detects a TTY STDOUT and forks the system’s pager (typically
less) to handle the output. The pager checks for a controlling terminal,
and, finding none, exits with exit code 1. The exit of the pager means no more
readers on git log’s output, and thus a SIGPIPE is received.
One solution to the git log problem above is simply to use
_tty_out=False. Another option, specifically for git, is to use the
git --no-pager option:
import sh
print(sh.git('--no-pager', 'log'))
Glob expansion is a feature of a shell, like Bash, and is performed by the shell before passing the results to the program to be exec’d. Because sh is not a shell, but rather a tool to execute programs directly, we do not handle glob expansion like a shell would.
So in order to use "*" like you would on the commandline, pass it into
glob.glob() first:
import sh
import glob
sh.ls(glob.glob("*.py"))
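To make the behavior of glob.glob() concrete, here is a small standard-library sketch that does not call sh at all. The file names are made up for the demonstration. It also shows one difference from Bash's default behavior worth keeping in mind: a pattern with no matches produces an empty list, so it may be worth checking the result before handing it to a command.

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Create a few empty files to match against.
    for name in ("a.py", "b.py", "notes.txt"):
        open(os.path.join(tmp, name), "w").close()

    # glob.glob() returns a plain list of matching paths.
    matches = sorted(
        os.path.basename(p) for p in glob.glob(os.path.join(tmp, "*.py"))
    )

    # A pattern with no matches yields an empty list,
    # not the literal pattern string.
    no_matches = glob.glob(os.path.join(tmp, "*.rs"))

print(matches)
print(no_matches)
```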
Use the Command() constructor to instantiate an instance of Command
directly, then execute that:
import sh
cmd = sh.Command("/path/to/command")
cmd("-v", "arg1")
If it’s in your $PATH, substitute the dash for an underscore:
import sh
sh.google_chrome("http://google.com")
The above will run google-chrome http://google.com
Note
If a program named google_chrome exists on your system, that will be
called instead. In that case, in order to execute the program with a dash
in the name, you’ll have to use the Command() constructor described above.
Programs with non-alphanumeric, non-dash characters in their names cannot be executed directly as an attribute on the sh module. For example, this will not work:
import sh
sh.mkfs.ext4()
The reason should be fairly obvious. In Python, characters like . have
special meaning, in this case, attribute access. What sh is trying to do in the
above example is find the program “mkfs” (which may or may not exist) and then
perform a subcommand lookup with the name “ext4”. In other
words, it will try to call mkfs with the argument ext4, which is
probably not what you want.
The workaround is instantiating the Command Class with the string of the program you’re looking for:
import sh
mkfsext4 = sh.Command("mkfs.ext4")
mkfsext4() # run it
I prefer the syntax of sh to resemble function composition instead of a pipeline. One of the goals of sh is to make executing processes more like calling functions, not making function calls more like Bash.
There is a non-obvious reason why async piping is not possible by default. Consider the following example:
import sh
sh.cat(sh.echo("test\n1\n2\n3\n"))
When this is run, sh.echo executes and finishes, then the entire output
string is fed into sh.cat. What we would really like is each
newline-delimited chunk to flow to sh.cat incrementally.
But for this example to flow data asynchronously from echo to cat, the echo command would need to not block. But how can the inner command know the context of its execution, to know to block sometimes but not other times? It can’t know that without something explicit.
This is why the _piped special kwarg was introduced. By default, commands
executed block until they are finished, so in order for an inner command to not
block, _piped=True signals to the inner command that it should not block.
This way, the inner command starts running, then very shortly after, the outer
command starts running, and both are running simultaneously. Data can then flow
from the inner command to the outer command asynchronously:
import sh
sh.cat(sh.echo("test\n1\n2\n3\n", _piped=True))
Again, this example is contrived – a better example would be a long-running command that produces a lot of output that you wish to pipe through another program incrementally.
There are two ways to do this
You can use sys.stdin, sys.stdout, and sys.stderr as
arguments to _in, _out, _err, respectively, and it should
mostly work as expected:
import sh
import sys
sh.your_command(_in=sys.stdin, _out=sys.stdout)
There are a few reasons why this probably won’t work. The first reason is that
sys.stdin is probably a controlling TTY (attached to the shell that
launched the python process), and probably not set in raw mode
termios(3), which means that, among other things, input is buffered
by newlines.
The real solution is to use _fg=True:
import sh
sh.top(_fg=True)
This confuses many new sh users. They want to do something like this and expect it to just work:
from sh import tar
tar("cvf /tmp/test.tar /my/home/directory")
But instead they’ll get a confusing error message:
RAN: '/bin/tar cvf /tmp/test.tar /my/home/directory'
STDOUT:
STDERR:
/bin/tar: Old option 'f' requires an argument.
Try '/bin/tar --help' or '/bin/tar --usage' for more information.
They expect it to work because shells, like Bash, automatically
parse your commandline and break up arguments for you, before sending them to
the binary. They have a complex set of rules (some of which are represented by
shlex) to take a single string of a command and arguments and separate
them.
Even if we wanted to implement this in sh (which we don’t), it would hurt the ability for users to parameterize parts of their arguments. They would have to use string interpolation, which would be ugly and error prone:
from sh import tar
tar("cvf %s %s" % ("/tmp/tar1.tar", "/home/oh no a space"))
In the above example, "/home/oh", "no", "a", and "space" would
all be separate arguments to tar, causing the program to behave unexpectedly.
Basically every command with parameterized arguments would need to expect
characters that could break the parser.
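The splitting problem can be reproduced with the standard library's shlex module, which implements shell-like tokenizing. This sketch does not call sh; it only compares what a string parser would hand to tar with the separate-argument form that sh expects. The variable names are illustrative:

```python
import shlex

target = "/home/oh no a space"

# Naive interpolation: the space-containing path is split apart.
naive = shlex.split("cvf %s %s" % ("/tmp/tar1.tar", target))

# Quoting first keeps the path together, but every caller must remember it.
quoted = shlex.split("cvf %s %s" % ("/tmp/tar1.tar", shlex.quote(target)))

# Separate arguments, as sh expects, need no quoting at all.
separate = ["cvf", "/tmp/tar1.tar", target]

print(naive)
print(quoted == separate)
```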
Typically this question gets asked when a user is trying to execute something like the following commandline:
my-command --arg1=val1 arg2 --arg3=val3
This is usually the first attempt that they make:
sh.my_command(arg1="val1", "arg2", arg3="val3")
This doesn’t work because, in Python, positional arguments, like arg2, cannot
come after keyword arguments.
Furthermore, it is entirely possible that --arg3=val3 comes before
--arg1=val1. The reason for this is that a function’s **kwargs is an
unordered mapping, and so key-value pairs are not guaranteed to resolve to a
specific order.
So the solution here is to forgo the keyword argument convenience and just use raw ordered arguments:
sh.my_command("--arg1=val1", "arg2", "--arg3=val3")
Pylint complains with E1101 no-member on almost all sh.command invocations,
because it doesn’t know that these members are generated dynamically.
Starting with Pylint 1.6, these messages can be suppressed using the generated-members option.
Just add the following lines to pylintrc:
[TYPECHECK]
generated-members=sh
sh can be patched in your tests the typical way, with
unittest.mock.patch():
from unittest.mock import patch
import sh
def get_something():
    return sh.pwd()
@patch("sh.pwd", create=True)
def test_something(pwd):
    pwd.return_value = "/"
    assert get_something() == "/"
The important thing to note here is that create=True is set. This is
required because sh is a bit magical and patch will fail to find the pwd
command as an attribute on the sh module.
You may also patch the Command class:
from unittest.mock import patch
import sh
def get_something():
    pwd = sh.Command("pwd")
    return pwd()
@patch("sh.Command")
def test_something(Command):
    Command().return_value = "/"
    assert get_something() == "/"
Notice here we do not need create=True, because Command is not an
automatically generated object on the sh module (it actually exists).
When sh was first written, the design decision was made to make it a single-file module. This has pros and cons:
Cons:
Auditing the code is more challenging
Without file-enforced structure, adding more features and abstractions makes the code harder to follow
Cognitively, it feels cluttered
Pros:
Can be used easily on systems without Python package managers
Can be embedded/bundled together with other software more easily
Cognitively, it feels more self-contained
In my mind, because the primary target audience of sh users is generally more scrappy devops, systems people, or people just trying to stitch together some clunky system programs, the listed pros weigh a little more heavily than the cons. Sacrificing some development advantages to give those users a more flexible tool is a win to me.
Down the road, the development disadvantages of a single file can be solved with additional development tools, for example, with a tool that compiles multiple modules into the single sh.py file. Realistically, though, sh is pretty mature, so I don’t see it growing much more in complexity or code size.
Use logging:
import logging
import sh
logging.basicConfig(level=logging.INFO)
sh.ls()
INFO:sh.command:<Command '/bin/ls'>: starting process
INFO:sh.command:<Command '/bin/ls', pid 32394>: process started
INFO:sh.command:<Command '/bin/ls', pid 32394>: process completed
...
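If the INFO messages are too noisy alongside your own logging, the standard logging hierarchy can be used to adjust sh's loggers separately. The sketch below assumes that sh's loggers live under the "sh" name, as the "sh.command" prefix in the output above suggests. The "myapp" logger is a hypothetical application logger:

```python
import logging

# force=True replaces any handlers configured earlier (Python 3.8+).
logging.basicConfig(level=logging.INFO, force=True)

# Assumption: sh's loggers are children of "sh", as "sh.command" suggests.
sh_logger = logging.getLogger("sh")
sh_logger.setLevel(logging.WARNING)

command_logger = logging.getLogger("sh.command")
app_logger = logging.getLogger("myapp")  # hypothetical application logger

# Child loggers inherit the level set on "sh"; other loggers are unaffected.
print(command_logger.isEnabledFor(logging.INFO))
print(app_logger.isEnabledFor(logging.INFO))
```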
Contrib Commands
Contrib is an sh sub-module that provides friendly wrappers to useful commands. Typically, the commands being wrapped are unintuitive, and the contrib version makes them intuitive.
Note
Contrib commands should be considered generally unstable. They will grow and change as the community figures out the best interface for them.
Commands
Allows you to enter your password from the terminal at runtime, or as a string in your script.
- sudo(password=None, *args, **kwargs)
Call sudo with password, if specified, else ask the executing user for a password at runtime via getpass.getpass().
Many git commands use a pager for output, which can cause an unexpected behavior
when run through sh. To account for this, the contrib version sets
_tty_out=False for all git commands.
- git(*args, **kwargs)
Call git with STDOUT connected to a pipe, instead of a TTY.
from sh.contrib import git
repo_log = git.log()
Added in version 1.13.0.
SSH password-based logins can be a pain. This contrib command performs all of the ugly setup and provides a clean interface to using SSH.
- ssh(interact=None, password=None, prompt_match=None, login_success=None, *args, **kwargs)
- Parameters:
interact – A callback to handle SSH session interaction after login is successful. Required.
password – A password string or a function that returns a password string. Optional. If not provided, getpass.getpass() is used.
prompt_match – The string to match in order to determine when to provide SSH with the password. Or a function that matches on the output. Optional.
login_success – A function to determine if SSH login is successful. Optional.
The interact parameter takes a callback with a signature that is slightly different to the function callbacks for redirection:
- fn(content, stdin_queue)
- Parameters:
content – An instance of an ephemeral SessionContent class whose job is to hold the characters that the SSH session has written to STDOUT.
stdin_queue – A queue.Queue object to communicate with STDIN programmatically.
password can be simply a string that will be used to type the password. If it’s not provided, it will be read from STDIN
at runtime via getpass.getpass(). It can also be a callable that returns the password string.
prompt_match is a string to match before the contrib command will provide the SSH process with the password. It is
optional, and if left unspecified, will default to “password: “. It can also be a callable that is called on a
SessionContent instance and returns True or False for a match.
login_success is a function that takes a SessionContent object and returns a boolean for
whether or not a successful login occurred. It is optional, and if unspecified, simply evaluates to True, meaning
any password submission results in a successful login (obviously not always correct). It is recommended that you specify
this.
- class SessionContent
This class contains a record of the lines and characters written to the SSH process’s STDOUT. It should be all you need from the callbacks to determine how to interact with the SSH process.
- SessionContent.chars
- Type:
The previous 50,000 characters.
- SessionContent.lines
- Type:
The previous 5,000 lines.
Bash
Often users may find themselves having to run bash commands directly, whether due
to commands having special characters (e.g. dash or dot) or other reasons.
This can lead to repeatedly having to bake the bash command to call it directly. To
account for this, the contrib version provides a bash command baked in:
- bash(*args, **kwargs)
Call bash with the prefix of “bash -c […]”.
from sh.contrib import bash
# Calling commands directly
bash.ls() # equivalent to "bash -c ls"
# Or adding the full commands
bash("command-with-dashes args")
Extending
For developers.
To extend contrib, simply decorate a function in sh with the @contrib
decorator, and pass in the name of the command you wish to shadow to the
decorator. This method must return an instance of Command:
@contrib("ls")
def my_ls(original):
    ls = original.bake("-l")
    return ls
Now you can run your custom contrib command from your scripts, and you’ll be using the command returned from your decorated function:
from sh.contrib import ls
# executing: ls -l
print(ls("/"))
For even more flexibility, you can design your contrib command to rewrite its options based on executed arguments. For example, say you only wish to set a command’s argument if another argument is set. You can accomplish it like this:
@contrib("ls")
def my_ls(original):
    def process(args, kwargs):
        if "-a" in args:
            args.append("-L")
        return args, kwargs
    ls = original.bake("-l")
    return ls, process
Returning a process function along with the command will tell sh to use that function to preprocess the arguments at execution time using the _arg_preprocess special kwarg.
Using Sudo
There are 3 ways of using sudo to execute commands in your script. These
are listed in order of usefulness and security. In most cases, you should just
use a variation of sh.contrib.sudo.
Because sudo is so frequently used, we have added a contrib version of the
command to make sudo usage more intuitive. This contrib version is simply a
wrapper around the sh.sudo raw command, but we bake in some
special keyword arguments to make it well-behaved. In
particular, the contrib version allows you to specify your password at execution
time via terminal input, or as a string in your script.
Via a with context:
import sh
with sh.contrib.sudo:
    print(sh.ls("/root"))
Or alternatively via subcommands:
import sh
print(sh.contrib.sudo.ls("/root"))
Output:
[sudo] password for youruser: *************
your_root_files.txt
In the above example, sh.contrib.sudo automatically asks you for a password
using getpass.getpass() under the hood.
This method is the most secure, because it lowers the chances of doing something insecure, like including your password in your python script, or by saying that a particular user can execute anything inside of a particular script (the NOPASSWD method).
Note
sh.contrib.sudo does not do password caching like the sudo binary does.
This means that each time a sudo command is run in your script, you will be
asked to type in a password.
You may also specify your password to sh.contrib.sudo as a string:
import sh
password = get_your_password()
with sh.contrib.sudo(password=password, _with=True):
    print(sh.ls("/root"))
Warning
This method is less secure because it becomes tempting to hard-code your password into the python script, and that’s a bad idea. However, it is more flexible, because it allows you to obtain your password from another source, so long as the end result is a string.
With this method, you can use the raw sh.sudo command directly, because
you’re being guaranteed that the system will not ask you for a password. It
first requires that you set up your user to have root execution privileges.
Edit your sudoers file:
$> sudo visudo
Add or edit the line describing your user’s permissions:
yourusername ALL = (root) NOPASSWD: /path/to/your/program
This says that yourusername on ALL hosts will be able to run as root, and
only as root (root), not as any other user, and that no password (NOPASSWD) will be
asked for when running /path/to/your/program.
Warning
This method can be insecure if an unprivileged user can edit your script, because the entire script will be executed as a privileged user. A malicious user could put something bad in this script.
Using the raw command sh.sudo (which resolves directly to the system’s
sudo binary) without NOPASSWD is possible, provided you wire up the special
keyword arguments on your own to make it behave correctly. This method is
discussed generally for educational purposes; if you take the time to wire up
sh.sudo on your own, then you have in essence just recreated
sh.contrib.sudo.
import sh
# password must end in a newline
my_password = "password\n"
# -S says "get the password from stdin"
my_sudo = sh.sudo.bake("-S", _in=my_password)
print(my_sudo.ls("/root"))
Another less-obvious way of using sudo is by executing the raw sh.sudo
command but also putting it in the foreground. This way, sudo will work
correctly automatically, by hooking up stdin/out/err automatically, and by
asking you for a password if it requires one. The downsides of using
_fg=True, however, are that you cannot capture its output – everything is
just printed to your terminal as if you ran it from a shell.
import sh
sh.sudo.ls("/root", _fg=True)
Migrating from 1.x to 2.x
This document provides an upgrade path from 1.* to 2.*.
sh.cd builtin removed
There is no sh.cd command anymore. It was always a command implemented in
sh, as some systems provide it as a shell builtin, while others have an actual
binary. But neither of them persisted the directory change between other sh
calls, which is why it was implemented in sh.
If you were using sh.cd(dir), use the context manager with sh.pushd(dir)
instead. All of the commands in the managed context will have the correct
directory.
Removed execution contexts / default arguments
In 1.* you could spawn a new module from the sh module, one which had
customized defaults for the special keyword arguments. This module could then be
accessed just like sh, and you could even import commands from it.
Unfortunately the magic required to make that work was brittle. Also it was not aligned syntactically with the similar baking concept. We have therefore changed the syntax to align with baking, and also removed the ability to import directly from this new baked execution context.
sh2 = sh(_tty_out=False)
sh2.ls()
Becomes:
sh2 = sh.bake(_tty_out=False)
sh2.ls()
And:
sh2 = sh(_tty_out=False)
from sh2 import ls
ls()
Becomes:
sh2 = sh.bake(_tty_out=False)
ls = sh2.ls
ls()
Return value now a true string
In 2.*, the return value of an executed sh command has changed (in most
cases) from a RunningCommand object to a unicode string. This makes using
the output of a command more natural.
To continue returning a RunningCommand object, use the _return_cmd=True
special keyword argument. You can achieve this on each file with the following
code at the top of files that use sh:
import sh
sh = sh.bake(_return_cmd=True)
Piping to STDIN
Previously, if the first argument of a sh command was an instance of
RunningCommand, it was automatically fed into the process’s STDIN. This is
no longer the case and you must explicitly use _in=.
from sh import wc, ls
print(wc(ls("/home/<user>", "-l"), "-l"))
Becomes:
from sh import wc, ls
print(wc("-l", _in=ls("/home/<user>", "-l")))
Or:
from sh import wc, ls
print(wc("-l", _in=ls("/home/<user>", "-l", _return_cmd=True)))
None.
New processes don’t launch in new session
In 1.*, _new_session defaulted to True. It now defaults to
False. The reason for this is that it makes more sense for launched
processes to default to being in the process group of the Python script, so
that they receive SIGINTs correctly.
To preserve the old behavior:
import sh
sh = sh.bake(_new_session=True)
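To see what launching in a new session means for process groups, the following sketch uses the standard library's subprocess module rather than sh. Its start_new_session flag is used here as a stand-in for _new_session, on the assumption that both result in the child starting its own session. The helper name child_process_group is hypothetical:

```python
import os
import subprocess
import sys

# Child program: report its own process group ID.
CHILD = "import os; print(os.getpgrp())"


def child_process_group(new_session):
    """Run a short-lived child and return the process group it reports."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD],
        capture_output=True,
        text=True,
        check=True,
        start_new_session=new_session,
    )
    return int(result.stdout)


parent_group = os.getpgrp()

# Without a new session, the child stays in the parent's process group.
shares_group = child_process_group(new_session=False) == parent_group

# With a new session, the child leads its own process group.
separate_group = child_process_group(new_session=True) != parent_group

print(shares_group, separate_group)
```

A child that shares the script's process group receives a terminal-generated SIGINT along with the script, which is the behavior the new default is aiming for.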
Passing Arguments
When passing multiple arguments to a command, each argument must be a separate string:
from sh import tar
tar("cvf", "/tmp/test.tar", "/my/home/directory/")
This will not work:
from sh import tar
tar("cvf /tmp/test.tar /my/home/directory")
sh supports short-form -a and long-form --arg arguments as
keyword arguments:
# resolves to "curl http://duckduckgo.com/ -o page.html --silent"
curl("http://duckduckgo.com/", o="page.html", silent=True)
# or if you prefer not to use keyword arguments, this does the same thing:
curl("http://duckduckgo.com/", "-o", "page.html", "--silent")
# resolves to "adduser amoffat --system --shell=/bin/bash --no-create-home"
adduser("amoffat", system=True, shell="/bin/bash", no_create_home=True)
# or
adduser("amoffat", "--system", "--shell", "/bin/bash", "--no-create-home")
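The mapping from keyword arguments to options can be pictured with a small stand-alone function. This is an illustration only and not sh's implementation: kwargs_to_options is a hypothetical helper that reproduces just the resolutions shown in the comments above, and its handling of False values is an assumption.

```python
def kwargs_to_options(**kwargs):
    """Illustrative only: mimic the keyword-to-option mapping shown above.

    This is not sh's implementation; it reproduces the documented examples.
    """
    options = []
    for name, value in kwargs.items():
        if value is False:
            continue  # assumption: a False flag is simply omitted
        if len(name) == 1:
            # Short form: -o page.html
            options.append("-" + name)
            if value is not True:
                options.append(str(value))
        else:
            # Long form: underscores become dashes, e.g. --shell=/bin/bash
            flag = "--" + name.replace("_", "-")
            options.append(flag if value is True else "%s=%s" % (flag, value))
    return options


curl_options = kwargs_to_options(o="page.html", silent=True)
adduser_options = kwargs_to_options(
    system=True, shell="/bin/bash", no_create_home=True
)

print(curl_options)
print(adduser_options)
```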
Exit Codes & Exceptions
Normal processes exit with exit code 0. This can be seen from
RunningCommand.exit_code:
output = ls("/", _return_cmd=True)
print(output.exit_code) # should be 0
If a process terminates, and the exit code is not 0, an exception is generated
dynamically. This lets you catch a specific return code, or catch all error
return codes through the base class ErrorReturnCode:
try:
    print(ls("/some/non-existent/folder"))
except ErrorReturnCode_2:
    print("folder doesn't exist!")
    create_the_folder()
except ErrorReturnCode:
    print("unknown error")
You can also customize which exit codes indicate an error with _ok_code. For example:
for i in range(10):
    sh.grep("string to check", f"file_{i}.txt", _ok_code=(0, 1))
Here, _ok_code=(0, 1) makes a failure to find a match (grep's exit code 1) a no-op instead of an error.
An exception is also raised whenever your process is terminated by a signal. The exception raised in this situation is SignalException, which subclasses ErrorReturnCode.
try:
    p = sh.sleep(3, _bg=True)
    p.kill()
    p.wait()  # the exception surfaces when we wait on the killed process
except sh.SignalException_SIGKILL:
    print("killed")
This behavior can be suppressed by adding the negative value of the signal to
_ok_code. The signals that raise SignalException are SIGABRT, SIGBUS, SIGFPE,
SIGILL, SIGINT, SIGKILL, SIGPIPE, SIGQUIT, SIGSEGV, and SIGTERM.
Note
You can catch SignalException by using either a number or a signal name. For example, the following two exception classes are equivalent:
assert sh.SignalException_SIGKILL == sh.SignalException_9
Redirection
sh can redirect the STDOUT and STDERR of a process to many different types of targets, using the _out and _err special kwargs.
If a string is used, it is assumed to be a filename. The filename is opened as “wb”, meaning truncate-write and binary mode.
import sh
sh.ifconfig(_out="/tmp/interfaces")
You may also use any object that supports .write(data), like
io.StringIO:
import sh
from io import StringIO
buf = StringIO()
sh.ifconfig(_out=buf)
print(buf.getvalue())
A callback function may also be used as a target. The function must conform to one of three signatures:
- fn(data)
The function takes just the chunk of data from the process.
- fn(data, stdin_queue)
In addition to the previous signature, the function also takes a
queue.Queue, which may be used to communicate programmatically with the process.
- fn(data, stdin_queue, process)
In addition to the previous signature, the function takes a
weakref.ref to the OProc object.
Asynchronous Execution
sh provides a few methods for running commands and obtaining output in a non-blocking fashion.
AsyncIO
Added in version 2.0.0.
sh supports asyncio on commands with the _async=True special
kwarg. This lets you incrementally await output produced by your command.
import asyncio
import sh
async def main():
    await sh.sleep(3, _async=True)
asyncio.run(main())
Incremental Iteration
You may also create asynchronous commands by iterating over them with the _iter special kwarg. This creates an iterable (specifically, a generator) that you can loop over:
from sh import tail
# runs forever
for line in tail("-f", "/var/log/some_log_file.log", _iter=True):
    print(line)
By default, _iter iterates over STDOUT, but you can choose the stream
explicitly by passing either "err" or "out" to _iter (instead of
True). Also by default, output is line-buffered, so the body of the loop
will only run when your process produces a newline. You can change this by
changing the buffer size of the command’s output with _out_bufsize.
Note
If you need a fully non-blocking iterator, use _iter_noblock. If
the current iteration would block, errno.EWOULDBLOCK will be
returned, otherwise you’ll receive a chunk of output, as normal.
Background Processes
By default, each running command blocks until completion. If you have a long-running command, you can put it in the background with the _bg=True special kwarg:
# blocks
sleep(3)
print("...3 seconds later")
# doesn't block
p = sleep(3, _bg=True)
print("prints immediately!")
p.wait()
print("...and 3 seconds later")
Notice that you need to call RunningCommand.wait() if you want your script to
block until the command exits.
Commands launched in the background ignore SIGHUP, meaning that when their
controlling process (the session leader, if there is a controlling terminal)
exits, they will not be signalled by the kernel. If a command is launched in
its own session with _new_session=True, it is its own session leader, so
ignoring SIGHUP will normally have no impact. With _new_session=False (the
default since 2.0.0), the controlling process will probably be the shell
from which you launched python, and exiting that shell would normally send a
SIGHUP to all child processes.
See also
For more information on the exact launch process, see Architecture Overview.
In combination with _bg=True, sh can use callbacks to process output incrementally by passing a callable function to _out and/or _err. This callable will be called for each line (or chunk) of data that your command outputs:
from sh import tail
def process_output(line):
    print(line)
p = tail("-f", "/var/log/some_log_file.log", _out=process_output, _bg=True)
p.wait()
To control whether the callback receives a line or a chunk, use
_out_bufsize. To “quit” your callback, simply return True. This
tells the command not to call your callback anymore.
The line or chunk received by the callback is either a str or
bytes. If the output can be decoded using the provided encoding, a
str is passed to the callback; otherwise the raw bytes are passed.
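Because a callback may receive either type, it can help to normalize the chunk before using it. The following is a minimal sketch; `as_text` is a hypothetical helper, and the choice of UTF-8 with replacement characters is an assumption rather than anything sh prescribes. The callback is invoked by hand here so the example runs without launching a process.

```python
def as_text(chunk, encoding="utf-8"):
    """Return chunk as str, decoding bytes leniently if necessary."""
    if isinstance(chunk, bytes):
        # undecodable bytes become U+FFFD instead of raising
        return chunk.decode(encoding, errors="replace")
    return chunk


seen = []


def process_output(chunk):
    # same one-argument signature as an _out callback
    seen.append(as_text(chunk).rstrip("\n"))


# simulate what a callback might be handed: a decoded line, then raw bytes
process_output("a decoded line\n")
process_output(b"raw \xff bytes\n")
print(seen)
```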
Note
Returning True does not kill the process, it only keeps the callback
from being called again. See Interactive callbacks for how to kill a
process from a callback.
Commands may communicate with the underlying process interactively through a
specific callback signature.
Each command launched through sh has an internal STDIN queue.Queue
that can be used from callbacks:
def interact(line, stdin):
    if line == "What... is the air-speed velocity of an unladen swallow?":
        stdin.put("What do you mean? An African or European swallow?")
    elif line == "Huh? I... I don't know that....AAAAGHHHHHH":
        cross_bridge()
        return True
    else:
        stdin.put("I don't know....AAGGHHHHH")
        return True
p = sh.bridgekeeper(_out=interact, _bg=True)
p.wait()
Note
If you use a queue, you can signal the end of the input (EOF) with None.
You can also kill or terminate your process (or send any signal, really) from your callback by adding a third argument to receive the process object:
def process_output(line, stdin, process):
    print(line)
    if "ERROR" in line:
        process.kill()
        return True
p = tail("-f", "/var/log/some_log_file.log", _out=process_output, _bg=True)
The above code will run, printing lines from some_log_file.log until the
word "ERROR" appears in a line, at which point the tail process will be
killed and the script will end.
Note
You may also use RunningCommand.terminate() to send a SIGTERM, or
RunningCommand.signal() to send a general signal.
A done callback is called when the process exits, either normally (through a success or error exit code) or through a signal. It is always called.
Here’s an example of using _done to create a multiprocess pool, where
sh.your_parallel_command is executed concurrently at no more than 10 at a
time:
import sh
from threading import Semaphore
pool = Semaphore(10)
def done(cmd, success, exit_code):
    pool.release()
def do_thing(arg):
    pool.acquire()
    return sh.your_parallel_command(arg, _bg=True, _done=done)
procs = []
for arg in range(100):
    procs.append(do_thing(arg))
# essentially a join
for p in procs:
    p.wait()
Baking
sh is capable of “baking” arguments into commands. This is essentially
partial application,
like you might do with functools.partial().
from sh import ls
ls = ls.bake("-la")
print(ls) # "/usr/bin/ls -la"
# resolves to "ls -la /"
print(ls("/"))
The idea here is that now every call to ls will have the “-la” arguments
already specified. Baking can become very useful when you combine it with
Sub-commands:
from sh import ssh
# calling whoami on a server. this is a lot to type out, especially if
# you wanted to call many commands (not just whoami) back to back on
# the same server
iam1 = ssh("myserver.com", "-p", "1393", "whoami")
# wouldn't it be nice to bake the common parameters into the ssh command?
myserver = ssh.bake("myserver.com", p=1393)
print(myserver) # "/usr/bin/ssh myserver.com -p 1393"
# resolves to "/usr/bin/ssh myserver.com -p 1393 whoami"
iam2 = myserver.whoami()
assert(iam1 == iam2) # True!
Now that the “myserver” callable represents a baked ssh command, you can call anything on the server easily:
# executes "/usr/bin/ssh myserver.com -p 1393 tail /var/log/dumb_daemon.log -n 100"
print(myserver.tail("/var/log/dumb_daemon.log", n=100))
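For readers who know functools.partial(), the analogy can be made concrete. The sketch below does not use sh at all; `build_argv` is a hypothetical stand-in that only assembles an argument list, so that the "baked" leading arguments are easy to see.

```python
from functools import partial


def build_argv(program, *args):
    """Hypothetical stand-in for a command: just returns the argv it would run."""
    return [program, *args]


# "bake" the server and port in, much like ssh.bake("myserver.com", p=1393)
myserver = partial(build_argv, "/usr/bin/ssh", "myserver.com", "-p", "1393")

whoami_argv = myserver("whoami")
tail_argv = myserver("tail", "/var/log/dumb_daemon.log", "-n", "100")
print(whoami_argv)
print(tail_argv)
```

Each call appends its own arguments after the baked ones, which is the same ordering the resolved command lines above show.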
Piping
Bash style piping is performed using function composition. Just pass one
command as the input to another’s _in argument, and sh will send the output of
the inner command to the input of the outer command:
# sort this directory by biggest file
print(sort("-rn", _in=du(glob("*"), "-sb")))
# print the number of folders and files in /etc
print(wc("-l", _in=ls("/etc", "-1")))
Note
This basic piping does not flow data through asynchronously; the inner command blocks until it finishes, before sending its data to the outer command.
By default, a command that receives another command as its input waits for that command to complete. This behavior can be changed with the _piped special kwarg on the command being piped, which tells it not to complete before sending its data, but to send its data incrementally. Read ahead for examples of this.
By default, all piped commands execute sequentially. What this means is that the inner command executes first, then sends its data to the outer command:
print(wc("-l", _in=ls("/etc", "-1")))
In the above example, ls executes, gathers its output, then sends that output
to wc. This is fine for simple commands, but for commands where you need
parallelism, this isn’t good enough. Take the following example:
for line in tr("[:upper:]", "[:lower:]", _in=tail("-f", "test.log"), _iter=True):
    print(line)
This won’t work because the tail -f command never finishes. What you
need is for tail to send its output to tr as it receives it. This is where
the _piped special kwarg comes in handy:
for line in tr("[:upper:]", "[:lower:]", _in=tail("-f", "test.log", _piped=True), _iter=True):
    print(line)
This works by telling tail -f that it is being used in a pipeline, and that
it should send its output line-by-line to tr. By default, _piped sends
STDOUT, but you can easily make it send STDERR instead by using _piped="err".
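For comparison, here is roughly what a pipeline whose two stages run concurrently looks like with the standard library's subprocess module. This is not how sh implements _piped; it only illustrates the idea of the first process streaming into the second while both are alive. The two small Python one-liners are placeholders standing in for real programs.

```python
import subprocess
import sys

# first stage: writes a line to its STDOUT
producer = subprocess.Popen(
    [sys.executable, "-c", "print('sh is awesome')"],
    stdout=subprocess.PIPE,
)

# second stage: reads the first stage's STDOUT as its STDIN and upper-cases it
consumer = subprocess.Popen(
    [sys.executable, "-c", "import sys; sys.stdout.write(sys.stdin.read().upper())"],
    stdin=producer.stdout,
    stdout=subprocess.PIPE,
)

# close our copy so the consumer sees EOF once the producer exits
producer.stdout.close()
output, _ = consumer.communicate()
producer.wait()
print(output)
```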
Sub-commands
Many programs have their own command subsets, like git (branch, checkout), svn (update, status), and sudo (where any command following sudo is considered a sub-command). sh handles subcommands through attribute access:
from sh import git, sudo
# resolves to "git branch -v"
print(git.branch("-v"))
print(git("branch", "-v")) # the same command
# resolves to "sudo /bin/ls /root"
print(sudo.ls("/root"))
print(sudo("/bin/ls", "/root")) # the same command
Sub-commands are mainly syntax sugar that makes calling some programs look conceptually nicer.
See also
If you’re using sudo as a subcommand, please be sure to see Using Sudo.
Default Arguments
Often, you want to override the default arguments of all commands launched
through sh. For example, suppose you want the output of all commands to be
aggregated into an io.StringIO buffer. The naive way would be this:
import sh
from io import StringIO
buf = StringIO()
sh.ls("/", _out=buf)
sh.whoami(_out=buf)
sh.ps("auxwf", _out=buf)
Clearly, this gets tedious quickly. Fortunately, we can create execution contexts that allow us to set default arguments on all commands spawned from that context:
import sh
from io import StringIO
buf = StringIO()
sh2 = sh.bake(_out=buf)
sh2.ls("/")
sh2.whoami()
sh2.ps("auxwf")
Now, anything launched from sh2 will send its output to the StringIO
instance buf.
Environments
The _env special kwarg allows you to pass a dictionary of environment variables and their corresponding values:
import sh
sh.google_chrome(_env={"SOCKS_SERVER": "localhost:1234"})
_env replaces your process’s environment completely. Only the key-value pairs in _env will be used for its environment. If you want to add new environment variables for a process in addition to your existing environment, try something like this:
import os
import sh
new_env = os.environ.copy()
new_env["SOCKS_SERVER"] = "localhost:1234"
sh.google_chrome(_env=new_env)
See also
To make an environment apply to all sh commands, see Default Arguments.
Input via STDIN
STDIN is sent to a process directly by using a command’s _in special kwarg:
print(cat(_in="test"))
Any command that takes input from STDIN can be used this way:
print(tr("[:lower:]", "[:upper:]", _in="sh is awesome"))
You’re also not limited to using just strings. You may use a file object, a
queue.Queue, or any iterable (list, set, dictionary, etc):
stdin = ["sh", "is", "awesome"]
out = tr("[:lower:]", "[:upper:]", _in=stdin)
Note
If you use a queue, you can signal the end of the queue (EOF) with None.
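The None sentinel convention can be sketched with the standard library alone. In the example below, `feed` plays the role of your code filling the queue, and `drain` is a hypothetical stand-in for the side that reads from it (with sh, you would pass the queue as _in instead of draining it yourself).

```python
import queue
import threading


def feed(stdin_queue, lines):
    """Put each line on the queue, then None to signal EOF."""
    for line in lines:
        stdin_queue.put(line)
    stdin_queue.put(None)


def drain(stdin_queue):
    """Stand-in reader: collect items until the None sentinel arrives."""
    received = []
    while True:
        item = stdin_queue.get()
        if item is None:
            break
        received.append(item)
    return received


q = queue.Queue()
producer = threading.Thread(target=feed, args=(q, ["sh\n", "is\n", "awesome\n"]))
producer.start()
received = drain(q)
producer.join()
print(received)
```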
‘With’ Contexts
Commands can be run within a Python with context. Popular commands using
this might be sudo or fakeroot:
with sh.contrib.sudo:
    print(ls("/root"))
If you need to run a command in a with context and pass in arguments, for example, specifying a -p prompt with sudo, you need to use the _with=True special kwarg. This lets the command know that it’s being run from a with context so it can behave correctly:
with sh.contrib.sudo(k=True, _with=True):
    print(ls("/root"))
Special Kwargs
These arguments alter a command’s behavior. They are not passed to the program. You can use them on any command that you run, but some may not be used together. sh will tell you if there are conflicts.
To set default special keyword arguments on every command run, you may use Default Arguments.
Controlling Output
_out
Default value: None
What to redirect STDOUT to. If this is a string, it will be treated as a file
name. You may also pass a file object (or file-like object), an int
(representing a file descriptor, like the result of os.pipe()), an
io.StringIO object, or a callable.
import sh
sh.ls(_out="/tmp/output")
_err
Default value: None
What to redirect STDERR to. See _out.
_err_to_out
Default value: False
If True, duplicate the file descriptor bound to the process’s STDOUT also to
STDERR, effectively causing STDERR and STDOUT to go to the same place.
_encoding
Default value: sh.DEFAULT_ENCODING
The character encoding of the process’s STDOUT. By default, this is the locale’s default encoding.
_decode_errors
Added in version 1.07.0.
Default value: "strict"
This is how Python should handle decoding errors of the process’s output.
By default, this is "strict", but you can use any value that’s valid
to bytes.decode(), such as "ignore".
_tee
Added in version 1.07.0.
Default value: None
As of 1.07.0, any time redirection is used, either for STDOUT or STDERR, the
respective internal buffers are not filled. For example, if you’re downloading
a file and using a callback on STDOUT, neither the internal STDOUT buffer nor the
pipe buffer will be filled with data from STDOUT. This option forces one of stderr
(_tee='err') or stdout (_tee='out' or _tee=True) to be filled
anyway, in effect “tee-ing” the output into two places (the callback/redirect
handler, and the internal buffers).
_truncate_exc
Added in version 1.12.0.
Default value: True
Whether or not exception output should be truncated.
Execution
_fg
Added in version 1.12.0.
Default value: False
Runs a command in the foreground, meaning it is spawned using os.spawnle(). The current process’s STDIN/OUT/ERR
is os.dup2()’d to the new process and so the new process becomes the foreground of the shell executing the
script. This is only really useful when you want to launch a lean, interactive process that sh is having trouble
running, for example, ssh.
Warning
_fg=True side-steps a lot of sh’s functionality. You will not be returned a process object and most (likely
all) other special kwargs will not work.
If you are looking for similar functionality, but still retaining sh’s features, use the following:
import sh
import sys
sh.your_command(_in=sys.stdin, _out=sys.stdout, _err=sys.stderr)
_bg
Default value: False
Runs a command in the background. The command will return immediately, and you
will have to run RunningCommand.wait() on it to ensure it terminates.
_bg_exc
Added in version 1.12.9.
Default value: True
Automatically report exceptions for the background command. If you set this to
False you should make sure to call RunningCommand.wait() or you may
swallow exceptions that happen in the background command.
_async
Added in version 2.0.0.
Default value: False
Allows your command to become awaitable. Use in combination with _iter
and async for to incrementally await output as it is produced.
_env
Default value: None
A dictionary defining the only environment variables that will be made accessible to the process. If not specified, the calling process’s environment variables are used.
Note
This dictionary is the authoritative environment for the process. If you wish to change a single variable in your current environment, you must pass a copy of your current environment with the overridden variable to sh.
_timeout
Default value: None
How much time, in seconds, we should give the process to complete. If the process does not finish within the timeout, it will be sent the signal defined by _timeout_signal.
_timeout_signal
Default value: signal.SIGKILL
The signal sent to the process when _timeout is set and the process does not finish in time.
_cwd
Default value: None
A string that sets the current working directory of the process.
_ok_code
Default value: 0
Either an integer, a list, or a tuple containing the exit code(s) that are considered “ok”, or in other words: do not raise an exception. Some misbehaved programs use exit codes other than 0 to indicate success.
import sh
sh.weird_program(_ok_code=[0,3,5])
If the process is killed by a signal, a SignalException is raised by default. This behavior can be suppressed by adding a negative number that represents the signal to _ok_code.
import sh
# the process won't raise SignalException if SIGINT, SIGKILL, or SIGTERM
# are sent to kill the process
p = sh.sleep(3, _bg=True, _ok_code=[0, -2, -9, -15])
# No exception will be raised here
p.kill()
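The magic numbers in the list above are just negated signal numbers, so they can be built from the names in the standard signal module instead. `ok_codes` below is a hypothetical helper, not part of sh; the resulting list is what you would pass as _ok_code.

```python
import signal


def ok_codes(*signals, exit_codes=(0,)):
    """Build an _ok_code-style list: normal exit codes plus negated signals."""
    return list(exit_codes) + [-int(sig) for sig in signals]


codes = ok_codes(signal.SIGINT, signal.SIGKILL, signal.SIGTERM)
print(codes)
```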
_new_session
Default value: False
Determines if our forked process will be executed in its own session via
os.setsid().
Changed in version 2.0.0: The default value of _new_session was changed from True to False
because it makes more sense for a launched process to default to being in
the process group of the python script, so that it receives SIGINTs correctly.
_new_group
Default value: False
Determines if our forked process will be executed in its own group via os.setpgid().
_uid
Added in version 1.12.0.
Default value: None
The user id to assume before the child process calls os.execv().
_preexec_fn
Added in version 1.12.0.
Default value: None
A function to be run directly before the child process calls os.execv().
Typically not used by normal users.
_pass_fds
Added in version 1.13.0.
Default value: {} (empty set)
A whitelist iterable of integer file descriptors to be inherited by the child. Passing anything in this argument causes _close_fds to be True.
_close_fds
Added in version 1.13.0.
Default value: True
Causes all inherited file descriptors besides stdin, stdout, and stderr to be automatically closed. This option is automatically enabled when _pass_fds is given a value.
Communication
_in
Default value: None
Specifies an argument for the process to use as its standard input. This may be
a string, a queue.Queue, a file-like object, or any iterable.
_piped
Default value: None
May be True, "out", or "err". Signals a command that it is being
used as the input to another command, so it should return its output
incrementally as it receives it, instead of aggregating it all at once.
_iter
Default value: None
May be True, "out", or "err". Puts a command in iterable mode. In
this mode, you can use a for or while loop to iterate over a command’s
output in real-time.
import sh
for line in sh.cat("/tmp/file", _iter=True):
    print(line)
_iter_noblock
Default value: None
Same as _iter, except the loop will not block if there is no
output to iterate over. Instead, the output from the command will be
errno.EWOULDBLOCK.
import sh
import errno
import time
for line in sh.tail("-f", "stuff.log", _iter_noblock=True):
    if line == errno.EWOULDBLOCK:
        print("doing something else...")
        time.sleep(0.5)
    else:
        print("processing line!")
_with
Default value: False
Explicitly tells us that we’re running a command in a with context. This is
only necessary if you’re using a command in a with context and passing
parameters to it.
import sh
with sh.contrib.sudo(password="abc123", _with=True):
    print(sh.ls("/root"))
_done
Added in version 1.11.0.
Default value: None
A callback that is always called when the command completes, even if it completes with an exit code that would raise an exception. After the callback is run, any exception that would be raised is raised.
The callback is passed the RunningCommand instance, a boolean indicating success, and the exit code.
Here’s an example of using _done to create a multiprocess pool, where
sh.your_parallel_command is executed concurrently at no more than 10 at a
time:
import sh
from threading import Semaphore
pool = Semaphore(10)
def done(cmd, success, exit_code):
    pool.release()
def do_thing(arg):
    pool.acquire()
    return sh.your_parallel_command(arg, _bg=True, _done=done)
procs = []
for arg in range(100):
    procs.append(do_thing(arg))
# essentially a join
for p in procs:
    p.wait()
TTYs
_tty_in
Default value: False, meaning an os.pipe() will be used.
If True, sh creates a TTY for STDIN, essentially emulating a terminal, as if
your command was entered from the commandline. This is necessary for commands
that require STDIN to be a TTY.
_tty_out
Default value: True
If True, sh creates a TTY for STDOUT, otherwise it uses an os.pipe(). This
is necessary for commands that require STDOUT to be a TTY.
_unify_ttys
Added in version 1.13.0.
Default value: False
If True, sh will combine the STDOUT and STDIN TTY into a single
pseudo-terminal. This is sometimes required by picky programs which expect to be
dealing with a single pseudo-terminal, like SSH.
_tty_size
Default value: (20, 80)
The (rows, columns) of stdout’s TTY. Changing this may affect how much your program prints per line, for example.
Performance & Optimization
_in_bufsize
Default value: 0
The STDIN buffer size. 0 for unbuffered, 1 for line buffered, anything else for a buffer of that amount.
_out_bufsize
Default value: 1
The STDOUT buffer size. 0 for unbuffered, 1 for line buffered, anything else for a buffer of that amount.
_err_bufsize
Default value: 1
Same as _out_bufsize, but with STDERR.
_internal_bufsize
Default value: 3 * 1024**2 chunks
How much of STDOUT/ERR your command will store internally. This value represents the number of bufsize chunks, not the total number of bytes. For example, if this value is 100, and STDOUT is line buffered, you will be able to retrieve 100 lines from STDOUT. If STDOUT is unbuffered, you will be able to retrieve only 100 characters.
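The difference between counting chunks and counting bytes can be illustrated with a bounded deque. This is only a model of the behavior described above: the helper name `keep_last_chunks` is hypothetical, and the assumption that the oldest chunks are the ones discarded is mine, not something this page states.

```python
from collections import deque


def keep_last_chunks(chunks, internal_bufsize):
    """Store at most internal_bufsize chunks, dropping the oldest first."""
    buf = deque(maxlen=internal_bufsize)
    for chunk in chunks:
        buf.append(chunk)
    return "".join(buf)


output = "".join(f"line {i}\n" for i in range(5))

# line buffered: every chunk is a whole line
kept_lines = keep_last_chunks(output.splitlines(keepends=True), 3)

# unbuffered: every chunk is a single character
kept_chars = keep_last_chunks(list(output), 3)

print(repr(kept_lines))
print(repr(kept_chars))
```

With the same limit of 3, the line-buffered case retains three full lines while the unbuffered case retains only three characters.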
_no_out
Added in version 1.07.0.
Default value: False
Disables STDOUT being internally stored. This is useful for commands that produce huge amounts of output that you don’t need, that would otherwise be hogging memory if stored internally by sh.
_no_err
Added in version 1.07.0.
Default value: False
Disables STDERR being internally stored. This is useful for commands that produce huge amounts of output that you don’t need, that would otherwise be hogging memory if stored internally by sh.
_no_pipe
Added in version 1.07.0.
Default value: False
Similar to _no_out, this explicitly tells the sh command that it will never
be used for piping its output into another command, so it should not fill its
internal pipe buffer with the process’s output. This is also useful for
conserving memory.
Program Arguments
These are options that affect how command options are fed into the program.
_long_sep
Added in version 1.12.0.
Default value: "="
This is the character(s) that separate a program’s long argument’s key from the
value, when using kwargs to specify your program’s long arguments. For example,
if your program expects a long argument in the form --name value, the way to
achieve this would be to set _long_sep=" ".
import sh
sh.your_program(key=value, _long_sep=" ")
Would send the following list of arguments to your program:
["--key value"]
If your program expects the long argument name to be separate from its value,
pass None into _long_sep instead:
import sh
sh.your_program(key=value, _long_sep=None)
Would send the following list of arguments to your program:
["--key", "value"]
_long_prefix
Added in version 1.12.0.
Default value: "--"
This is the character(s) that prefix a long argument for the program being run. Some programs use single dashes, for example, and do not understand double dashes.
_arg_preprocess
Added in version 1.12.0.
Default value: None
This is an advanced option that allows you to rewrite a command’s arguments on the fly, based on other command arguments, or some other variable. It is really only useful in conjunction with baking, and only currently used when constructing contrib wrappers.
Example:
import sh
def processor(args, kwargs):
    return args, kwargs
my_ls = sh.ls.bake(_arg_preprocess=processor)
Warning
The interface to the _arg_preprocess function may change without
warning. It is generally only for internal sh use, so don’t use it unless
you absolutely have to.
Misc
_log_msg
Default value: None
Added in version 1.12.0.
This allows for a custom logging header for Command Class instances. For example, the default logging looks like this:
import logging
import sh
logging.basicConfig(level=logging.INFO)
sh.ls("-l")
INFO:sh.command:<Command '/bin/ls -l'>: starting process
INFO:sh.command:<Command '/bin/ls -l', pid 28952>: process started
INFO:sh.command:<Command '/bin/ls -l', pid 28952>: process completed
People can find this <Command .. section long and not relevant. _log_msg allows you to customize this:
import logging
import sh
logging.basicConfig(level=logging.INFO)
def custom_log(ran, call_args, pid=None):
    return ran
sh.ls("-l", _log_msg=custom_log)
INFO:sh.command:/bin/ls -l: starting process
INFO:sh.command:/bin/ls -l: process started
INFO:sh.command:/bin/ls -l: process completed
The first argument, ran, is the program’s execution string and arguments, as close as we can get it to be how you’d
type in the shell. call_args is a dictionary of all of the special kwargs that were passed to the command. And pid
is the process id of the forked process. It defaults to None because the _log_msg callback is actually called
twice: first to construct the logger for the RunningCommand Class instance, before the process itself is spawned, then
a second time after the process is spawned via OProc Class, when we have a pid.
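Since the callback is called once without a pid and once with one, a custom header can include the pid only when it is known. The sketch below is one possible variation on the custom_log example; the exact message format is an arbitrary choice, and the function is called by hand here to show both cases.

```python
def custom_log(ran, call_args, pid=None):
    """Return a short log header, adding the pid once it is available."""
    if pid is None:
        # first call: the process has not been spawned yet
        return ran
    # second call: the process exists, so we can mention its pid
    return f"{ran} (pid {pid})"


before_spawn = custom_log("/bin/ls -l", {})
after_spawn = custom_log("/bin/ls -l", {}, pid=28952)
print(before_spawn)
print(after_spawn)
```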
Architecture Overview
Launch
When it comes time to launch a process:
- Open pipes and/or TTYs for STDIN/OUT/ERR.
- Open a pipe for communicating pre-exec exceptions from the child to the parent.
- Open a pipe for child/parent launch synchronization.
- os.fork() a child process.
From here, we have two concurrent processes running.
In the child:
- If _bg=True is set, we ignore signal.SIGHUP.
- If _new_session=True, become a session leader with os.setsid(), else become a process group leader with os.setpgrp().
- Write our session id to a pipe connected to the parent. This is mainly to synchronize with our parent that our session/group logic has finished.
- os.dup2() the file descriptors of our previously-setup TTYs/pipes to our STDIN/OUT/ERR file descriptors.
- If we’re a session leader and our STDIN is a TTY, via _tty_in=True, acquire a controlling terminal, thereby becoming the controlling process of the session.
- Set our GID/UID if we’ve set a custom one via _uid.
- Close all file descriptors greater than STDERR.
- Call os.execv().
In the parent:
- Check for any exceptions via the exception pipe connected to the child.
- Block and read our child’s session id from a pipe connected to the child. This synchronizes to us that the child has finished moving between sessions/groups and we can now accurately determine its current session id and process group.
- If we’re using a TTY for STDIN, via _tty_in=True, disable echoing on the TTY, so that data sent to STDIN is not echoed to STDOUT.
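A heavily reduced version of this launch sequence can be written with the os module. The sketch below is an illustration under simplifying assumptions, not sh's code: it handles only STDOUT, always creates a new session, and omits the exception pipe, TTYs, SIGHUP handling, UID changes, and closing of extra file descriptors. The function name `launch` is hypothetical.

```python
import os
import sys


def launch(argv):
    """Fork and exec argv; return (child_sid, stdout_bytes, exit_code)."""
    out_r, out_w = os.pipe()    # carries the child's STDOUT
    sync_r, sync_w = os.pipe()  # child/parent launch synchronization

    pid = os.fork()
    if pid == 0:
        # child process
        try:
            os.close(out_r)
            os.close(sync_r)
            os.setsid()  # become a session leader
            os.write(sync_w, str(os.getsid(0)).encode())
            os.close(sync_w)
            os.dup2(out_w, 1)  # the pipe becomes our STDOUT
            os.close(out_w)
            os.execv(argv[0], argv)
        finally:
            os._exit(127)  # reached only if something above failed

    # parent process
    os.close(out_w)
    os.close(sync_w)
    child_sid = int(os.read(sync_r, 64))  # blocks until the child has moved sessions
    os.close(sync_r)
    chunks = []
    while True:
        chunk = os.read(out_r, 4096)
        if not chunk:
            break
        chunks.append(chunk)
    os.close(out_r)
    _, status = os.waitpid(pid, 0)
    if os.WIFEXITED(status):
        exit_code = os.WEXITSTATUS(status)
    else:
        exit_code = -os.WTERMSIG(status)  # negative signal number
    return child_sid, b"".join(chunks), exit_code


child_sid, output, exit_code = launch([sys.executable, "-c", "print('hello')"])
print(child_sid, output, exit_code)
```

Reading the session id from the synchronization pipe is what lets the parent know the child has finished its session logic before it inspects the child any further.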
Running
An instance of OProc Class contains two internal threads, one for STDIN, and one for STDOUT and STDERR. The purpose of these threads is to handle reading/writing to the read/write ends of the process’s standard descriptors.
For example, the STDOUT/ERR thread continually runs select.select() on the
master ends of the TTYs/pipes connected to STDOUT/ERR, and if they’re ready to
read, reads the available data and aggregates it into the appropriate place.
A couple of different buffers must be considered when thinking about how data flows through an sh process.
The first buffer is the buffer associated with the underlying pipe or TTY attached to STDOUT/ERR. In the case of a TTY (the default for output), the buffer size is 0, so output is immediate: a byte written by the process is a byte received by sh. For a pipe, however, the buffer size is typically 4-64 KB (see pipe(2)).
The second buffer is sh’s internal buffers, one for STDOUT and one for STDERR. These buffers aggregate data that has been read from the master end of the TTY or pipe attached to the output fd, but before that data is sent along to the appropriate output handler (queue, file object, function, etc). Data sits in these buffers until we reach the size specified with _internal_bufsize, at which point the buffer flushes to the output handler.
Exit
On process completion, our internal threads must complete, as the read end of STDIN, for example, which is connected to the process, is no longer open, so writing to the slave end will no longer work.
The STDOUT/ERR thread is a little more complicated, because although the process
is not alive, output data may still exist in the pipe/TTY buffer that must be
collected. So we essentially just select.select() on the read ends until
they return nothing, indicating that they are complete, then we break out of our
read loop.
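The "select until the read ends return nothing" loop can be sketched as follows. This is a minimal model rather than sh's implementation: `drain` is a hypothetical helper, the timeout is an added safeguard, and two plain pipes with pre-written data stand in for the STDOUT/ERR of a process that has already exited.

```python
import os
import select


def drain(read_fds, timeout=1.0):
    """Read from each fd until it reports EOF; return {fd: bytes}."""
    collected = {fd: b"" for fd in read_fds}
    open_fds = set(read_fds)
    while open_fds:
        ready, _, _ = select.select(list(open_fds), [], [], timeout)
        if not ready:
            break  # nothing became readable within the timeout
        for fd in ready:
            chunk = os.read(fd, 4096)
            if chunk:
                collected[fd] += chunk
            else:
                open_fds.discard(fd)  # empty read means EOF
    return collected


# simulate leftover output sitting in the pipes after the writer is gone
out_r, out_w = os.pipe()
err_r, err_w = os.pipe()
os.write(out_w, b"leftover stdout\n")
os.write(err_w, b"leftover stderr\n")
os.close(out_w)
os.close(err_w)

leftover = drain([out_r, err_r])
os.close(out_r)
os.close(err_r)
print(leftover)
```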
The exit code is obtained from the reaped process. If the process ended from a signal, the exit code is the negative value of that signal. For example, SIGKILL would result in an exit code -9.
If specified, the _done callback is executed with the RunningCommand instance, a boolean indicating success, and the adjusted exit code. After the callback returns, error processing continues. In other words, the done callback is called regardless of success or failure, and there’s nothing it can do to prevent the ErrorReturnCode exceptions from being raised after it completes.
API
Command Class
The Command class represents a program that exists on the system and can be
run at some point in time. An instance of Command is never running; an
instance of RunningCommand is spawned for that.
An instance of Command can take the form of a manually instantiated object,
or as an object instantiated by dynamic lookup:
import sh
ls1 = sh.Command("ls")
ls2 = sh.ls
assert ls1 == ls2
- class Command(name, search_paths=None)
Instantiates a Command instance, where name is the name of a program that exists on the user’s
$PATH, or is a full path itself. If search_paths is specified, it must be a list of all the paths to look for the program name.
from sh import Command
ifconfig = Command("ifconfig")
ifconfig = Command("/sbin/ifconfig")
- Command.bake(*args, **kwargs)
Returns a new Command with *args and **kwargs baked in as positional and keyword arguments, respectively. Any future calls to the returned Command will include *args and **kwargs automatically:
from sh import ls
long_ls = ls.bake("-l")
print(long_ls("/var"))
print(long_ls("/tmp"))
As with command calls, the arguments to sh.Command must be kept separate from the program name.
For example, the following does not work:
lscmd = sh.Command("/bin/ls -l")
tarcmd = sh.Command("/bin/tar cvf /tmp/test.tar /my/home/directory/")
You will run into a CommandNotFound(path) exception even when the correct full path is specified.
The correct way to do this is to:
- build the Command object using only the binary
- pass the arguments to the object when invoking it
as follows:
lscmd = sh.Command("/bin/ls")
lscmd("-l")
tarcmd = sh.Command("/bin/tar")
tarcmd("cvf", "/tmp/test.tar", "/my/home/directory/")
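If a command line arrives as a single string, the standard library's shlex.split can break it into the separate pieces this requires. The sketch below only does the splitting; the final comment shows how the pieces would be handed to sh, following the pattern above.

```python
import shlex

command_line = "/bin/tar cvf /tmp/test.tar /my/home/directory/"

# split using shell-like rules, then separate the binary from its arguments
program, *args = shlex.split(command_line)
print(program)
print(args)

# with sh installed, the pieces would then be used as:
#   sh.Command(program)(*args)
```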
RunningCommand Class
This represents a Command instance that has been
or is being executed. It exists as a wrapper around the low-level OProc. Most of your interaction with sh objects is with instances of
this class. It is only returned if _return_cmd=True when you execute a command.
Warning
Objects of this class behave very much like strings. This was an intentional design decision to make the “output” of an executing Command behave more intuitively.
Be aware that functions that accept real strings only, for example
json.dumps, will not work on instances of RunningCommand, even though they
look like strings.
- RunningCommand.wait(timeout=None)
- Parameters:
timeout – An optional non-negative number of seconds to wait for the command to complete. If it doesn’t complete by the timeout, we raise TimeoutException.
Block and wait for the command to finish execution and obtain an exit code. If the exit code represents a failure, we raise the appropriate exception. See exceptions.
Note
Calling this method multiple times only yields an exception on the first call.
This is called automatically by sh unless your command is being executed asynchronously, in which case, you may want to call this manually to ensure completion.
If an instance of Command is being used as the stdin argument (see piping), wait() is also called on that instance, and any exceptions resulting from that process are propagated up.
- RunningCommand.stdout
A @property that calls wait() and then returns the contents of what the process wrote to stdout.
- RunningCommand.stderr
A @property that calls wait() and then returns the contents of what the process wrote to stderr.
- RunningCommand.pid
The process id of the process.
- RunningCommand.sid
The session id of the process. This will typically be a different session than the current python process, unless _new_session=False was specified.
- RunningCommand.pgid
The process group id of the process.
- RunningCommand.ctty
The controlling terminal device, if there is one.
- RunningCommand.signal(sig_num)
Sends sig_num to the process. Typically used with a value from the signal module, like signal.SIGHUP (see signal(7)).
- RunningCommand.signal_group(sig_num)
Sends sig_num to every process in the process group. Typically used with a value from the signal module, like signal.SIGHUP (see signal(7)).
- RunningCommand.terminate()
Shortcut for
RunningCommand.signal(signal.SIGTERM).
- RunningCommand.kill()
Shortcut for
RunningCommand.signal(signal.SIGKILL).
- RunningCommand.kill_group()
Shortcut for
RunningCommand.signal_group(signal.SIGKILL).
OProc Class
Warning
Don’t use instances of this class directly. It is being documented here for posterity, not for direct use.
- OProc.wait()
Block until the process completes, aggregate the output, and populate
OProc.exit_code.
- OProc.stdout
A
collections.deque, sized to _internal_bufsize items, that contains the process’s STDOUT.
- OProc.stderr
A
collections.deque, sized to _internal_bufsize items, that contains the process’s STDERR.
- OProc.exit_code
Contains the process’s exit code, or None if the process has not yet exited.
- OProc.pid
The process id of the process.
- OProc.sid
The session id of the process. This will typically be a different session than the current python process, unless _new_session=False was specified.
- OProc.pgid
The process group id of the process.
- OProc.ctty
The controlling terminal device, if there is one.
- OProc.signal(sig_num)
Sends sig_num to the process. Typically used with a value from the signal module, like signal.SIGHUP (see signal(7)).
- OProc.signal_group(sig_num)
Sends sig_num to every process in the process group. Typically used with a value from the signal module, like signal.SIGHUP (see signal(7)).
- OProc.terminate()
Shortcut for
OProc.signal(signal.SIGTERM).
- OProc.kill()
Shortcut for
OProc.signal(signal.SIGKILL).
- OProc.kill_group()
Shortcut for
OProc.signal_group(signal.SIGKILL).
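OProc.stdout and OProc.stderr are described above as collections.deque objects sized to _internal_bufsize items. The standard-library sketch below shows what a bounded deque does once it is full. It assumes the bound is applied through `maxlen`, and the value 3 is an arbitrary stand-in for _internal_bufsize, not sh's default.

```python
from collections import deque

bufsize = 3                      # stand-in for _internal_bufsize
buf = deque(maxlen=bufsize)

for chunk in [b"one\n", b"two\n", b"three\n", b"four\n"]:
    buf.append(chunk)            # once full, the oldest chunk is discarded

print(list(buf))
```

The practical consequence is that only the most recent chunks of output are retained in such a buffer.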
Exceptions
- class ErrorReturnCode
This is the base class for, as the name suggests, error return codes. It subclasses
Exception.
- ErrorReturnCode.full_cmd
The full command that was executed, as a string, so that you can try it on the commandline if you wish.
- ErrorReturnCode.stdout
The total aggregated STDOUT for the process.
- ErrorReturnCode.stderr
The total aggregated STDERR for the process.
- ErrorReturnCode.exit_code
The process’s adjusted exit code.
See also
- class SignalException
Subclasses ErrorReturnCode. Raised when a command receives a signal that causes it to exit.
- class TimeoutException
Raised when a command specifies a non-null _timeout and the command times out:
import sh
try:
    sh.sleep(10, _timeout=1)
except sh.TimeoutException:
    print("we timed out, as expected")
Also raised when you specify a timeout to RunningCommand.wait(timeout=None):
import sh
p = sh.sleep(10, _bg=True)
try:
    p.wait(timeout=1)
except sh.TimeoutException:
    print("we timed out waiting")
    p.kill()
- class CommandNotFound
This exception is raised in one of the following conditions:
The program cannot be found on your path.
You do not have permissions to execute the program.
The program is not marked executable.
The last two bullets may seem strange, but they fall in line with how a shell like Bash behaves when looking up a program to execute.
Note
CommandNotFound subclasses AttributeError. As such, the repr of it is simply the name of the missing
attribute.
Helper Functions
- which(name, search_paths=None)
Resolves name to the program’s absolute path, or None if it cannot be found. If search_paths is a list of paths, use that list to look for the program, otherwise use the environment variable $PATH.
- pushd(directory)
This function provides a with context that behaves similarly to Bash’s pushd by pushing to the provided directory, and popping out of it at the end of the context.
import sh
with sh.pushd("/tmp"):
    sh.touch("a_file")
Note
Note that we use a reentrant lock, so that different threads using this function will have the correct behavior inside of their with contexts.
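To illustrate why a reentrant lock matters here, the following is a simplified standard-library sketch of a pushd-style context manager. It is not sh's actual implementation, and `pushd_sketch` and `_cwd_lock` are hypothetical names.

```python
import os
import threading
from contextlib import contextmanager

# Reentrant, so the thread holding the lock can nest calls without deadlocking.
_cwd_lock = threading.RLock()

@contextmanager
def pushd_sketch(directory):
    with _cwd_lock:                  # other threads wait until we pop back out
        previous = os.getcwd()
        os.chdir(directory)
        try:
            yield
        finally:
            os.chdir(previous)       # always restore the previous directory

start = os.getcwd()
with pushd_sketch("/tmp"):
    inside = os.getcwd()
    with pushd_sketch("/"):          # nested use on the same thread
        nested = os.getcwd()
after = os.getcwd()
```

With a plain, non-reentrant lock, the nested call in this sketch would block forever, because the same thread already holds the lock.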