Full Documentation

This single page repeats the full documentation for sh <https://github.com/amoffat/sh/>, making it easier to put into an LLM’s context window. Nothing on this page is new; it is the content from elsewhere on this site, reorganized as a single page.

sh

sh is a full-fledged subprocess replacement for Python 3.10+ and PyPy that allows you to call any program as if it were a function:

from sh import git
print(git("status", "--short"))

Note that these aren’t Python functions. sh runs the binary commands on your system by dynamically resolving your $PATH, much like Bash does, and then wraps each binary in a function. In this way, all the programs on your system are easily available to you from within Python.
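
To get a feel for the $PATH lookup described above, the standard library’s shutil.which performs a comparable search. This is only an analogy: sh’s own resolution logic is not shown here and may differ in detail, and the helper name resolve_program is made up for illustration.

```python
import shutil


def resolve_program(name):
    """Return the path of `name` as found on $PATH, or None if it is missing."""
    # shutil.which walks the directories listed in $PATH, much like a shell does.
    return shutil.which(name)


# The exact path printed depends on the system this runs on.
print(resolve_program("ls"))
print(resolve_program("definitely-not-a-real-program-xyz"))
```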

sh relies on various Unix system calls and only works on Unix-like operating systems - Linux, macOS, BSDs etc. Specifically, Windows is not supported.

Installation

pip install sh

Quick Reference

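import sh

# keyword arguments become command-line options
sh.ls("-l", "/tmp", color="never")

# a non-zero exit code raises an exception
try:
    sh.ls("/doesnt/exist")
except sh.ErrorReturnCode_2:
    print("directory doesn't exist")

# redirect STDOUT to a filename
sh.ls(_out="/tmp/dir_contents")

# redirect STDOUT to a file object
with open("/tmp/dir_contents", "w") as h:
    sh.ls(_out=h)

# redirect STDOUT to a file-like object
from io import StringIO
buf = StringIO()
sh.ls(_out=buf)

# bake arguments into a reusable command
my_ls = sh.ls.bake("-l")

# equivalent
my_ls("/tmp")
sh.ls("-l", "/tmp")

# feed the output of one command into the STDIN of another
sh.wc("-l", _in=sh.ls("-1"))

# subcommands: equivalent
sh.git("show", "HEAD")
sh.git.show("HEAD")

# run a command in the background
p = sh.find("-name", "sh.py", _bg=True)
# ... do other things ...
p.wait()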


Entering an SSH password

Here we will attempt to SSH into a server and enter a password programmatically.

Note

It is recommended that you just ssh-copy-id to copy your public key to the server so you don’t need to enter your password, but for the purposes of this demonstration, we try to enter a password.

To interact with a process, we need to assign a callback to STDOUT. The callback signature we’ll use will take a queue.Queue object for the second argument, and we’ll use that to send STDIN back to the process.

See also

Function Callback

Here’s our first attempt:

from sh import ssh

def ssh_interact(line, stdin):
    line = line.strip()
    print(line)
    if line.endswith("password:"):
        stdin.put("correcthorsebatterystaple")

ssh("10.10.10.100", _out=ssh_interact)

If you run this (substituting an IP that you can SSH to), you’ll notice that nothing is printed from within the callback. The problem has to do with STDOUT buffering. By default, sh line-buffers STDOUT, which means that ssh_interact will only receive output when sh encounters a newline in the output. This is a problem because the password prompt has no newline:

amoffat@10.10.10.100's password:

Because a newline is never encountered, nothing is sent to the ssh_interact callback. So we need to change the STDOUT buffering. We do this with the _out_bufsize special kwarg. We’ll set it to 0 for unbuffered output:

from sh import ssh

def ssh_interact(line, stdin):
    line = line.strip()
    print(line)
    if line.endswith("password:"):
        stdin.put("correcthorsebatterystaple")

ssh("10.10.10.100", _out=ssh_interact, _out_bufsize=0)

If you run this updated version, you’ll notice a new problem. The output looks like this:

a
m
o
f
f
a
t
@
1
0
.
1
0
.
1
0
.
1
0
0
'
s

p
a
s
s
w
o
r
d
:

This is because the chunks of STDOUT our callback is receiving are unbuffered, and are therefore individual characters, instead of entire lines. What we need to do now is aggregate this character-by-character data into something more meaningful for us to test if the pattern password: has been sent, signifying that SSH is ready for input.

It would make sense to encapsulate the variable we’ll use for aggregating into some kind of closure or class, but to keep it simple, we’ll just use a global:

from sh import ssh
import sys

aggregated = ""
def ssh_interact(char, stdin):
    global aggregated
    sys.stdout.write(char)
    sys.stdout.flush()
    aggregated += char
    if aggregated.endswith("password: "):
        stdin.put("correcthorsebatterystaple")

ssh("10.10.10.100", _out=ssh_interact, _out_bufsize=0)

You’ll also notice that the example still doesn’t work. There are two problems: The first is that your password must end with a newline, as if you had typed it and hit the return key. This is because SSH has no idea how long your password is, and is line-buffering STDIN.

The second problem lies deeper in SSH. SSH needs a TTY attached to its STDIN in order to work properly. This tricks SSH into believing that it is interacting with a real user in a real terminal session. To enable TTY, we can add the _tty_in special kwarg. We also need to use the _unify_ttys special kwarg. This tells sh to make STDOUT and STDIN come from a single pseudo-terminal, which is a requirement of SSH:

from sh import ssh
import sys

aggregated = ""
def ssh_interact(char, stdin):
    global aggregated
    sys.stdout.write(char)
    sys.stdout.flush()
    aggregated += char
    if aggregated.endswith("password: "):
        stdin.put("correcthorsebatterystaple\n")

ssh("10.10.10.100", _out=ssh_interact, _out_bufsize=0, _tty_in=True, _unify_ttys=True)

And now our remote login script works!

amoffat@10.10.10.100's password:
Linux 10.10.10.100 testhost #1 SMP Tue Jun 21 10:29:24 EDT 2011 i686 GNU/Linux
Ubuntu 10.04.2 LTS

Welcome to Ubuntu!
 * Documentation:  https://help.ubuntu.com/

66 packages can be updated.
53 updates are security updates.

Ubuntu 10.04.2 LTS

Welcome to Ubuntu!
 * Documentation:  https://help.ubuntu.com/
You have new mail.
Last login: Thu Sep 13 03:53:00 2012 from some.ip.address
amoffat@10.10.10.100:~$
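
The global used in the walkthrough above can be avoided with a closure, as the text suggests. The following is a minimal sketch, not part of sh: the factory name make_prompt_responder and its parameters are made up for illustration. It only assumes the (char, stdin) callback signature used above, and it is exercised here with a plain queue.Queue instead of a real SSH session.

```python
import queue


def make_prompt_responder(password, prompt="password: "):
    """Build a (char, stdin) callback that answers a password prompt once."""
    buffer = ""        # characters received so far
    answered = False   # make sure the password is only sent once

    def on_char(char, stdin):
        nonlocal buffer, answered
        buffer += char
        if not answered and buffer.endswith(prompt):
            stdin.put(password + "\n")  # the newline stands in for the return key
            answered = True

    return on_char


# Exercise the callback without SSH by feeding it characters by hand.
fake_stdin = queue.Queue()
callback = make_prompt_responder("correcthorsebatterystaple")
for ch in "amoffat@10.10.10.100's password: ":
    callback(ch, fake_stdin)
```

Because the returned function has the same signature as ssh_interact, it could be passed as the _out callback in its place.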

The above process can be simplified by using one of the Contrib Commands. The SSH contrib command does all the ugly kwarg setup for you, and provides a simple but powerful interface for doing SSH password logins. Please see the SSH contrib command for more details about the exact API:

import sys

from sh.contrib import ssh

def ssh_interact(content, stdin):
    sys.stdout.write(content.cur_char)
    sys.stdout.flush()

# automatically logs in with password and then presents subsequent content to
# the ssh_interact callback
ssh("10.10.10.100", password="correcthorsebatterystaple", interact=ssh_interact)

Many people want to learn how to enter an SSH password by script because they want to execute remote commands on a server. Instead of trying to log in through SSH and then sending terminal input of the command to run, let’s see how we can do it another way.

First, open a terminal and run ssh-copy-id yourservername. You’ll be asked to enter your password for the server. After entering your password, you’ll be able to SSH into the server without needing a password again. This simplifies things greatly for sh.

The second thing we want to do is use SSH’s ability to pass a command to run to the server you’re SSHing to. Here’s how you can run ifconfig on a server without having to use that server’s shell directly:

ssh amoffat@10.10.10.100 ifconfig

Translating this to sh, it becomes:

import sh

print(sh.ssh("amoffat@10.10.10.100", "ifconfig"))

We can make this even nicer by taking advantage of sh’s Baking to bind our server username/ip to a command object:

import sh

my_server = sh.ssh.bake("amoffat@10.10.10.100")
print(my_server("ifconfig"))
print(my_server("whoami"))

Now we have a reusable command object that we can use to call remote commands. But there is room for one more improvement. We can also use sh’s Sub-commands feature which expands attribute access into command arguments:

import sh

my_server = sh.ssh.bake("amoffat@10.10.10.100")
print(my_server.ifconfig())
print(my_server.whoami())

Tailing a real-time log file

sh has the ability to respond to subprocesses in an event-driven fashion. A typical example of where this would be useful is tailing a log file for a specific pattern, then responding to that value immediately:

from sh import tail

for line in tail("-f", "info.log", _iter=True):
    if "ERROR" in line:
        send_an_email_to_support(line)

The _iter special kwarg takes a command that would normally block until completion, and turns its output into a real-time iterable.

See also

Incremental Iteration

Of course, you can do more than just tail log files. Any program that produces output can be iterated over. Say you wanted to send an email to a coworker if their C code emits a warning:

from sh import gcc, git

for line in gcc("-o", "awesome_binary", "awesome_source.c", _iter=True):
    if "warning" in line:
        # parse out the relevant info (the fields are strings)
        filename, lineno, col, message = line.split(":", 3)

        # find the commit using git
        commit = git("blame", "-e", filename, L=f"{lineno},{lineno}")

        # send them an email
        email_address = parse_email_from_commit_line(commit)
        send_email(email_address, message)
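
The parsing step in the loop above can be pulled out into a small function and tried on its own. This is a sketch under an assumption: it expects diagnostics shaped like file:line:column: message, and the function name parse_gcc_warning and the sample line are made up for illustration.

```python
def parse_gcc_warning(line):
    """Split 'file:line:col: message' into typed fields, or return None."""
    if "warning" not in line:
        return None
    parts = line.split(":", 3)
    if len(parts) != 4:
        return None
    filename, lineno, col, message = parts
    try:
        return filename, int(lineno), int(col), message.strip()
    except ValueError:
        # the line mentioned "warning" but did not carry numeric positions
        return None


sample = "awesome_source.c:12:5: warning: unused variable 'x' [-Wunused-variable]"
parsed = parse_gcc_warning(sample)
print(parsed)
```

Returning None for lines that do not fit the pattern keeps the calling loop simple: it can skip anything that is not a well-formed warning.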

Using _iter is a great way to respond to events from another program, but your script blocks while you’re looping, making you unable to do anything else. To be truly event-driven, sh provides callbacks:

from sh import tail

def process_log_line(line):
    if "ERROR" in line:
        send_an_email_to_support(line)

process = tail("-f", "info.log", _out=process_log_line, _bg=True)

# ... do other stuff here ...

process.wait()

The _out special kwarg lets you assign a callback to STDOUT. This callback will receive each line of output from tail -f and allow you to do the same processing that we did earlier.

See also

Output Callbacks

See also

Redirection

FAQ

To execute a shell builtin, run it through bash:

import sh

sh.bash("-c", "your_builtin")

Or

import sh

builtins = sh.bash.bake("-c")
builtins("your_builtin")

There are no plans to support Windows.

To append output to a file, use a file object opened in the mode you desire:

import sh

# "a" opens the file for appending
with open("/tmp/output", "a") as h:
    sh.ls("/dir1", _out=h)
    sh.ls("/dir2", _out=h)

If a command’s output unexpectedly contains color escape sequences, the typical reason is that your program detected that its STDOUT was connected to a TTY, and therefore decided to print color escape sequences. The typical solution is to use _tty_out=False, which will force a pipe to be connected to STDOUT, and probably change the behavior of the program.

See also

Git is one of the programs that makes extensive use of terminal colors (as well as pagers) in its output, so we added a contrib version for convenience.

Connecting STDOUT to a TTY by default was a design decision made for two reasons:

1. To make programs behave in the same way as seen on the commandline.
2. To provide better buffering control than pipes allow.

For #1, we want sh to produce output that is identical to what the user sees from the commandline, because that’s typically the only output they ever see from their command. This makes the output easy to understand.

For #2, using a TTY for STDOUT allows us to precisely control the buffering of a command’s output to sh’s internal code.

See also

Buffers

Of course, there are some gotchas with TTY STDOUT. One of them is commands that use a pager, for example:

import sh
print(sh.git.log())

This will sometimes raise a SignalException_SIGPIPE. The reason is that git log detects a TTY STDOUT and forks the system’s pager (typically less) to handle the output. The pager checks for a controlling terminal, and, finding none, exits with exit code 1. The exit of the pager means no more readers on git log’s output, and thus a SIGPIPE is received.

One solution to the git log problem above is simply to use _tty_out=False. Another option, specifically for git, is to use the git --no-pager option:

import sh
print(sh.git('--no-pager', 'log'))

Glob expansion is a feature of a shell, like Bash, and is performed by the shell before passing the results to the program to be exec’d. Because sh is not a shell, but rather a tool to execute programs directly, we do not handle glob expansion like a shell would.

So in order to use "*" like you would on the commandline, pass it into glob.glob() first:

import sh
import glob
sh.ls(glob.glob("*.py"))
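
To see what glob.glob actually hands over, the expansion can be run on its own against a scratch directory. This sketch uses only the standard library; the file names are made up for illustration.

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    for name in ("a.py", "b.py", "notes.txt"):
        open(os.path.join(tmp, name), "w").close()

    # The pattern is expanded here by Python, not by a shell.
    matches = sorted(glob.glob(os.path.join(tmp, "*.py")))
    names = [os.path.basename(m) for m in matches]

    # A pattern that matches nothing yields an empty list.
    no_match = glob.glob(os.path.join(tmp, "*.rs"))

print(names)
print(no_match)
```

Since an unmatched pattern produces an empty list rather than the literal pattern, it may be worth checking the result before passing it to a command.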

To run a program that isn’t in your $PATH, use the Command() constructor to create an instance of Command directly, then execute that:

import sh
cmd = sh.Command("/path/to/command")
cmd("-v", "arg1")

To run a program with a dash in its name that is in your $PATH, substitute an underscore for the dash:

import sh
sh.google_chrome("http://google.com")

The above will run google-chrome http://google.com

Note

If a program named google_chrome exists on your system, that will be called instead. In that case, in order to execute the program with a dash in the name, you’ll have to instantiate it with the Command class.

Programs with non-alphanumeric, non-dash characters in their names cannot be executed directly as an attribute on the sh module. For example, this will not work:

import sh
sh.mkfs.ext4()

The reason should be fairly obvious. In Python, characters like . have special meaning, in this case, attribute access. What sh is trying to do in the above example is find the program “mkfs” (which may or may not exist) and then perform a subcommand lookup with the name “ext4”. In other words, it will try to call mkfs with the argument ext4, which is probably not what you want.

The workaround is instantiating the Command Class with the string of the program you’re looking for:

import sh
mkfsext4 = sh.Command("mkfs.ext4")
mkfsext4() # run it

sh does not use the | operator for piping because I prefer the syntax of sh to resemble function composition instead of a pipeline. One of the goals of sh is to make executing processes more like calling functions, not making function calls more like Bash.

There is a non-obvious reason why async piping is not possible by default. Consider the following example:

import sh

sh.cat(sh.echo("test\n1\n2\n3\n"))

When this is run, sh.echo executes and finishes, then the entire output string is fed into sh.cat. What we would really like is each newline-delimited chunk to flow to sh.cat incrementally.

But for this example to flow data asynchronously from echo to cat, the echo command would need to not block. But how can the inner command know the context of its execution, to know to block sometimes but not other times? It can’t know that without something explicit.

This is why the _piped special kwarg was introduced. By default, commands executed block until they are finished, so in order for an inner command to not block, _piped=True signals to the inner command that it should not block. This way, the inner command starts running, then very shortly after, the outer command starts running, and both are running simultaneously. Data can then flow from the inner command to the outer command asynchronously:

import sh

sh.cat(sh.echo("test\n1\n2\n3\n", _piped=True))

Again, this example is contrived – a better example would be a long-running command that produces a lot of output that you wish to pipe through another program incrementally.

There are two ways to connect a command to your script’s own STDIN, STDOUT, and STDERR.

See also

_fg

You can use sys.stdin, sys.stdout, and sys.stderr as arguments to _in, _out, _err, respectively, and it should mostly work as expected:

import sh
import sys
sh.your_command(_in=sys.stdin, _out=sys.stdout)

There are a few reasons why this probably won’t work. The first reason is that sys.stdin is probably a controlling TTY (attached to the shell that launched the python process), and probably not set in raw mode (see termios(3)), which means that, among other things, input is buffered by newlines.

The real solution is to use _fg=True:

import sh
sh.top(_fg=True)

The requirement that each argument be a separate string confuses many new sh users. They want to do something like this and expect it to just work:

from sh import tar
tar("cvf /tmp/test.tar /my/home/directory")

But instead they’ll get a confusing error message:

RAN: '/bin/tar cvf /tmp/test.tar /my/home/directory'

STDOUT:

STDERR:
/bin/tar: Old option 'f' requires an argument.
Try '/bin/tar --help' or '/bin/tar --usage' for more information.

The reason they expect it to work is that shells, like Bash, automatically parse your commandline and break up arguments for you, before sending them to the binary. They have a complex set of rules (some of which are represented by shlex) to take a single string of a command and arguments and separate them.

Even if we wanted to implement this in sh (which we don’t), it would hurt the ability for users to parameterize parts of their arguments. They would have to use string interpolation, which would be ugly and error prone:

from sh import tar
tar("cvf %s %s" % ("/tmp/tar1.tar", "/home/oh no a space"))

In the above example, "/home/oh", "no", "a", and "space" would all be separate arguments to tar, causing the program to behave unexpectedly. Basically every command with parameterized arguments would need to expect characters that could break the parser.
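
The splitting problem can be reproduced with shlex from the standard library, which follows shell-like rules. This is only an illustration of the pitfall; sh itself does not split argument strings.

```python
import shlex

template = "cvf %s %s"

# Shell-style splitting breaks the interpolated path at every space.
naive = shlex.split(template % ("/tmp/tar1.tar", "/home/oh no a space"))
print(naive)

# Quoting each value before interpolation keeps the path intact,
# but every caller now has to remember to do it.
quoted = shlex.split(
    template % (shlex.quote("/tmp/tar1.tar"), shlex.quote("/home/oh no a space"))
)
print(quoted)
```

Passing each argument as its own string, as sh requires, sidesteps the quoting step entirely.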

Typically this question gets asked when a user is trying to execute something like the following commandline:

my-command --arg1=val1 arg2 --arg3=val3

This is usually the first attempt that they make:

sh.my_command(arg1="val1", "arg2", arg3="val3")

This doesn’t work because, in Python, positional arguments, like arg2, cannot come after keyword arguments.

Furthermore, older Python versions did not guarantee the order of a function’s **kwargs (the mapping has preserved call order only since Python 3.6), so it was possible for --arg3=val3 to come before --arg1=val1.

So the solution here is to forgo the keyword argument convenience, and just use raw ordered arguments:

sh.my_command("--arg1=val1", "arg2", "--arg3=val3")

Pylint complains with E1101 no-member about almost all sh.command invocations, because it doesn’t know that these members are generated dynamically. Starting with Pylint 1.6, these messages can be suppressed using the generated-members option.

Just add the following lines to your pylintrc:

[TYPECHECK]
generated-members=sh

sh can be patched in your tests the typical way, with unittest.mock.patch():

from unittest.mock import patch
import sh

def get_something():
    return sh.pwd()

@patch("sh.pwd", create=True)
def test_something(pwd):
    pwd.return_value = "/"
    assert get_something() == "/"

The important thing to note here is that create=True is set. This is required because sh is a bit magical and patch will fail to find the pwd command as an attribute on the sh module.

You may also patch the Command class:

from unittest.mock import patch
import sh

def get_something():
    pwd = sh.Command("pwd")
    return pwd()

@patch("sh.Command")
def test_something(Command):
    Command().return_value = "/"
    assert get_something() == "/"

Notice here we do not need create=True, because Command is not an automatically generated object on the sh module (it actually exists).

When sh was first written, the design decision was made to make it a single-file module. This has pros and cons:

Cons:

Auditing the code is more challenging
Without file-enforced structure, adding more features and abstractions makes the code harder to follow
Cognitively, it feels cluttered

Pros:

Can be used easily on systems without Python package managers
Can be embedded/bundled together with other software more easily
Cognitively, it feels more self-contained

In my mind, because the primary target audience of sh users is generally more scrappy devops, systems people, or people just trying to stitch together some clunky system programs, the listed pros weigh a little more heavily than the cons. Sacrificing some development advantages to give those users a more flexible tool is a win to me.

Down the road, the development disadvantages of a single file can be solved with additional development tools, for example, with a tool that compiles multiple modules into the single sh.py file. Realistically, though, sh is pretty mature, so I don’t see it growing much more in complexity or code size.

To see the commands that sh is running, use logging:

import logging
import sh

logging.basicConfig(level=logging.INFO)
sh.ls()

INFO:sh.command:<Command '/bin/ls'>: starting process
INFO:sh.command:<Command '/bin/ls', pid 32394>: process started
INFO:sh.command:<Command '/bin/ls', pid 32394>: process completed
...

Contrib Commands

Contrib is an sh sub-module that provides friendly wrappers to useful commands. Typically, the commands being wrapped are unintuitive, and the contrib version makes them intuitive.

Note

Contrib commands should be considered generally unstable. They will grow and change as the community figures out the best interface for them.

Commands

The sudo contrib command allows you to enter your password from the terminal at runtime, or as a string in your script.

sudo(password=None, *args, **kwargs): Call sudo with password, if specified, else ask the executing user for a password at runtime via getpass.getpass().

See also

sh.contrib.sudo

Many git commands use a pager for output, which can cause unexpected behavior when run through sh. To account for this, the contrib version sets _tty_out=False for all git commands.

git(*args, **kwargs): Call git with STDOUT connected to a pipe, instead of a TTY.

from sh.contrib import git
repo_log = git.log()

Added in version 1.13.0.

SSH password-based logins can be a pain. This contrib command performs all of the ugly setup and provides a clean interface to using SSH.

ssh(interact=None, password=None, prompt_match=None, login_success=None, *args, **kwargs)

Parameters:

interact – A callback to handle SSH session interaction after login is successful. Required.
password – A password string or a function that returns a password string. Optional. If not provided, getpass.getpass() is used.
prompt_match – The string to match in order to determine when to provide SSH with the password. Or a function that matches on the output. Optional.
login_success – A function to determine if SSH login is successful. Optional.

The interact parameter takes a callback with a signature that is slightly different to the function callbacks for redirection:

fn(content, stdin_queue)

Parameters:

content – An instance of an ephemeral SessionContent class whose job is to hold the characters that the SSH session has written to STDOUT.
stdin_queue – A queue.Queue object to communicate with STDIN programmatically.

password can be simply a string that will be used to type the password. If it’s not provided, it will be read from STDIN at runtime via getpass.getpass(). It can also be a callable that returns the password string.

prompt_match is a string to match before the contrib command will provide the SSH process with the password. It is optional, and if left unspecified, will default to "password: ". It can also be a callable that is called on a SessionContent instance and returns True or False for a match.

login_success is a function that takes a SessionContent object and returns a boolean for whether or not a successful login occurred. It is optional, and if unspecified, simply evaluates to True, meaning any password submission results in a successful login (obviously not always correct). It is recommended that you specify this.

class SessionContent: This class contains a record lines and characters written to the SSH processes’s STDOUT. It should be all you need from the callbacks to determine how to interact with the SSH process.

SessionContent.chars

Type: collections.deque

The previous 50,000 characters.

SessionContent.lines

Type: collections.deque

The previous 5,000 lines.

SessionContent.line_chars

Type: list

The characters in the line currently being aggregated.

SessionContent.cur_line

Type: str

A string of the line currently being aggregated.

SessionContent.last_line

Type: str

The previous line.

SessionContent.cur_char

Type: str

The currently written character.
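
As a rough sketch of what prompt_match and login_success callables might look like, the functions below read only the cur_line and lines attributes listed above. The matching rules are assumptions chosen for illustration: the "Permission denied" text is what OpenSSH typically prints on a failed password, and when or how often the contrib command calls login_success is not specified here. The SimpleNamespace objects are stand-ins for SessionContent, used only to try the functions out.

```python
from types import SimpleNamespace


def prompt_match(content):
    """Return True once the line being aggregated looks like a password prompt."""
    return content.cur_line.lower().endswith("password: ")


def login_success(content):
    """Treat the login as failed if a recent line reports a denial."""
    recent = list(content.lines)[-5:] + [content.cur_line]
    return not any("Permission denied" in line for line in recent)


# Stand-in objects with the same attribute names as SessionContent.
at_prompt = SimpleNamespace(cur_line="amoffat@10.10.10.100's password: ", lines=[])
denied = SimpleNamespace(cur_line="", lines=["Permission denied, please try again."])

print(prompt_match(at_prompt), login_success(at_prompt), login_success(denied))
```

Functions like these could then be supplied through the prompt_match and login_success parameters described above.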

Bash

Often users may find themselves having to run bash commands directly, whether due to commands having special characters (e.g. dash, or dot) or other reasons. This can lead to repeatedly having to bake the bash command to call it directly. To account for this, the contrib version provides a bash command baked in:

bash(*args, **kwargs): Call bash with the prefix of “bash -c […]”.

from sh.contrib import bash

# Calling commands directly
bash.ls() # equivalent to "bash -c ls"

# Or adding the full commands
bash("command-with-dashes args")

Extending

For developers.

To extend contrib, simply decorate a function in sh with the @contrib decorator, and pass in the name of the command you wish to shadow to the decorator. This method must return an instance of Command:

@contrib("ls")
def my_ls(original):
    ls = original.bake("-l")
    return ls

Now you can run your custom contrib command from your scripts, and you’ll be using the command returned from your decorated function:

from sh.contrib import ls

# executing: ls -l
print(ls("/"))

For even more flexibility, you can design your contrib command to rewrite its options based on executed arguments. For example, say you only wish to set a command’s argument if another argument is set. You can accomplish it like this:

@contrib("ls")
def my_ls(original):
    def process(args, kwargs):
        if "-a" in args:
            args.append("-L")
        return args, kwargs

    ls = original.bake("-l")
    return ls, process

Returning a process function along with the command will tell sh to use that function to preprocess the arguments at execution time using the _arg_preprocess special kwarg.
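
The preprocessing function is plain Python, so it can be tried without running any command. The variant below is a sketch rather than sh’s own code: whether sh hands the function a list or a tuple is not stated above, so it copies its inputs before modifying them, and it also avoids adding -L twice.

```python
def process(args, kwargs):
    """Append -L whenever -a is present, leaving the caller's objects untouched."""
    args = list(args)      # copy, so a tuple or a shared list is safe to extend
    kwargs = dict(kwargs)
    if "-a" in args and "-L" not in args:
        args.append("-L")
    return args, kwargs


with_a = process(("-a", "/tmp"), {})
without_a = process(["/tmp"], {"color": "never"})
print(with_a)
print(without_a)
```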

Using Sudo

There are 3 ways of using sudo to execute commands in your script. These are listed in order of usefulness and security. In most cases, you should just use a variation of sh.contrib.sudo.

Because sudo is so frequently used, we have added a contrib version of the command to make sudo usage more intuitive. This contrib version is simply a wrapper around the sh.sudo raw command, but we bake in some special keyword arguments to make it well-behaved. In particular, the contrib version allows you to specify your password at execution time via terminal input, or as a string in your script.

Via a with context:

import sh

with sh.contrib.sudo:
    print(sh.ls("/root"))

Or alternatively via subcommands:

import sh
print(sh.contrib.sudo.ls("/root"))

Output:

[sudo] password for youruser: *************
your_root_files.txt

In the above example, sh.contrib.sudo automatically asks you for a password using getpass.getpass() under the hood.

This method is the most secure, because it lowers the chances of doing something insecure, like including your password in your python script, or by saying that a particular user can execute anything inside of a particular script (the NOPASSWD method).

Note

sh.contrib.sudo does not do password caching like the sudo binary does. This means that each time a sudo command is run in your script, you will be asked to type in a password.

You may also specify your password to sh.contrib.sudo as a string:

import sh

password = get_your_password()

with sh.contrib.sudo(password=password, _with=True):
    print(sh.ls("/root"))

Warning

This method is less secure because it becomes tempting to hard-code your password into the python script, and that’s a bad idea. However, it is more flexible, because it allows you to obtain your password from another source, so long as the end result is a string.

With the NOPASSWD method, you can use the raw sh.sudo command directly, because you are guaranteed that the system will not ask you for a password. It first requires that you set up your user to have root execution privileges.

Edit your sudoers file:

$> sudo visudo

Add or edit the line describing your user’s permissions:

yourusername ALL = (root) NOPASSWD: /path/to/your/program

This says that yourusername, on ALL hosts, may run /path/to/your/program as root, and only as root (the (root) part), and that no password will be asked for (the NOPASSWD part).

Warning

This method can be insecure if an unprivileged user can edit your script, because the entire script will be executed as a privileged user. A malicious user could put something bad in this script.

Using the raw command sh.sudo (which resolves directly to the system’s sudo binary) without NOPASSWD is possible, provided you wire up the special keyword arguments on your own to make it behave correctly. This method is discussed generally for educational purposes; if you take the time to wire up sh.sudo on your own, then you have in essence just recreated sh.contrib.sudo.

import sh

# password must end in a newline
my_password = "password\n"

# -S says "get the password from stdin"
my_sudo = sh.sudo.bake("-S", _in=my_password)

print(my_sudo.ls("/root"))

Another less-obvious way of using sudo is by executing the raw sh.sudo command but also putting it in the foreground. This way, sudo will work correctly automatically, by hooking up stdin/out/err automatically, and by asking you for a password if it requires one. The downside of using _fg=True, however, is that you cannot capture its output – everything is just printed to your terminal as if you ran it from a shell.

import sh
sh.sudo.ls("/root", _fg=True)

Migrating from 1.x to 2.x

This document provides an upgrade path from 1.* to 2.*.

`sh.cd` builtin removed

There is no sh.cd command anymore. It was always a command implemented in sh, as some systems provide it as a shell builtin, while others have an actual binary. But neither of them persisted the directory change between other sh calls, which is why it was implemented in sh.

If you were using sh.cd(dir), use the context manager with sh.pushd(dir) instead. All of the commands in the managed context will have the correct directory.
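
To illustrate what a pushd-style context manager does, here is a standard-library stand-in. It is not sh’s implementation, and the name pushd_like is made up; it only shows the pattern of entering a directory and restoring the previous one on exit.

```python
import contextlib
import os
import tempfile


@contextlib.contextmanager
def pushd_like(path):
    """Stand-in for a pushd-style context manager (not sh's implementation)."""
    previous = os.getcwd()
    os.chdir(path)
    try:
        yield
    finally:
        os.chdir(previous)  # always restore, even if the body raises


start = os.getcwd()
with tempfile.TemporaryDirectory() as tmp:
    with pushd_like(tmp):
        # Anything run here sees the temporary directory as its working directory.
        entered = os.path.samefile(os.getcwd(), tmp)
after = os.getcwd()
```

With sh, the equivalent usage is the with sh.pushd(dir) form mentioned above, with the commands placed inside the block.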

Removed execution contexts / default arguments

In 1.* you could spawn a new module from the sh module, one which had customized defaults for the special keyword arguments. This module could then be accessed just like sh, and you could even import commands from it.

Unfortunately the magic required to make that work was brittle. Also it was not aligned syntactically with the similar baking concept. We have therefore changed the syntax to align with baking, and also removed the ability to import directly from this new baked execution context.

sh2 = sh(_tty_out=False)
sh2.ls()

Becomes:

sh2 = sh.bake(_tty_out=False)
sh2.ls()

And:

sh2 = sh.bake(_tty_out=False)
from sh2 import ls
ls()

Becomes:

sh2 = sh.bake(_tty_out=False)
ls = sh2.ls
ls()

Return value now a true string

In 2.*, the return value of an executed sh command has changed (in most cases) from a RunningCommand object to a unicode string. This makes using the output of a command more natural.

To continue returning a RunningCommand object, use the _return_cmd=True special keyword argument. You can achieve this on each file with the following code at the top of files that use sh:

import sh

sh = sh.bake(_return_cmd=True)

Piping to STDIN

Previously, if the first argument of a sh command was an instance of RunningCommand, it was automatically fed into the process’s STDIN. This is no longer the case and you must explicitly use _in=.

from sh import wc, ls

print(wc(ls("/home/<user>", "-l"), "-l"))

Becomes:

from sh import wc, ls

print(wc("-l", _in=ls("/home/<user>", "-l")))

Or:

from sh import wc, ls

print(wc("-l", _in=ls("/home/<user>", "-l", _return_cmd=True)))

None.

New processes don’t launch in new session

In 1.*, _new_session defaulted to True. It now defaults to False. The reason for this is that it makes more sense for launched processes to default to being in the process group of the Python script, so that they receive SIGINTs correctly.

To preserve the old behavior:

import sh

sh = sh.bake(_new_session=True)

Passing Arguments

When passing multiple arguments to a command, each argument must be a separate string:

from sh import tar
tar("cvf", "/tmp/test.tar", "/my/home/directory/")

This will not work:

from sh import tar
tar("cvf /tmp/test.tar /my/home/directory")

sh supports short-form -a and long-form --arg arguments as keyword arguments:

# resolves to "curl http://duckduckgo.com/ -o page.html --silent"
curl("http://duckduckgo.com/", o="page.html", silent=True)

# or if you prefer not to use keyword arguments, this does the same thing:
curl("http://duckduckgo.com/", "-o", "page.html", "--silent")

# resolves to "adduser amoffat --system --shell=/bin/bash --no-create-home"
adduser("amoffat", system=True, shell="/bin/bash", no_create_home=True)

# or
adduser("amoffat", "--system", "--shell", "/bin/bash", "--no-create-home")
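
As a rough mental model of the keyword-to-option mapping shown above, the following sketch reproduces the two documented examples. The helper name kwargs_to_argv is hypothetical and this is not sh's actual implementation; dropping false values is an assumption made for the sketch.

```python
def kwargs_to_argv(**kwargs):
    """Hypothetical helper: turn keyword arguments into command-line options."""
    argv = []
    for key, value in kwargs.items():
        if value is False or value is None:
            continue  # assumption: a false flag is simply left out
        if len(key) == 1:
            # short form: o="page.html" -> "-o", "page.html"
            argv.append(f"-{key}")
            if value is not True:
                argv.append(str(value))
        else:
            # long form: no_create_home=True -> "--no-create-home"
            name = "--" + key.replace("_", "-")
            argv.append(name if value is True else f"{name}={value}")
    return argv


print(kwargs_to_argv(o="page.html", silent=True))
print(kwargs_to_argv(system=True, shell="/bin/bash", no_create_home=True))
```

Keyword arguments keep their call order in Python, which is why the options come out in the order they were written.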

See also

How do I order keyword arguments?

Exit Codes & Exceptions

Normal processes exit with exit code 0. This can be seen from RunningCommand.exit_code:

output = ls("/", _return_cmd=True)
print(output.exit_code) # should be 0

If a process terminates, and the exit code is not 0, an exception is generated dynamically. This lets you catch a specific return code, or catch all error return codes through the base class ErrorReturnCode:

try:
    print(ls("/some/non-existent/folder"))
except ErrorReturnCode_2:
    print("folder doesn't exist!")
    create_the_folder()
except ErrorReturnCode:
    print("unknown error")

You can also customize which exit codes indicate an error with _ok_code. For example:

for i in range(10):
    sh.grep("string to check", f"file_{i}.txt", _ok_code=(0, 1))

where the _ok_code makes a failure to find a match a no-op.

An exception is raised whenever your process is terminated by a signal. The exception raised in this situation is SignalException, which subclasses ErrorReturnCode.

try:
    p = sh.sleep(3, _bg=True)
    p.kill()
    p.wait()  # wait() raises the exception in the calling thread
except sh.SignalException_SIGKILL:
    print("killed")

This behavior can be suppressed by adding the negative value of the signal to _ok_code. The signals that raise SignalException are [SIGABRT, SIGBUS, SIGFPE, SIGILL, SIGINT, SIGKILL, SIGPIPE, SIGQUIT, SIGSEGV, SIGTERM].

Note

You can catch SignalException by using either a number or a signal name. For example, the following two exception classes are equivalent:

assert sh.SignalException_SIGKILL == sh.SignalException_9
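
The relationship between signal names, signal numbers, and the negative values used in _ok_code can be checked with the standard library alone. The two helper functions below are hypothetical and only build names and numbers; they do not touch sh.

```python
import signal


def signal_exception_names(sig):
    """Return the two equivalent exception class names for a signal."""
    sig = signal.Signals(sig)
    return f"SignalException_{sig.name}", f"SignalException_{int(sig)}"


def ok_code_for(sig):
    """Value to add to _ok_code so that death by `sig` is not treated as an error."""
    return -int(signal.Signals(sig))


print(signal_exception_names(signal.SIGKILL))
print(ok_code_for(signal.SIGTERM))
```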

Redirection

sh can redirect the STDOUT and STDERR of a process to many different types of targets, using the _out and _err special kwargs.

If a string is used, it is assumed to be a filename. The filename is opened as “wb”, meaning truncate-write and binary mode.

import sh
sh.ifconfig(_out="/tmp/interfaces")

See also

How do I append output to a file?

You may also use any object that supports .write(data), like io.StringIO:

import sh
from io import StringIO

buf = StringIO()
sh.ifconfig(_out=buf)
print(buf.getvalue())

A callback function may also be used as a target. The function must conform to one of three signatures:

fn(data): The function takes just the chunk of data from the process.

fn(data, stdin_queue): In addition to the previous signature, the function also takes a queue.Queue, which may be used to communicate programmatically with the process.

fn(data, stdin_queue, process): In addition to the previous signature, the function takes a weakref.ref to the OProc object.
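
One way to picture how a single callback slot can accept all three signatures is to dispatch on the number of parameters the function declares. This is a sketch under that assumption, not sh's actual code; call_output_handler and fake_process are hypothetical names.

```python
import inspect
import queue


def call_output_handler(fn, data, stdin_queue, process):
    """Hypothetical dispatcher: pass only as many arguments as fn accepts."""
    n_params = len(inspect.signature(fn).parameters)
    return fn(*(data, stdin_queue, process)[:n_params])


seen = []


def just_data(data):
    seen.append("just_data")


def with_stdin(data, stdin_queue):
    stdin_queue.put("reply\n")  # send input back to the process
    seen.append("with_stdin")


def with_process(data, stdin_queue, process):
    seen.append("with_process")


stdin_q = queue.Queue()
fake_process = object()  # stand-in for the weak reference to the process
for handler in (just_data, with_stdin, with_process):
    call_output_handler(handler, "line 1\n", stdin_q, fake_process)
print(seen)
```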

See also

Output Callbacks

See also

Entering an SSH password

Asynchronous Execution

sh provides a few methods for running commands and obtaining output in a non-blocking fashion.

AsyncIO

Added in version 2.0.0.

sh supports asyncio on commands with the _async=True special kwarg. This lets you incrementally await output produced by your command.

import asyncio
import sh

async def main():
    await sh.sleep(3, _async=True)

asyncio.run(main())
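
To see what "incrementally awaiting output" means without involving sh, here is a standard-library analogue built on asyncio's own subprocess support. It is an illustration of the idea only, not how sh implements _async; read_lines is a hypothetical helper.

```python
import asyncio
import sys


async def read_lines(*argv):
    """Collect a child's STDOUT line by line without blocking the event loop."""
    proc = await asyncio.create_subprocess_exec(
        *argv, stdout=asyncio.subprocess.PIPE
    )
    lines = []
    async for raw in proc.stdout:  # yields each line as it arrives
        lines.append(raw.decode().rstrip("\n"))
    await proc.wait()
    return lines


child = [sys.executable, "-c", "print('one'); print('two')"]
lines = asyncio.run(read_lines(*child))
print(lines)
```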

Incremental Iteration

You may also create asynchronous commands by iterating over them with the _iter special kwarg. This creates an iterable (specifically, a generator) that you can loop over:

from sh import tail

# runs forever
for line in tail("-f", "/var/log/some_log_file.log", _iter=True):
    print(line)

By default, _iter iterates over STDOUT, but you can set this explicitly by passing either "err" or "out" to _iter (instead of True). Also by default, output is line-buffered, so the body of the loop will only run when your process produces a newline. You can change this by changing the buffer size of the command’s output with _out_bufsize.

Note

If you need a fully non-blocking iterator, use _iter_noblock. If the current iteration would block, errno.EWOULDBLOCK will be returned, otherwise you’ll receive a chunk of output, as normal.
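
The contract of a non-blocking iterator (yield a sentinel instead of waiting) can be sketched with a plain pipe and select. This is a standard-library illustration of the idea; the generator iter_noblock is hypothetical and is not sh's implementation.

```python
import errno
import os
import select


def iter_noblock(fd, chunk_size=1024):
    """Yield chunks from fd, or errno.EWOULDBLOCK when nothing is ready yet."""
    while True:
        ready, _, _ = select.select([fd], [], [], 0)  # timeout 0: just poll
        if not ready:
            yield errno.EWOULDBLOCK
            continue
        chunk = os.read(fd, chunk_size)
        if not chunk:  # EOF: the write end was closed
            return
        yield chunk


read_fd, write_fd = os.pipe()
it = iter_noblock(read_fd)
first = next(it)  # nothing has been written yet
os.write(write_fd, b"hello\n")
second = next(it)
os.close(write_fd)
rest = list(it)  # EOF ends the iteration
os.close(read_fd)
print(first == errno.EWOULDBLOCK, second, rest)
```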

Background Processes

By default, each running command blocks until completion. If you have a long-running command, you can put it in the background with the _bg=True special kwarg:

# blocks
sleep(3)
print("...3 seconds later")

# doesn't block
p = sleep(3, _bg=True)
print("prints immediately!")
p.wait()
print("...and 3 seconds later")

You’ll notice that you need to call RunningCommand.wait() in order to exit after your command exits.

Commands launched in the background ignore SIGHUP, meaning that when their controlling process (the session leader, if there is a controlling terminal) exits, they will not be signalled by the kernel. If a command is launched in its own session with _new_session=True, it is its own session leader, so ignoring SIGHUP will normally have no impact. Ignoring SIGHUP only matters with _new_session=False (the default since 2.0.0), in which case the controlling process will probably be the shell from which you launched python, and exiting that shell would normally send a SIGHUP to all child processes.

See also

For more information on the exact launch process, see Architecture Overview.

In combination with _bg=True, sh can use callbacks to process output incrementally by passing a callable function to _out and/or _err. This callable will be called for each line (or chunk) of data that your command outputs:

from sh import tail

def process_output(line):
    print(line)

p = tail("-f", "/var/log/some_log_file.log", _out=process_output, _bg=True)
p.wait()

To control whether the callback receives a line or a chunk, use _out_bufsize. To “quit” your callback, simply return True. This tells the command not to call your callback anymore.

The line or chunk received by the callback can either be of type str or bytes. If the output could be decoded using the provided encoding, a str will be passed to the callback, otherwise it would be raw bytes.

Note

Returning True does not kill the process, it only keeps the callback from being called again. See Interactive callbacks for how to kill a process from a callback.

See also

Function Callback

Commands may communicate with the underlying process interactively through a specific callback signature. Each command launched through sh has an internal STDIN queue.Queue that can be used from callbacks:

def interact(line, stdin):
    if line == "What... is the air-speed velocity of an unladen swallow?":
        stdin.put("What do you mean? An African or European swallow?")

    elif line == "Huh? I... I don't know that....AAAAGHHHHHH":
        cross_bridge()
        return True

    else:
        stdin.put("I don't know....AAGGHHHHH")
        return True

p = sh.bridgekeeper(_out=interact, _bg=True)
p.wait()

Note

If you use a queue, you can signal the end of the input (EOF) with None.

You can also kill or terminate your process (or send any signal, really) from your callback by adding a third argument to receive the process object:

def process_output(line, stdin, process):
    print(line)
    if "ERROR" in line:
        process.kill()
        return True

p = tail("-f", "/var/log/some_log_file.log", _out=process_output, _bg=True)

The above code will run, printing lines from some_log_file.log until the word "ERROR" appears in a line, at which point the tail process will be killed and the script will end.

Note

You may also use RunningCommand.terminate() to send a SIGTERM, or RunningCommand.signal() to send a general signal.

A done callback is called when the process exits, either normally (through a success or error exit code) or through a signal. It is always called.

Here’s an example of using _done to create a multiprocess pool, where sh.your_parallel_command is executed concurrently at no more than 10 at a time:

import sh
from threading import Semaphore

pool = Semaphore(10)

def done(cmd, success, exit_code):
    pool.release()

def do_thing(arg):
    pool.acquire()
    return sh.your_parallel_command(arg, _bg=True, _done=done)

procs = []
for arg in range(100):
    procs.append(do_thing(arg))

# essentially a join
[p.wait() for p in procs]

Baking

sh is capable of “baking” arguments into commands. This is essentially partial application, like you might do with functools.partial().

from sh import ls

ls = ls.bake("-la")
print(ls) # "/usr/bin/ls -la"

# resolves to "ls -la /"
print(ls("/"))

The idea here is that now every call to ls will have the “-la” arguments already specified. Baking can become very useful when you combine it with Sub-commands:

from sh import ssh

# calling whoami on a server.  this is a lot to type out, especially if
# you wanted to call many commands (not just whoami) back to back on
# the same server
iam1 = ssh("myserver.com", "-p", "1393", "whoami")

# wouldn't it be nice to bake the common parameters into the ssh command?
myserver = ssh.bake("myserver.com", p=1393)

print(myserver) # "/usr/bin/ssh myserver.com -p 1393"

# resolves to "/usr/bin/ssh myserver.com -p 1393 whoami"
iam2 = myserver.whoami()

assert(iam1 == iam2) # True!

Now that the “myserver” callable represents a baked ssh command, you can call anything on the server easily:

# executes "/usr/bin/ssh myserver.com -p 1393 tail /var/log/dumb_daemon.log -n 100"
print(myserver.tail("/var/log/dumb_daemon.log", n=100))
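
Since baking is described as partial application, the same shape can be reproduced with functools.partial over a function that only builds an argument list. The function build_argv is a hypothetical stand-in for a command and nothing is executed.

```python
from functools import partial


def build_argv(program, *args):
    """Hypothetical stand-in for a command: returns the argv it would run."""
    return [program, *map(str, args)]


ssh = partial(build_argv, "/usr/bin/ssh")
myserver = partial(ssh, "myserver.com", "-p", 1393)  # the "baked" arguments

print(myserver("whoami"))
print(myserver("tail", "/var/log/dumb_daemon.log", "-n", 100))
```

Each layer of partial adds arguments in front of whatever the final call supplies, which mirrors how baked arguments precede call-time arguments.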

Piping

Bash style piping is performed using function composition. Just pass one command as the input to another’s _in argument, and sh will send the output of the inner command to the input of the outer command:

# sort this directory by biggest file
print(sort("-rn", _in=du(glob("*"), "-sb")))

# print the number of folders and files in /etc
print(wc("-l", _in=ls("/etc", "-1")))

Note

This basic piping does not flow data through asynchronously; the inner command blocks until it finishes, before sending its data to the outer command.

By default, any command that is piping another command in waits for it to complete. This behavior can be changed with the _piped special kwarg on the command being piped, which tells it not to complete before sending its data, but to send its data incrementally. Read ahead for examples of this.

By default, all piped commands execute sequentially. What this means is that the inner command executes first, then sends its data to the outer command:

print(wc("-l", _in=ls("/etc", "-1")))

In the above example, ls executes, gathers its output, then sends that output to wc. This is fine for simple commands, but for commands where you need parallelism, this isn’t good enough. Take the following example:

for line in tr("[:upper:]", "[:lower:]", _in=tail("-f", "test.log"), _iter=True):
    print(line)

This won’t work because the tail -f command never finishes. What you need is for tail to send its output to tr as it receives it. This is where the _piped special kwarg comes in handy:

for line in tr("[:upper:]", "[:lower:]", _in=tail("-f", "test.log", _piped=True), _iter=True):
    print(line)

This works by telling tail -f that it is being used in a pipeline, and that it should send its output line-by-line to tr. By default, _piped sends STDOUT, but you can easily make it send STDERR instead by using _piped="err".
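
The difference between sequential and incremental piping can be seen with the standard library's subprocess module. This is an analogue for illustration, not sh's implementation; PRODUCER and COUNTER are hypothetical child programs written as small Python one-liners.

```python
import subprocess
import sys

PRODUCER = [sys.executable, "-c", "print('a'); print('b'); print('c')"]
COUNTER = [sys.executable, "-c", "import sys; print(len(sys.stdin.readlines()))"]


def pipe_sequential():
    """Run the producer to completion, then hand its whole output to the consumer."""
    produced = subprocess.run(PRODUCER, capture_output=True, check=True).stdout
    result = subprocess.run(COUNTER, input=produced, capture_output=True, check=True)
    return result.stdout


def pipe_incremental():
    """Connect the processes directly so data flows while the producer runs."""
    producer = subprocess.Popen(PRODUCER, stdout=subprocess.PIPE)
    consumer = subprocess.Popen(
        COUNTER, stdin=producer.stdout, stdout=subprocess.PIPE
    )
    # Drop our copy of the read end so the producer gets SIGPIPE
    # if the consumer exits early.
    producer.stdout.close()
    out, _ = consumer.communicate()
    producer.wait()
    return out


print(pipe_sequential(), pipe_incremental())
```

Both functions give the same answer for a producer that finishes; only the incremental form can work with a producer that never exits.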

Sub-commands

Many programs have their own command subsets, like git (branch, checkout), svn (update, status), and sudo (where any command following sudo is considered a sub-command). sh handles subcommands through attribute access:

from sh import git, sudo

# resolves to "git branch -v"
print(git.branch("-v"))
print(git("branch", "-v")) # the same command

# resolves to "sudo /bin/ls /root"
print(sudo.ls("/root"))
print(sudo("/bin/ls", "/root")) # the same command

Sub-commands are mainly syntax sugar that makes calling some programs look conceptually nicer.
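
Attribute access can be turned into a leading argument with Python's __getattr__ hook. The class below is a hypothetical miniature that only builds argument lists; it is a sketch of the idea, not sh's Command class.

```python
class Cmd:
    """Hypothetical miniature of a command object; it only builds an argv list."""

    def __init__(self, *argv):
        self.argv = list(argv)

    def bake(self, *args):
        return Cmd(*self.argv, *args)

    def __getattr__(self, name):
        # Only called for attributes that do not exist, e.g. git.branch
        if name.startswith("_"):
            raise AttributeError(name)
        return self.bake(name)

    def __call__(self, *args):
        return self.argv + list(args)


git = Cmd("git")
print(git.branch("-v"))
print(git("branch", "-v"))
```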

See also

If you’re using sudo as a subcommand, please be sure to see Using Sudo.

Default Arguments

Many times, you want to override the default arguments of all commands launched through sh. For example, suppose you want the output of all commands to be aggregated into an io.StringIO buffer. The naive way would be this:

import sh
from io import StringIO

buf = StringIO()

sh.ls("/", _out=buf)
sh.whoami(_out=buf)
sh.ps("auxwf", _out=buf)

Clearly, this gets tedious quickly. Fortunately, we can create execution contexts that allow us to set default arguments on all commands spawned from that context:

import sh
from io import StringIO

buf = StringIO()
sh2 = sh.bake(_out=buf)

sh2.ls("/")
sh2.whoami()
sh2.ps("auxwf")

Now, anything launched from sh2 will send its output to the StringIO instance buf.

Environments

The _env special kwarg allows you to pass a dictionary of environment variables and their corresponding values:

import sh
sh.google_chrome(_env={"SOCKS_SERVER": "localhost:1234"})

_env replaces your process’s environment completely. Only the key-value pairs in _env will be used for its environment. If you want to add new environment variables for a process in addition to your existing environment, try something like this:

import os
import sh

new_env = os.environ.copy()
new_env["SOCKS_SERVER"] = "localhost:1234"

sh.google_chrome(_env=new_env)

See also

To make an environment apply to all sh commands look into Default Arguments.

Input via STDIN

STDIN is sent to a process directly by using a command’s _in special kwarg:

print(cat(_in="test"))

Any command that takes input from STDIN can be used this way:

print(tr("[:lower:]", "[:upper:]", _in="sh is awesome"))

You’re also not limited to using just strings. You may use a file object, a queue.Queue, or any iterable (list, set, dictionary, etc):

stdin = ["sh", "is", "awesome"]
out = tr("[:lower:]", "[:upper:]", _in=stdin)

Note

If you use a queue, you can signal the end of the queue (EOF) with None.
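
The "None means EOF" convention can be pictured as a writer thread that drains a queue into a pipe and closes the pipe when it sees None. This is a standard-library sketch of the idea; feed_stdin is a hypothetical function, not part of sh.

```python
import os
import queue
import threading


def feed_stdin(stdin_queue, write_fd):
    """Drain the queue into the pipe; None closes the pipe, signalling EOF."""
    while True:
        item = stdin_queue.get()
        if item is None:
            break
        os.write(write_fd, item.encode())
    os.close(write_fd)


read_fd, write_fd = os.pipe()
q = queue.Queue()
writer = threading.Thread(target=feed_stdin, args=(q, write_fd))
writer.start()

for item in ("sh ", "is ", "awesome", None):
    q.put(item)
writer.join()

with os.fdopen(read_fd, "rb") as reader:
    received = reader.read()  # returns once EOF is reached
print(received)
```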

‘With’ Contexts

Commands can be run within a Python with context. Popular commands using this might be sudo or fakeroot:

with sh.contrib.sudo:
    print(ls("/root"))

See also

sh.contrib.sudo

If you need to run a command in a with context and pass in arguments, for example, specifying a -p prompt with sudo, you need to use _with=True. This lets the command know that it’s being run from a with context so it can behave correctly:

with sh.contrib.sudo(k=True, _with=True):
    print(ls("/root"))

Special Kwargs

These arguments alter a command’s behavior. They are not passed to the program. You can use them on any command that you run, but some may not be used together. sh will tell you if there are conflicts.

To set default special keyword arguments on every command run, you may use Default Arguments.

Controlling Output

_out

Default value: None

What to redirect STDOUT to. If this is a string, it will be treated as a file name. You may also pass a file object (or file-like object), an int (representing a file descriptor, like the result of os.pipe()), an io.StringIO object, or a callable.

import sh
sh.ls(_out="/tmp/output")

See also

Redirection

_err

Default value: None

What to redirect STDERR to. See _out.

_err_to_out

Default value: False

If True, duplicate the file descriptor bound to the process’s STDOUT also to STDERR, effectively causing STDERR and STDOUT to go to the same place.

_encoding

Default value: sh.DEFAULT_ENCODING

The character encoding of the process’s STDOUT. By default, this is the locale’s default encoding.

_decode_errors

Added in version 1.07.0.

Default value: "strict"

This is how Python should handle decoding errors of the process’s output. By default, this is "strict", but you can use any value that’s valid to bytes.decode(), such as "ignore".

_tee

Added in version 1.07.0.

Default value: None

As of 1.07.0, any time redirection is used, either for STDOUT or STDERR, the respective internal buffers are not filled. For example, if you’re downloading a file and using a callback on STDOUT, neither the internal STDOUT buffer nor the pipe buffer will be filled with data from STDOUT. This option forces one of stderr (_tee='err') or stdout (_tee='out' or _tee=True) to be filled anyway, in effect “tee-ing” the output into two places (the callback/redirect handler, and the internal buffers).

_truncate_exc

Added in version 1.12.0.

Default value: True

Whether or not exception output should be truncated.

Execution

_fg

Added in version 1.12.0.

Default value: False

Runs a command in the foreground, meaning it is spawned using os.spawnle(). The current process’s STDIN/OUT/ERR is os.dup2()’d to the new process and so the new process becomes the foreground of the shell executing the script. This is only really useful when you want to launch a lean, interactive process that sh is having trouble running, for example, ssh.

Warning

_fg=True side-steps a lot of sh’s functionality. You will not be returned a process object and most (likely all) other special kwargs will not work.

If you are looking for similar functionality, but still retaining sh’s features, use the following:

import sh
import sys
sh.your_command(_in=sys.stdin, _out=sys.stdout, _err=sys.stderr)

_bg

Default value: False

Runs a command in the background. The command will return immediately, and you will have to run RunningCommand.wait() on it to ensure it terminates.

See also

Background Processes.

_bg_exc

Added in version 1.12.9.

Default value: True

Automatically report exceptions for the background command. If you set this to False you should make sure to call RunningCommand.wait() or you may swallow exceptions that happen in the background command.

_async

Added in version 2.0.0.

Default value: False

Allows your command to become awaitable. Use in combination with _iter and async for to incrementally await output as it is produced.

_env

Default value: None

A dictionary defining the only environment variables that will be made accessible to the process. If not specified, the calling process’s environment variables are used.

Note

This dictionary is the authoritative environment for the process. If you wish to change a single variable in your current environment, you must pass a copy of your current environment with the overridden variable to sh.

See also

Environments

_timeout

Default value: None

How much time, in seconds, we should give the process to complete. If the process does not finish within the timeout, it will be sent the signal defined by _timeout_signal.

_timeout_signal

Default value: signal.SIGKILL

The signal to be sent to the process if _timeout is not None.
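
The interplay of _timeout and _timeout_signal follows a familiar pattern: wait up to a deadline, then send a signal. A standard-library analogue of that pattern is shown below; run_with_timeout is a hypothetical helper and does not reflect sh's internals.

```python
import signal
import subprocess
import sys


def run_with_timeout(argv, timeout, timeout_signal=signal.SIGKILL):
    """Return (timed_out, returncode), signalling the child after `timeout` seconds."""
    proc = subprocess.Popen(argv)
    try:
        return False, proc.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        proc.send_signal(timeout_signal)
        return True, proc.wait()


sleeper = [sys.executable, "-c", "import time; time.sleep(30)"]
timed_out, code = run_with_timeout(sleeper, timeout=0.5)
print(timed_out, code)
```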

_cwd

Default value: None

A string that sets the current working directory of the process.

_ok_code

Default value: 0

Either an integer, a list, or a tuple containing the exit code(s) that are considered “ok”, or in other words: do not raise an exception. Some misbehaved programs use exit codes other than 0 to indicate success.

import sh
sh.weird_program(_ok_code=[0,3,5])

If the process is killed by a signal, a SignalException is raised by default. This behavior could be blocked by appending a negative number to _ok_code that represents the signal.

import sh
# the process won't raise SignalException if SIGINT, SIGKILL, or SIGTERM
# are sent to kill the process
p = sh.sleep(3, _bg=True, _ok_code=[0, -2, -9, -15])

# No exception will be raised here
p.kill()

See also

Exit Codes & Exceptions

_new_session

Default value: False

Determines if our forked process will be executed in its own session via os.setsid().

Changed in version 2.0.0: The default value of _new_session was changed from True to False because it makes more sense for a launched process to default to being in the process group of the Python script, so that it receives SIGINTs correctly.

See also

Architecture Overview

_new_group

Default value: False

Determines if our forked process will be executed in its own group via os.setpgid().

_uid

Added in version 1.12.0.

Default value: None

The user id to assume before the child process calls os.execv().

_preexec_fn

Added in version 1.12.0.

Default value: None

A function to be run directly before the child process calls os.execv(). Typically not used by normal users.

_pass_fds

Added in version 1.13.0.

Default value: {} (empty set)

A whitelist iterable of integer file descriptors to be inherited by the child. Passing anything in this argument causes _close_fds to be True.

_close_fds

Added in version 1.13.0.

Default value: True

Causes all inherited file descriptors besides stdin, stdout, and stderr to be automatically closed. This option is automatically enabled when _pass_fds is given a value.

Communication

_in

Default value: None

Specifies an argument for the process to use as its standard input. This may be a string, a queue.Queue, a file-like object, or any iterable.

See also

Input via STDIN

_piped

Default value: None

May be True, "out", or "err". Signals a command that it is being used as the input to another command, so it should return its output incrementally as it receives it, instead of aggregating it all at once.

See also

Advanced Piping

_iter

Default value: None

May be True, "out", or "err". Puts a command in iterable mode. In this mode, you can use a for or while loop to iterate over a command’s output in real-time.

import sh
for line in sh.cat("/tmp/file", _iter=True):
    print(line)

See also

Incremental Iteration.

_iter_noblock

Default value: None

Same as _iter, except the loop will not block if there is no output to iterate over. Instead, the output from the command will be errno.EWOULDBLOCK.

import sh
import errno
import time

for line in sh.tail("-f", "stuff.log", _iter_noblock=True):
    if line == errno.EWOULDBLOCK:
        print("doing something else...")
        time.sleep(0.5)
    else:
        print("processing line!")

See also

Incremental Iteration.

_with

Default value: False

Explicitly tells us that we’re running a command in a with context. This is only necessary if you’re using a command in a with context and passing parameters to it.

import sh
with sh.contrib.sudo(password="abc123", _with=True):
    print(sh.ls("/root"))

See also

‘With’ Contexts

_done

Added in version 1.11.0.

Default value: None

A callback that is always called when the command completes, even if it completes with an exit code that would raise an exception. After the callback is run, any exception that would be raised is raised.

The callback is passed the RunningCommand instance, a boolean indicating success, and the exit code.

Here’s an example of using _done to create a multiprocess pool, where sh.your_parallel_command is executed concurrently at no more than 10 at a time:

import sh
from threading import Semaphore

pool = Semaphore(10)

def done(cmd, success, exit_code):
    pool.release()

def do_thing(arg):
    pool.acquire()
    return sh.your_parallel_command(arg, _bg=True, _done=done)

procs = []
for arg in range(100):
    procs.append(do_thing(arg))

# essentially a join
[p.wait() for p in procs]

TTYs

_tty_in

Default value: False, meaning an os.pipe() will be used.

If True, sh creates a TTY for STDIN, essentially emulating a terminal, as if your command was entered from the commandline. This is necessary for commands that require STDIN to be a TTY.

_tty_out

Default value: True

If True, sh creates a TTY for STDOUT, otherwise it uses an os.pipe(). This is necessary for commands that require STDOUT to be a TTY.

See also

Why is _tty_out=True the default?

_unify_ttys

Added in version 1.13.0.

Default value: False

If True, sh will combine the STDOUT and STDIN TTY into a single pseudo-terminal. This is sometimes required by picky programs which expect to be dealing with a single pseudo-terminal, like SSH.

See also

Entering an SSH password

_tty_size

Default value: (20, 80)

The (rows, columns) of stdout’s TTY. Changing this may affect how much your program prints per line, for example.

Performance & Optimization

_in_bufsize

Default value: 0

The STDIN buffer size. 0 for unbuffered, 1 for line buffered, anything else for a buffer of that amount.

_out_bufsize

Default value: 1

The STDOUT buffer size. 0 for unbuffered, 1 for line buffered, anything else for a buffer of that amount.

_err_bufsize

Default value: 1

Same as _out_bufsize, but with STDERR.

_internal_bufsize

Default value: 3 * 1024**2 chunks

How much of STDOUT/ERR your command will store internally. This value represents the number of bufsize chunks not the total number of bytes. For example, if this value is 100, and STDOUT is line buffered, you will be able to retrieve 100 lines from STDOUT. If STDOUT is unbuffered, you will be able to retrieve only 100 characters.

_no_out

Added in version 1.07.0.

Default value: False

Disables STDOUT being internally stored. This is useful for commands that produce huge amounts of output that you don’t need, that would otherwise be hogging memory if stored internally by sh.

_no_err

Added in version 1.07.0.

Default value: False

Disables STDERR being internally stored. This is useful for commands that produce huge amounts of output that you don’t need, that would otherwise be hogging memory if stored internally by sh.

_no_pipe

Added in version 1.07.0.

Default value: False

Similar to _no_out, this explicitly tells the sh command that it will never be used for piping its output into another command, so it should not fill its internal pipe buffer with the process’s output. This is also useful for conserving memory.

Program Arguments

These are options that affect how command options are fed into the program.

_long_sep

Added in version 1.12.0.

Default value: "="

This is the character(s) that separate a program’s long argument’s key from the value, when using kwargs to specify your program’s long arguments. For example, if your program expects a long argument in the form --name value, the way to achieve this would be to set _long_sep=" ".

import sh
sh.your_program(key=value, _long_sep=" ")

Would send the following list of arguments to your program:

["--key value"]

If your program expects the long argument name to be separate from its value, pass None into _long_sep instead:

import sh
sh.your_program(key=value, _long_sep=None)

Would send the following list of arguments to your program:

["--key", "value"]

_long_prefix

Added in version 1.12.0.

Default value: "--"

This is the character(s) that prefix a long argument for the program being run. Some programs use single dashes, for example, and do not understand double dashes.
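
The effect of the separator and prefix options on a single long argument can be summarised in a few lines. The helper format_long_arg below is hypothetical and only mirrors the behaviour described above for _long_sep and the long-argument prefix; it is not sh's implementation.

```python
def format_long_arg(key, value, long_sep="=", long_prefix="--"):
    """Hypothetical helper: build the argv entries for one long argument."""
    name = long_prefix + key.replace("_", "-")
    if value is True:
        return [name]  # a bare flag has no value to separate
    if long_sep is None:
        return [name, str(value)]  # name and value as separate arguments
    return [f"{name}{long_sep}{value}"]


print(format_long_arg("key", "value"))
print(format_long_arg("key", "value", long_sep=" "))
print(format_long_arg("key", "value", long_sep=None))
print(format_long_arg("key", "value", long_prefix="-"))
```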

_arg_preprocess

Added in version 1.12.0.

Default value: None

This is an advanced option that allows you to rewrite a command’s arguments on the fly, based on other command arguments, or some other variable. It is really only useful in conjunction with baking, and only currently used when constructing contrib wrappers.

Example:

import sh

def processor(args, kwargs):
    return args, kwargs

my_ls = sh.ls.bake(_arg_preprocess=processor)

Warning

The interface to the _arg_preprocess function may change without warning. It is generally only for internal sh use, so don’t use it unless you absolutely have to.

Misc

_log_msg

Default value: None

Added in version 1.12.0.

This allows for a custom logging header for Command Class instances. For example, the default logging looks like this:

import logging
import sh

logging.basicConfig(level=logging.INFO)

sh.ls("-l")

INFO:sh.command:<Command '/bin/ls -l'>: starting process
INFO:sh.command:<Command '/bin/ls -l', pid 28952>: process started
INFO:sh.command:<Command '/bin/ls -l', pid 28952>: process completed

People can find this <Command .. section long and not relevant. _log_msg allows you to customize this:

import logging
import sh

logging.basicConfig(level=logging.INFO)

def custom_log(ran, call_args, pid=None):
    return ran

sh.ls("-l", _log_msg=custom_log)

INFO:sh.command:/bin/ls -l: starting process
INFO:sh.command:/bin/ls -l: process started
INFO:sh.command:/bin/ls -l: process completed

The first argument, ran, is the program’s execution string and arguments, as close as we can get it to be how you’d type in the shell. call_args is a dictionary of all of the special kwargs that were passed to the command. And pid is the process id of the forked process. It defaults to None because the _log_msg callback is actually called twice: first to construct the logger for the RunningCommand Class instance, before the process itself is spawned, then a second time after the process is spawned via OProc Class, when we have a pid.

Architecture Overview

Launch

When it comes time to launch a process, sh performs the following steps:

Open pipes and/or TTYs for STDIN/OUT/ERR.
Open a pipe for communicating pre-exec exceptions from the child to the parent.
Open a pipe for child/parent launch synchronization.
os.fork() a child process.

From here, we have two concurrent processes running:

Child

If _bg=True is set, we ignore signal.SIGHUP.
If _new_session=True, become a session leader with os.setsid(), else become a process group leader with os.setpgrp().
Write our session id to a pipe connected to the parent. This is mainly to synchronize with our parent that our session/group logic has finished.
os.dup2() the file descriptors of our previously-setup TTYs/pipes to our STDIN/OUT/ERR file descriptors.
If we’re a session leader and our STDIN is a TTY, via _tty_in=True, acquire a controlling terminal, thereby becoming the controlling process of the session.
Set our GID/UID if we’ve set a custom one via _uid.
Close all file descriptors greater than STDERR.
Call os.execv().

Parent

Check for any exceptions via the exception pipe connected to the child.
Block and read our child’s session id from a pipe connected to the child. This synchronizes to us that the child has finished moving between sessions/groups and we can now accurately determine its current session id and process group.
If we’re using a TTY for STDIN, via _tty_in=True, disable echoing on the TTY, so that data sent to STDIN is not echoed to STDOUT.
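
A heavily simplified version of the launch sequence can be written with os.fork() directly. The sketch below keeps only the STDOUT pipe, the synchronization pipe, the session/group step, and the exec; it omits TTYs, the exception pipe, SIGHUP handling, and UID changes. The function launch is hypothetical and is not sh's code.

```python
import os


def launch(argv, new_session=False):
    """Simplified fork/exec launch; returns (pid, child session id, stdout read fd)."""
    out_r, out_w = os.pipe()    # pipe for the child's STDOUT
    sync_r, sync_w = os.pipe()  # pipe for child/parent launch synchronization

    pid = os.fork()
    if pid == 0:  # child
        try:
            os.close(out_r)
            os.close(sync_r)
            if new_session:
                os.setsid()   # become a session leader
            else:
                os.setpgrp()  # become a process group leader
            os.write(sync_w, str(os.getsid(0)).encode())
            os.close(sync_w)
            os.dup2(out_w, 1)  # our STDOUT now points at the pipe
            os.close(out_w)
            os.execv(argv[0], argv)
        finally:
            os._exit(127)  # only reached if something above failed

    # parent
    os.close(out_w)
    os.close(sync_w)
    with os.fdopen(sync_r, "rb") as sync:
        child_sid = int(sync.read())  # blocks until the child has reported
    return pid, child_sid, out_r


pid, sid, out_fd = launch(["/bin/echo", "hello"], new_session=True)
with os.fdopen(out_fd, "rb") as out:
    output = out.read()
_, status = os.waitpid(pid, 0)
exit_code = os.waitstatus_to_exitcode(status)
print(sid == pid, output, exit_code)
```

With new_session=True the child is its own session leader, so the session id it reports equals its pid.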

Running

An instance of OProc Class contains two internal threads, one for STDIN, and one for STDOUT and STDERR. The purpose of these threads is to handle reading/writing to the read/write ends of the process’s standard descriptors.

For example, the STDOUT/ERR thread continually runs select.select() on the master ends of the TTYs/pipes connected to STDOUT/ERR, and if they’re ready to read, reads the available data and aggregates it into the appropriate place.
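
The core of such a read loop can be sketched with select.select() over two pipes. For brevity this version runs in the calling thread rather than a dedicated one and uses subprocess to create the child; drain is a hypothetical function, not part of sh.

```python
import os
import select
import subprocess
import sys


def drain(proc):
    """Read STDOUT and STDERR as they become ready, until both reach EOF."""
    out_fd, err_fd = proc.stdout.fileno(), proc.stderr.fileno()
    buffers = {out_fd: bytearray(), err_fd: bytearray()}
    open_fds = set(buffers)
    while open_fds:
        ready, _, _ = select.select(list(open_fds), [], [])
        for fd in ready:
            chunk = os.read(fd, 4096)
            if chunk:
                buffers[fd] += chunk
            else:  # an empty read means the writer closed its end
                open_fds.discard(fd)
    proc.wait()
    return bytes(buffers[out_fd]), bytes(buffers[err_fd])


child = [
    sys.executable,
    "-c",
    "import sys; print('to out'); print('to err', file=sys.stderr)",
]
with subprocess.Popen(child, stdout=subprocess.PIPE, stderr=subprocess.PIPE) as proc:
    out, err = drain(proc)
print(out, err)
```

The loop keeps reading after the child has exited, until each descriptor reports EOF, so data still sitting in the pipe buffers is not lost.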

A couple of different buffers must be considered when thinking about how data flows through an sh process.

The first buffer is the buffer associated with the underlying pipe or TTY attached to STDOUT/ERR. In the case of a TTY (the default for output), the buffer size is 0, so output is immediate – a byte written by the process is a byte received by sh. For a pipe, however, the buffer size of the pipe is typically 4-64kb. See pipe(2).

See also

FAQ: Why is _tty_out=True the default?

The second buffer is sh’s internal buffers, one for STDOUT and one for STDERR. These buffers aggregate data that has been read from the master end of the TTY or pipe attached to the output fd, but before that data is sent along to the appropriate output handler (queue, file object, function, etc). Data sits in these buffers until we reach the size specified with _internal_bufsize, at which point the buffer flushes to the output handler.

Exit

On process completion, our internal threads must complete, as the read end of STDIN, for example, which is connected to the process, is no longer open, so writing to the slave end will no longer work.

The STDOUT/ERR thread is a little more complicated, because although the process is not alive, output data may still exist in the pipe/TTY buffer that must be collected. So we essentially just select.select() on the read ends until they return nothing, indicating that they are complete, then we break out of our read loop.

The exit code is obtained from the reaped process. If the process ended from a signal, the exit code is the negative value of that signal. For example, SIGKILL would result in an exit code -9.
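
The same convention (a negative number for death by signal) is what the standard library's os.waitstatus_to_exitcode() produces from a reaped child's wait status. The sketch below uses raw os.fork() children purely for illustration; reap is a hypothetical helper.

```python
import os
import signal
import time


def reap(pid):
    """Wait for a child and return its exit code; negative means killed by a signal."""
    _, status = os.waitpid(pid, 0)
    return os.waitstatus_to_exitcode(status)


pid = os.fork()
if pid == 0:  # child: sleep until killed
    time.sleep(30)
    os._exit(0)
os.kill(pid, signal.SIGKILL)
killed_code = reap(pid)

pid = os.fork()
if pid == 0:  # child: exit normally with code 3
    os._exit(3)
exited_code = reap(pid)

print(killed_code, exited_code)
```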

If specified, the _done callback is executed with the RunningCommand instance, a boolean indicating success, and the adjusted exit code. After the callback returns, error processing continues. In other words, the done callback is called regardless of success or failure, and there’s nothing it can do to prevent the ErrorReturnCode exceptions from being raised after it completes.

API

Command Class

The Command class represents a program that exists on the system and can be run at some point in time. An instance of Command is never running; an instance of RunningCommand is spawned for that.

An instance of Command can take the form of a manually instantiated object, or as an object instantiated by dynamic lookup:

import sh

ls1 = sh.Command("ls")
ls2 = sh.ls

assert ls1 == ls2

class Command(name, search_paths=None)

Instantiates a Command instance, where name is the name of a program that exists on the user’s $PATH, or is a full path itself. If search_paths is specified, it must be a list of all the paths to look for the program name.

from sh import Command

ifconfig = Command("ifconfig")
ifconfig = Command("/sbin/ifconfig")

Command.bake(*args, **kwargs)

Returns a new Command with *args and **kwargs baked in as positional and keyword arguments, respectively. Any future calls to the returned Command will include *args and **kwargs automatically:

from sh import ls

long_ls = ls.bake("-l")
print(long_ls("/var"))
print(long_ls("/tmp"))

See also

Baking

Similarly, the arguments to sh.Command must be kept separate from the program name. For example, the following does not work:

lscmd = sh.Command("/bin/ls -l")
tarcmd = sh.Command("/bin/tar cvf /tmp/test.tar /my/home/directory/")

You will run into a CommandNotFound(path) exception even when the correct full path is specified. The correct way to do this is to:

build Command object using only the binary
pass the arguments to the object when invoking

as follows:

lscmd = sh.Command("/bin/ls")
lscmd("-l")
tarcmd = sh.Command("/bin/tar")
tarcmd("cvf", "/tmp/test.tar", "/my/home/directory/")
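If you start from a single command-line string, one way to get the separate pieces is the standard library's shlex.split. This is a sketch, not something sh provides: split_command and run_command_line are hypothetical helpers, and the sh import sits inside the function so that the splitting logic can be tried on its own.

```python
import shlex


def split_command(line):
    """Split a shell-style command line into (program, [arguments])."""
    parts = shlex.split(line)
    if not parts:
        raise ValueError("empty command line")
    return parts[0], parts[1:]


program, args = split_command("/bin/tar cvf /tmp/test.tar /my/home/directory/")
print(program)  # /bin/tar
print(args)     # ['cvf', '/tmp/test.tar', '/my/home/directory/']


def run_command_line(line):
    # sh is imported here so split_command can be used without it installed.
    import sh

    program, args = split_command(line)
    # Build the Command from the binary only, then pass the arguments on invocation.
    return sh.Command(program)(*args)
```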

RunningCommand Class

This represents a Command instance that has been or is being executed. It exists as a wrapper around the low-level OProc. Most of your interaction with sh objects is with instances of this class. It is only returned if _return_cmd=True when you execute a command.

Warning

Objects of this class behave very much like strings. This was an intentional design decision to make the “output” of an executing Command behave more intuitively.

Be aware that functions that accept only real strings, for example json.dumps, will not work on instances of RunningCommand, even though they look like strings.

RunningCommand.wait(timeout=None)

Parameters: timeout – An optional non-negative number to wait for the command to complete. If it doesn’t complete by the timeout, we raise TimeoutException.

Block and wait for the command to finish execution and obtain an exit code. If the exit code represents a failure, we raise the appropriate exception. See exceptions.

Note

Calling this method multiple times only yields an exception on the first call.

This is called automatically by sh unless your command is being executed asynchronously, in which case, you may want to call this manually to ensure completion.

If an instance of Command is being used as the stdin argument (see piping), wait() is also called on that instance, and any exceptions resulting from that process are propagated up.

RunningCommand.process: The underlying OProc instance.

RunningCommand.stdout: A @property that calls wait() and then returns the contents of what the process wrote to stdout.

RunningCommand.stderr: A @property that calls wait() and then returns the contents of what the process wrote to stderr.

RunningCommand.exit_code: A @property that calls wait() and then returns the process’s exit code.

RunningCommand.pid: The process id of the process.

RunningCommand.sid: The session id of the process. This will typically be a different session than the current python process, unless _new_session=False was specified.

RunningCommand.pgid: The process group id of the process.

RunningCommand.ctty: The controlling terminal device, if there is one.

RunningCommand.signal(sig_num): Sends sig_num to the process. Typically used with a value from the signal module, like signal.SIGHUP (see signal(7)).

RunningCommand.signal_group(sig_num): Sends sig_num to every process in the process group. Typically used with a value from the signal module, like signal.SIGHUP (see signal(7)).

RunningCommand.terminate(): Shortcut for RunningCommand.signal(signal.SIGTERM).

RunningCommand.kill(): Shortcut for RunningCommand.signal(signal.SIGKILL).

RunningCommand.kill_group(): Shortcut for RunningCommand.signal_group(signal.SIGKILL).

RunningCommand.is_alive()

Returns whether or not the process is still alive.

Return type: bool
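The difference between signalling one process and signalling its whole process group, as signal_group and kill_group do, can be illustrated with the standard library. The sketch below is an analogy that uses subprocess and os.killpg directly; it makes no claim about how sh implements these methods.

```python
import os
import signal
import subprocess
import sys

# start_new_session=True makes the child the leader of a fresh session
# and process group, so its process group id equals its pid.
child = subprocess.Popen(
    [sys.executable, "-c", "import time; time.sleep(60)"],
    start_new_session=True,
)
pgid = os.getpgid(child.pid)

# Signal every process in the group, not just the child itself.
os.killpg(pgid, signal.SIGKILL)
child.wait()
print(pgid == child.pid, child.returncode == -signal.SIGKILL)
```

Group signalling matters when the command spawns its own children, since signalling only the parent can leave those children running.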

OProc Class

Warning

Don’t use instances of this class directly. It is being documented here for posterity, not for direct use.

OProc.wait(): Block until the process completes, aggregate the output, and populate OProc.exit_code.

OProc.stdout: A collections.deque, sized to _internal_bufsize items, that contains the process’s STDOUT.

OProc.stderr: A collections.deque, sized to _internal_bufsize items, that contains the process’s STDERR.

OProc.exit_code: Contains the process’s exit code, or None if the process has not yet exited.

OProc.pid: The process id of the process.

OProc.sid: The session id of the process. This will typically be a different session than the current python process, unless _new_session=False was specified.

OProc.pgid: The process group id of the process.

OProc.ctty: The controlling terminal device, if there is one.

OProc.signal(sig_num): Sends sig_num to the process. Typically used with a value from the signal module, like signal.SIGHUP (see signal(7)).

OProc.signal_group(sig_num): Sends sig_num to every process in the process group. Typically used with a value from the signal module, like signal.SIGHUP (see signal(7)).

OProc.terminate(): Shortcut for OProc.signal(signal.SIGTERM).

OProc.kill(): Shortcut for OProc.signal(signal.SIGKILL).

OProc.kill_group(): Shortcut for OProc.signal_group(signal.SIGKILL).

Exceptions

class ErrorReturnCode: This is the base class for, as the name suggests, error return codes. It subclasses Exception.

ErrorReturnCode.full_cmd: The full command that was executed, as a string, so that you can try it on the commandline if you wish.

ErrorReturnCode.stdout: The total aggregated STDOUT for the process.

ErrorReturnCode.stderr: The total aggregated STDERR for the process.

ErrorReturnCode.exit_code: The process’s adjusted exit code.

See also

Exit Code Processing

class SignalException: Subclasses ErrorReturnCode. Raised when a command receives a signal that causes it to exit.

class TimeoutException: Raised when a command specifies a non-null _timeout and the command times out:

import sh

try:
    sh.sleep(10, _timeout=1)
except sh.TimeoutException:
    print("we timed out, as expected")

Also raised when you specify a timeout to RunningCommand.wait(timeout=None):

import sh

p = sh.sleep(10, _bg=True)
try:
    p.wait(timeout=1)
except sh.TimeoutException:
    print("we timed out waiting")
    p.kill()

class CommandNotFound: This exception is raised in one of the following conditions:

The program cannot be found on your path.
You do not have permissions to execute the program.
The program is not marked executable.

The last two bullets may seem strange, but they fall in line with how a shell like Bash behaves when looking up a program to execute.

Note

CommandNotFound subclasses AttributeError. As such, the repr of it is simply the name of the missing attribute.

Helper Functions

which(name, search_paths=None): Resolves name to the program’s absolute path, or returns None if it cannot be found. If search_paths is a list of paths, that list is used to look for the program; otherwise the environment variable $PATH is used.
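The lookup rules described for CommandNotFound and which (the program must exist, and you must be able to execute it) can be sketched as follows. This is an illustration of the idea, not sh's implementation. The name find_program is hypothetical, and the sketch ignores names that already contain a path separator.

```python
import os
import stat
import tempfile


def find_program(name, search_paths=None):
    """Return the absolute path of the first executable file called name, else None."""
    if search_paths is None:
        search_paths = os.environ.get("PATH", "").split(os.pathsep)
    for directory in search_paths:
        candidate = os.path.join(directory, name)
        # Skip entries that are missing, are not regular files,
        # or that we are not allowed to execute.
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return os.path.abspath(candidate)
    return None


with tempfile.TemporaryDirectory() as bin_dir:
    runnable = os.path.join(bin_dir, "runnable")
    plain = os.path.join(bin_dir, "plain")
    for path in (runnable, plain):
        with open(path, "w") as handle:
            handle.write("#!/bin/sh\n")
    os.chmod(runnable, stat.S_IRWXU)               # executable by the owner
    os.chmod(plain, stat.S_IRUSR | stat.S_IWUSR)   # no execute bit

    found = find_program("runnable", [bin_dir])
    not_executable = find_program("plain", [bin_dir])
    missing = find_program("does-not-exist", [bin_dir])

print(found, not_executable, missing)
```

A file without the execute bit is treated the same as a file that does not exist.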

pushd(directory)

This function provides a with context that behaves similarly to Bash’s pushd by pushing to the provided directory, and popping out of it at the end of the context.

import sh

with sh.pushd("/tmp"):
    sh.touch("a_file")

Note

We use a reentrant lock, so that different threads using this function will have the correct behavior inside of their with contexts.
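To make the role of the reentrant lock concrete, here is a standard-library sketch of a pushd-like context manager. It is an assumption about the general shape of such a helper, not a copy of sh's code. The names pushd_sketch and _cwd_lock are hypothetical.

```python
import contextlib
import os
import tempfile
import threading

_cwd_lock = threading.RLock()


@contextlib.contextmanager
def pushd_sketch(directory):
    """Change into directory for the with block, then restore the old one."""
    # The reentrant lock lets one thread nest these contexts while
    # keeping other threads from changing the process-wide cwd meanwhile.
    with _cwd_lock:
        previous = os.getcwd()
        os.chdir(directory)
        try:
            yield
        finally:
            os.chdir(previous)


start = os.getcwd()
with tempfile.TemporaryDirectory() as outer:
    expected = os.path.realpath(outer)
    with pushd_sketch(outer):
        inside = os.path.realpath(os.getcwd())
        with pushd_sketch(os.sep):  # nesting works because the lock is reentrant
            nested = os.getcwd()
        back_in_outer = os.path.realpath(os.getcwd())
after = os.getcwd()
```

With a plain threading.Lock instead of an RLock, the nested with block would deadlock, because the same thread would try to acquire a lock it already holds.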


`sh.cd` builtin removed