The process_utils Module

This module provides functions for process management.

These are the main sections of this module:
Asynchronous Process Utilities
There is a function and class which can be used together with your custom trollius or asyncio run loop.
The osrf_pycommon.process_utils.async_execute_process() function is a coroutine which allows you to run a process and get the output back bit by bit in real-time, either with stdout and stderr separated or combined.

This function also allows you to emulate the terminal using a pty simply by toggling a flag in the parameters.

Alongside this coroutine is a Protocol class, osrf_pycommon.process_utils.AsyncSubprocessProtocol, from which you can inherit in order to customize how the yielded output is handled.

Because this coroutine is built on the trollius/asyncio framework's subprocess functions, it is portable and should behave the same on all major operating systems (including Windows, where an IOCP implementation is used).
osrf_pycommon.process_utils.async_execute_process(protocol_class, cmd=None, cwd=None, env=None, shell=False, emulate_tty=False, stderr_to_stdout=True)

Coroutine to execute a subprocess and yield the output back asynchronously.
This function is meant to be used with the Python asyncio module, which is available via pip with Python 3.3 and built-in to Python 3.4. On Python >= 2.6 you can use the trollius module to get the same functionality, but without using the new yield from syntax.

Here is an example of how to use this function:

    import asyncio
    from osrf_pycommon.process_utils import async_execute_process
    from osrf_pycommon.process_utils import AsyncSubprocessProtocol
    from osrf_pycommon.process_utils import get_loop

    @asyncio.coroutine
    def setup():
        transport, protocol = yield from async_execute_process(
            AsyncSubprocessProtocol, ['ls', '/usr'])
        returncode = yield from protocol.complete
        return returncode

    retcode = get_loop().run_until_complete(setup())
    get_loop().close()
That same example using trollius would look like this:

    import trollius as asyncio
    from osrf_pycommon.process_utils import async_execute_process
    from osrf_pycommon.process_utils import AsyncSubprocessProtocol
    from osrf_pycommon.process_utils import get_loop

    @asyncio.coroutine
    def setup():
        transport, protocol = yield asyncio.From(async_execute_process(
            AsyncSubprocessProtocol, ['ls', '/usr']))
        returncode = yield asyncio.From(protocol.complete)
        raise asyncio.Return(returncode)

    retcode = get_loop().run_until_complete(setup())
    get_loop().close()
This difference is required because in Python < 3.3 the yield from syntax is not valid.

In both examples, the first argument is the default protocol class, AsyncSubprocessProtocol, which simply prints output from stdout to stdout and output from stderr to stderr.

If you want to capture and do something with the output, or write to the stdin, then you need to subclass from the AsyncSubprocessProtocol class and override the on_stdout_received, on_stderr_received, and on_process_exited functions.

See the documentation for the AsyncSubprocessProtocol class for more details, but here is an example which uses asyncio from Python 3.4:

    import asyncio
    from osrf_pycommon.process_utils import async_execute_process
    from osrf_pycommon.process_utils import AsyncSubprocessProtocol
    from osrf_pycommon.process_utils import get_loop

    class MyProtocol(AsyncSubprocessProtocol):
        def __init__(self, file_name, **kwargs):
            self.fh = open(file_name, 'w')
            AsyncSubprocessProtocol.__init__(self, **kwargs)

        def on_stdout_received(self, data):
            # Data has line endings intact, but is bytes in Python 3
            self.fh.write(data.decode('utf-8'))

        def on_stderr_received(self, data):
            self.fh.write(data.decode('utf-8'))

        def on_process_exited(self, returncode):
            self.fh.write("Exited with return code: {0}".format(returncode))
            self.fh.close()

    @asyncio.coroutine
    def log_command_to_file(cmd, file_name):
        def create_protocol(**kwargs):
            return MyProtocol(file_name, **kwargs)
        transport, protocol = yield from async_execute_process(
            create_protocol, cmd)
        returncode = yield from protocol.complete
        return returncode

    get_loop().run_until_complete(
        log_command_to_file(['ls', '/'], '/tmp/out.txt'))
    get_loop().close()
See the subprocess.Popen class for more details on some of the parameters to this function, like cwd, env, and shell.

See the osrf_pycommon.process_utils.execute_process() function for more details on the emulate_tty parameter.

Parameters:
- protocol_class (AsyncSubprocessProtocol or a subclass) – protocol class which handles subprocess callbacks
- cmd (list) – list of arguments where the executable is the first item
- cwd (str) – directory in which to run the command
- env (dict) – a dictionary of environment variable names to values
- shell (bool) – if True, the cmd variable is interpreted by the shell
- emulate_tty (bool) – if True, pty's are passed to the subprocess for stdout and stderr, see osrf_pycommon.process_utils.execute_process()
- stderr_to_stdout (bool) – if True, stderr is directed to stdout, so they are not captured separately
class osrf_pycommon.process_utils.AsyncSubprocessProtocol(stdin=None, stdout=None, stderr=None)

Protocol to subclass to get events from async_execute_process().

When subclassing this Protocol class, you should override these functions:

    def on_stdout_received(self, data):
        # ...

    def on_stderr_received(self, data):
        # ...

    def on_process_exited(self, returncode):
        # ...
By default these functions just print the data received from stdout and stderr, and do nothing when the process exits.
Data received by the on_stdout_received and on_stderr_received functions is always bytes (str in Python 2 and bytes in Python 3). Therefore, it may be necessary to call .decode() on the data before printing it to the screen.

Additionally, the data received will not be stripped of newlines, so take that into consideration when printing the result.
You can also override these less commonly used functions:
    def on_stdout_open(self):
        # ...

    def on_stdout_close(self, exc):
        # ...

    def on_stderr_open(self):
        # ...

    def on_stderr_close(self, exc):
        # ...
These functions are called when stdout/stderr are opened and closed, and can be useful when using pty's, for example. The exc parameter of the *_close functions is None unless there was an exception.

In addition to the overridable functions, this class has a few useful public attributes. The stdin attribute is a reference to the PipeProto which follows the asyncio.WriteTransport interface. The stdout and stderr attributes also reference their PipeProto. The complete attribute is an asyncio.Future which is set to complete when the process exits, and its result is the return code.

The complete attribute can be used like this:

    import asyncio
    from osrf_pycommon.process_utils import async_execute_process
    from osrf_pycommon.process_utils import AsyncSubprocessProtocol
    from osrf_pycommon.process_utils import get_loop

    @asyncio.coroutine
    def setup():
        transport, protocol = yield from async_execute_process(
            AsyncSubprocessProtocol, ['ls', '-G', '/usr'])
        retcode = yield from protocol.complete
        print("Exited with", retcode)

    # This will block until the protocol.complete Future is done.
    get_loop().run_until_complete(setup())
    get_loop().close()
In addition to these functions, there is a utility function for getting the correct asyncio event loop:
Treatment of File Descriptors
Unlike subprocess.Popen, all of the process_utils functions behave the same way on Python versions 2.7 through 3.4, and they do not close inheritable file descriptors (https://docs.python.org/3.4/library/os.html#fd-inheritance) before starting subprocesses. This is equivalent to passing close_fds=False to subprocess.Popen on all Python versions.
In Python 3.2, the subprocess.Popen default for the close_fds option changed from False to True, so that file descriptors opened by the parent process are closed before spawning the child process. In Python 3.4, PEP 0446 additionally made it so that non-inheritable file descriptors are closed before spawning the subprocess even when close_fds=False.
If you want to be able to pass file descriptors to subprocesses in Python 3.4 or higher, you will need to make sure they are inheritable (https://docs.python.org/3.4/library/os.html#fd-inheritance).
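As a minimal sketch of what that means in practice (assuming Python 3.4+, where the stdlib os.get_inheritable() and os.set_inheritable() functions exist):

```python
import os

# On Python 3.4+, pipe file descriptors are created non-inheritable by
# default, so a child process would not see them even with
# close_fds=False, unless you mark them inheritable first.
r, w = os.pipe()
print(os.get_inheritable(r))  # → False (the Python 3.4+ default)

# Explicitly mark the read end inheritable before spawning a subprocess.
os.set_inheritable(r, True)
print(os.get_inheritable(r))  # → True

os.close(r)
os.close(w)
```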
Synchronous Process Utilities
For synchronous execution and output capture of subprocesses, there are two functions:
These functions are not yet using the trollius/asyncio framework as a back-end, and therefore on Windows they will not stream the data from the subprocess as they do on Unix machines. Instead, no data will be yielded until the subprocess is finished and all output is buffered (the normal warnings about long-running programs with lots of output apply).
The streaming of output does not work on Windows because there the select.select() method only works on sockets, not on the file-like objects used with subprocess pipes. asyncio implements Windows subprocess support with a Proactor event loop based on Windows' IOCP API. One future option is to implement this synchronous style method using IOCP in this module; another is to simply run the asynchronous calls synchronously, but there are issues with that as well. In the meantime, if you need streaming of output on both Windows and Unix, use the asynchronous calls.
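The select.select() limitation mentioned above can be seen in a minimal sketch: polling a connected socket pair works on every platform, which is exactly what pipe handles cannot offer on Windows (the socketpair() call itself is Unix-only before Python 3.5 on Windows):

```python
import select
import socket

# Create a connected pair of sockets and write to one end.
a, b = socket.socketpair()
b.send(b'ping')

# select.select() accepts sockets on all platforms; wait up to 1 second
# for the other end to become readable.
readable, _, _ = select.select([a], [], [], 1.0)
print(a in readable)  # → True
print(a.recv(4))      # → b'ping'

a.close()
b.close()
```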
osrf_pycommon.process_utils.execute_process(cmd, cwd=None, env=None, shell=False, emulate_tty=False)

Executes a command with arguments and returns output line by line.
All arguments, except emulate_tty, are passed directly to subprocess.Popen.

execute_process returns a generator which yields the output, line by line, until the subprocess finishes, at which point the return code is yielded.

This is an example of how this function should be used:

    from __future__ import print_function
    from osrf_pycommon.process_utils import execute_process

    cmd = ['ls', '-G']
    for line in execute_process(cmd, cwd='/usr'):
        if isinstance(line, int):
            # This is a return code, the command has exited
            print("'{0}' exited with: {1}".format(' '.join(cmd), line))
            continue  # break would also be appropriate here
        # In Python 3, it will be a bytes array which needs to be decoded
        if not isinstance(line, str):
            line = line.decode('utf-8')
        # Then print it to the screen
        print(line, end='')
stdout and stderr are always captured together and returned line by line through the returned generator. Newline characters are preserved in the output, so if re-printing the data take care to use end='' or first rstrip the output lines.

When emulate_tty is used on Unix systems, commands will identify that they are on a tty and should output color to the screen as if you were running them on the terminal, so there should not be any need to pass arguments like -c color.ui=always to commands like git. Additionally, programs might behave differently when emulate_tty is being used; for example, Python defaults to unbuffered output when it detects a tty.

emulate_tty works by using pseudo-terminals on Unix machines, so if you are running this command many times in parallel (like hundreds of times) then you may get one of a few different OSError's. For example, "OSError: [Errno 24] Too many open files: '/dev/ttyp0'" or "OSError: out of pty devices". You should also be aware that you share pty devices with the rest of the system, so even if you are not using a lot, it is possible to get this error. You can catch this error before getting data from the generator, so when using emulate_tty you might want to do something like this:

    from __future__ import print_function
    from osrf_pycommon.process_utils import execute_process

    cmd = ['ls', '-G', '/usr']
    try:
        output = execute_process(cmd, emulate_tty=True)
    except OSError:
        output = execute_process(cmd, emulate_tty=False)
    for line in output:
        if isinstance(line, int):
            print("'{0}' exited with: {1}".format(' '.join(cmd), line))
            continue
        # In Python 3, it will be a bytes array which needs to be decoded
        if not isinstance(line, str):
            line = line.decode('utf-8')
        print(line, end='')
This way, if a pty cannot be opened in order to emulate the tty, you can try again without emulation, while any other OSError is raised again with emulate_tty set to False. Obviously, you only want to do this if emulating the tty is non-critical to your processing, like when you are using it to capture color.

Any color information that the command outputs as ANSI escape sequences is captured by this command. That way you can print the output to the screen and preserve the color formatting.

If you do not want color in the output, then try setting emulate_tty to False, but that does not guarantee there is no color in the output; it only causes called processes to identify that they are not being run in a terminal. Most well-behaved programs will not output color if they detect that they are not being executed in a terminal, but you shouldn't rely on that.

If you want to ensure there is no color in the output from an executed process, then use this function: osrf_pycommon.terminal_color.remove_ansi_escape_senquences()
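For illustration only (the library provides osrf_pycommon.terminal_color.remove_ansi_escape_senquences() for this; the regex and the strip_ansi helper below are a hypothetical stdlib sketch covering SGR color sequences, not the library's implementation):

```python
import re

# Matches SGR (color/style) escape sequences such as '\x1b[31m' and '\x1b[0m'.
ANSI_SGR_RE = re.compile(r'\x1b\[[0-9;]*m')

def strip_ansi(text):  # hypothetical helper, not part of the library
    return ANSI_SGR_RE.sub('', text)

colored = '\x1b[31mred text\x1b[0m plain'
print(strip_ansi(colored))  # → red text plain
```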
Exceptions can be raised by functions called by the implementation; for example, subprocess.Popen can raise an OSError when the given command is not found. If you want to check for the existence of an executable on the PATH, see: which(). However, this function itself does not raise any special exceptions.

Parameters:
- cmd (list) – list of strings with the first item being a command and subsequent items being any arguments to that command; passed directly to subprocess.Popen
- cwd (str) – path in which to run the command; defaults to None, which means os.getcwd() is used; passed directly to subprocess.Popen
- env (dict) – environment dictionary to use for executing the command; default is None, which uses the os.environ environment; passed directly to subprocess.Popen
- shell (bool) – if True, the system shell is used to evaluate the command; default is False; passed directly to subprocess.Popen
- emulate_tty (bool) – if True, attempts to use a pty to convince subprocesses that they are being run in a terminal; typically this is useful for capturing colorized output from commands; this does not work on Windows (no pty's), so it is considered False even when True; defaults to False

Returns: a generator which yields output from the command line by line

Return type: generator which yields strings
Availability: Unix (streaming), Windows (blocking)
osrf_pycommon.process_utils.execute_process_split(cmd, cwd=None, env=None, shell=False, emulate_tty=False)

Like execute_process(), except stderr is returned separately.

Instead of yielding output line by line until yielding a return code, this function always yields a triplet of stdout, stderr, and return code. Each time, only one of the three will not be None. Once you receive a non-None return code (its type will be int) there will be no more stdout or stderr. Therefore you can use the command like this:

    from __future__ import print_function
    import sys
    from osrf_pycommon.process_utils import execute_process_split

    cmd = ['time', 'ls', '-G']
    for out, err, ret in execute_process_split(cmd, cwd='/usr'):
        # In Python 3, it will be a bytes array which needs to be decoded
        out = out.decode('utf-8') if out is not None else None
        err = err.decode('utf-8') if err is not None else None
        if ret is not None:
            # This is a return code, the command has exited
            print("'{0}' exited with: {1}".format(' '.join(cmd), ret))
            break
        if out is not None:
            print(out, end='')
        if err is not None:
            print(err, end='', file=sys.stderr)
When using this function, it is possible that the stdout and stderr data are returned in a different order than they would appear on the terminal. This is because the subprocess is given different buffers for stdout and stderr, so there is a race condition between the subprocess writing to the different buffers and this command reading them. This can be avoided in most scenarios by using emulate_tty, because of the use of pty's, though the ordering still cannot be guaranteed and the number of pty's is finite, as explained in the documentation for execute_process(). For situations where output ordering between stdout and stderr is critical, they should not be returned separately; instead they should share one buffer, and so execute_process() should be used.

For all other parameters and documentation see: execute_process()
Availability: Unix (streaming), Windows (blocking)
Utility Functions
Currently there is only one utility function, a Python implementation of the which shell command.
osrf_pycommon.process_utils.which(cmd, mode=1, path=None, **kwargs)

Given a command, mode, and a PATH string, return the path which conforms to the given mode on the PATH, or None if there is no such file.

mode defaults to os.F_OK | os.X_OK. path defaults to the result of os.environ.get("PATH"), or can be overridden with a custom search path.

Backported from shutil.which() (https://docs.python.org/3.3/library/shutil.html#shutil.which), available in Python 3.3.
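Since this function is a backport of shutil.which(), a small sketch of the same lookup can be shown against the stdlib version on Python 3.3+ (the temporary directory and the 'mytool' name are illustrative; on Unix the executable bit satisfies the default os.F_OK | os.X_OK mode):

```python
import os
import shutil
import stat
import tempfile

# Create a small executable script in a temporary directory.
tmpdir = tempfile.mkdtemp()
exe = os.path.join(tmpdir, 'mytool')
with open(exe, 'w') as f:
    f.write('#!/bin/sh\necho hello\n')
# Mark it executable so it passes the default os.F_OK | os.X_OK check.
os.chmod(exe, os.stat(exe).st_mode | stat.S_IXUSR)

# Search a custom path instead of os.environ["PATH"].
print(shutil.which('mytool', path=tmpdir))        # → path to 'mytool'
print(shutil.which('no-such-tool', path=tmpdir))  # → None
```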