Linguist 278: Programming for Linguists
Stanford Linguistics, Fall 2021
Christopher Potts
The subprocess library allows you to include calls to other command-line utilities inside your Python programs.
import subprocess

def notebook2html(filename):
    """Converts a notebook file to HTML using `nbconvert`

    Parameters
    ----------
    filename : str
        Full path to the notebook file to convert.

    Writes
    ------
    A new file with the same basename as `filename`, but
    with the `.ipynb` extension replaced by `.html`.

    """
    # The command is a list of strings, one per command-line argument.
    cmd = ["jupyter", "nbconvert", "--to", "html", filename]
    subprocess.run(cmd)

notebook2html("ling278_class17_subprocess.ipynb")
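As written, `notebook2html` does not tell us whether the command succeeded. Here is a minimal sketch of one way to check. The helper name `run_checked` is made up for illustration, and the examples call the current Python interpreter (`sys.executable`) so that they run even without Jupyter installed:

```python
import subprocess
import sys

def run_checked(cmd):
    """Run `cmd` (a list of str) and return True if it succeeded.

    `run_checked` is a hypothetical helper, not part of `subprocess`.
    """
    try:
        # With check=True, a nonzero exit code raises CalledProcessError.
        subprocess.run(cmd, check=True)
    except subprocess.CalledProcessError as err:
        print(f"Command exited with code {err.returncode}: {cmd}")
        return False
    except FileNotFoundError:
        # Raised when the program named in cmd[0] cannot be found.
        print(f"Command not found: {cmd[0]}")
        return False
    return True

# A command that succeeds, and one that exits with code 3:
run_checked([sys.executable, "-c", "pass"])
run_checked([sys.executable, "-c", "import sys; sys.exit(3)"])
```

The same pattern should carry over to the `jupyter nbconvert` command above by passing that command list in place of the examples.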
import subprocess

def capture_ls(dirname="."):
    """Use the `ls` utility to list the contents of `dirname`, then parse that
    output into a list.

    Parameters
    ----------
    dirname : str
        Directory to list. Default is the current directory.

    Returns
    -------
    list of str

    """
    cmd = ["ls", dirname]
    # Capture standard output instead of letting it print to the screen.
    proc = subprocess.run(cmd, stdout=subprocess.PIPE)
    # The captured output is a bytes object, so decode it before splitting.
    b = proc.stdout
    return b.decode('utf8').splitlines()

capture_ls()
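In Python 3.7 and later, `subprocess.run` also accepts `capture_output=True` and `text=True`, which capture the output and decode it to `str` for us. The sketch below is a variant of the approach above under that assumption. The name `capture_lines` is made up for illustration, and the example runs the Python interpreter rather than `ls` so that it does not depend on the operating system:

```python
import subprocess
import sys

def capture_lines(cmd):
    """Run `cmd` (a list of str) and return its standard output as a
    list of lines. `capture_lines` is a hypothetical helper name.
    """
    # capture_output=True collects stdout and stderr;
    # text=True decodes them to str using the default encoding.
    proc = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return proc.stdout.splitlines()

lines = capture_lines([sys.executable, "-c", "print('a'); print('b')"])
print(lines)
```

One difference from `capture_ls`: `text=True` uses the system's default encoding rather than always decoding as UTF-8, so the explicit `decode('utf8')` version may be the safer choice when the encoding matters.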
Pandoc is a command-line tool with a lot of functionality for converting documents between formats. Check out the demos. For the most part, the syntax of the commands is uniform, and pandoc uses the file extensions to infer the type of the input and the type of the output. So we can use subprocess to write a basic converter:
def convert_with_pandoc(src_filename, output_filename):
    # -s asks pandoc for a standalone document; -o names the output file.
    cmd = ["pandoc", "-s", src_filename, "-o", output_filename]
    subprocess.run(cmd)

convert_with_pandoc("ling278_class17_subprocess.html", "ling278_class17_subprocess.docx")
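The converter above assumes that pandoc is installed and on the system path. Here is a rough sketch of a more cautious variant that checks first, using `shutil.which` from the standard library. The function name `convert_if_available` and its `tool` parameter are made up for illustration:

```python
import shutil
import subprocess

def convert_if_available(src_filename, output_filename, tool="pandoc"):
    """Convert `src_filename` to `output_filename`, but only if `tool`
    can be found on the system path. Returns True on success.

    `convert_if_available` is a hypothetical helper name.
    """
    # shutil.which returns None when the program cannot be found.
    if shutil.which(tool) is None:
        print(f"Could not find {tool}; is it installed?")
        return False
    cmd = [tool, "-s", src_filename, "-o", output_filename]
    proc = subprocess.run(cmd)
    # By convention, an exit code of 0 means the command succeeded.
    return proc.returncode == 0
```

Calling `convert_if_available("ling278_class17_subprocess.html", "ling278_class17_subprocess.docx")` should then behave like `convert_with_pandoc` when pandoc is present, and print a message instead of raising an error when it is not.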