🌌 Comprehensive Documentation and Implementation Guide for nbconvert and papermill
🌟 1. Introduction
nbconvert is a pivotal tool within the Jupyter ecosystem, primarily serving to transform Jupyter notebooks, which are inherently interactive environments, into a diverse array of static output formats 1. This capability addresses the common need to share, publish, or present the content of these notebooks in formats that are more suitable for different audiences or platforms 1. The fundamental principle behind nbconvert is to render the dynamic and computational aspects of a notebook into a fixed, presentable form. For instance, researchers might use nbconvert to generate PDF versions of their notebooks for inclusion in academic publications, while data scientists could convert notebooks to HTML for easy sharing via the web or embedding in blog posts 1.
papermill, on the other hand, is designed for the programmatic execution and parameterization of Jupyter notebooks 3. Its core strength is automating notebook-based workflows such as batch data analysis, scheduled report generation, and systematic experimentation in fields like data science and machine learning 3. Instead of requiring manual intervention to run notebooks with varying configurations, papermill lets users define parameters and then execute notebooks through scripts or other automated processes 3. In summary, nbconvert focuses on transforming notebooks into static formats for dissemination, while papermill centers on the automated, parameterized execution of notebooks for computational tasks 1. Together, they provide a powerful set of tools for managing and leveraging Jupyter notebooks in a wide range of applications.
nbconvert supports an extensive list of conversion targets, including HTML, LaTeX, PDF, Markdown, reStructuredText, Reveal.js slideshows for presentations, and executable scripts in the language of the notebook’s kernel 1. This broad support makes it adaptable to diverse output requirements. papermill, in turn, parameterizes notebooks using cell tags, command-line arguments, and external files such as YAML, and executes them programmatically through both a command-line interface and a Python API 3.
🌟 2. In-depth Exploration of nbconvert
- 2.1 Functionalities:
- Conversion Options and Supported Formats:
- HTML conversion with nbconvert creates static web pages from Jupyter notebooks, with several built-in templates to control the visual appearance 6. The lab template, which is the default, produces output that closely resembles the JupyterLab interface. The classic template provides a look and feel reminiscent of the traditional Jupyter Notebook interface. For users needing a more minimal structure, the basic template offers a stripped-down HTML output suitable for embedding in other web pages 6. Additionally, nbconvert can embed images directly into the HTML file as base64-encoded URLs, which is useful for creating self-contained web pages 6.
- LaTeX conversion generates .tex files, the source files for LaTeX documents 1. These files can then be processed with a LaTeX engine to produce high-quality documents, often used for academic publications and reports with complex formatting requirements 6. nbconvert offers templates for LaTeX output as well: article, which is the default and is derived from Sphinx’s how-to template; report, which includes features like a table of contents and chapters; and basic, which provides a minimal LaTeX structure intended as a starting point for custom templates 6.
- PDF conversion transforms Jupyter notebooks into the widely accepted Portable Document Format 1. This is usually achieved by first converting the notebook to LaTeX with nbconvert and then using a LaTeX engine such as XeTeX to compile the .tex file into a PDF 6. nbconvert also supports an alternative method called WebPDF, which uses a headless browser, specifically Chromium, through the Playwright library 6. This approach renders the notebook in a browser-like environment and then saves it as a PDF. LaTeX-based conversion is often preferred for documents with complex formatting and mathematical content, while WebPDF can be more suitable for notebooks containing dynamic JavaScript-rendered elements 11.
- Markdown conversion transforms Jupyter notebooks into plain Markdown files 1. Markdown is a lightweight markup language that uses plain text formatting syntax, making it easy to read and write. It is widely supported across platforms and is commonly used for documentation, README files, and content for websites and applications that understand Markdown 1.
- nbconvert also supports conversion to reStructuredText (RST) format 1. RST is the default markup language used by Sphinx, a popular tool for generating technical documentation, particularly within the Python ecosystem 12. By converting notebooks to RST, users can include their notebook-based analyses, explanations, and code examples in larger documentation projects built with Sphinx.
- For presentation purposes, nbconvert can generate interactive HTML-based slideshows using the Reveal.js JavaScript framework 1. The slideshow is customized through cell metadata, which lets users define slide breaks, sub-slides, fragments, and speaker notes 6.
- nbconvert can also transform a Jupyter notebook into an executable script in the language of the notebook’s kernel, such as Python 1. This process extracts the code cells from the notebook and saves them as a .py file or a similar script format, making it possible to run the code outside of the Jupyter environment 6. This is particularly useful for automation or for deploying notebook-based code in environments where a full Jupyter installation might not be available.
- Beyond converting to external formats, nbconvert can operate on notebook files themselves 6. It can run its preprocessors on a notebook, for example to execute the code in place or to strip out outputs. It can also convert notebooks between different versions of the .ipynb format, which might be necessary for compatibility with older systems or specific requirements 6.
- Programmatic Conversion: nbconvert can be utilized as a Python library within other Python projects 1. This allows developers to integrate notebook conversion functionalities into their custom applications or automated workflows. For example, the “Download as” feature within the Jupyter Notebook web application itself uses nbconvert in this manner to provide users with various conversion options 1.
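To make "stripping out outputs" concrete, here is a minimal sketch that works directly on the notebook JSON with the standard library only. It is an illustration of the idea, not nbconvert's implementation (nbconvert ships its own preprocessor for this, ClearOutputPreprocessor); the function name strip_outputs and the two-cell example notebook are hypothetical.

```python
import copy
import json


def strip_outputs(notebook: dict) -> dict:
    """Return a copy of a version 4 notebook dict with code-cell outputs removed."""
    cleaned = copy.deepcopy(notebook)
    for cell in cleaned.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []            # drop rendered results
            cell["execution_count"] = None  # forget the run counter
    return cleaned


# Hypothetical two-cell notebook used only for illustration.
example_notebook = {
    "nbformat": 4,
    "nbformat_minor": 5,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": "# Title"},
        {
            "cell_type": "code",
            "metadata": {},
            "execution_count": 3,
            "source": "print(1 + 1)",
            "outputs": [{"output_type": "stream", "name": "stdout", "text": "2\n"}],
        },
    ],
}

stripped = strip_outputs(example_notebook)
print(json.dumps(stripped["cells"][1], indent=2))
```

The input notebook is left untouched because the function works on a deep copy, which mirrors how a preprocessing step can be applied without modifying the file on disk.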
- 2.2 Installation and Setup:
- nbconvert can be installed using either pip, the Python package installer, via the command pip install nbconvert, or using conda, the package and environment management system, with the command conda install nbconvert 10. Both methods provide a straightforward way to add nbconvert to your Python environment.
- To unlock its full potential, nbconvert relies on several external dependencies 10. Pandoc is required for conversions involving Markdown and reStructuredText, and it also plays a role in some LaTeX and PDF conversion pathways 1. A TeX distribution that includes the XeLaTeX engine is necessary for converting notebooks to PDF with the --to pdf option, as nbconvert uses XeTeX for its Unicode and font support 1. The --to webpdf option, which converts notebooks to PDF via a headless browser, requires the Playwright library, which in turn automates the Chromium browser 1.
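Before attempting a conversion, it can help to check which of these dependencies are actually present. The following is a small sketch using only the standard library; the function name check_nbconvert_dependencies is hypothetical, and the check only looks for executables on PATH and importable packages, so it does not verify versions or a complete TeX installation.

```python
import importlib.util
import shutil


def check_nbconvert_dependencies() -> dict:
    """Report which tools named in the text are discoverable in this environment."""
    return {
        # Python packages: present if an import spec can be found.
        "nbconvert": importlib.util.find_spec("nbconvert") is not None,
        "playwright": importlib.util.find_spec("playwright") is not None,
        # External programs: present if an executable is on PATH.
        "pandoc": shutil.which("pandoc") is not None,
        "xelatex": shutil.which("xelatex") is not None,
    }


status = check_nbconvert_dependencies()
for name, found in status.items():
    print(f"{name:<10} {'found' if found else 'missing'}")
```

A missing xelatex entry here corresponds to the situation where --to pdf fails, and a missing playwright entry to the situation where --to webpdf cannot start.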
- 2.3 Architecture and Internal Mechanisms:
- The process of converting a Jupyter notebook with nbconvert follows a structured pipeline 1. This pipeline begins with the loading of the notebook file, which is typically a JSON document. The content of the notebook then undergoes a series of transformations managed by preprocessors. Following this, the transformed content is rendered into the desired output format using templates and filters. The resulting output is then written to a file or stream by writers. Finally, postprocessors may be applied to perform additional actions on the output 15.
- The primary component in nbconvert’s architecture is the Exporter 15. For each supported output format, there is a specific exporter class responsible for orchestrating the conversion. For instance, the HTMLExporter handles conversion to HTML, the LatexExporter to LaTeX, and the PDFExporter to PDF. Exporters load the notebook, select and configure the appropriate Preprocessors, Filters, and Templates, and return the converted data and any associated resources.
- Preprocessors are objects that transform the notebook’s content before it is rendered 1. These transformations can include executing code cells (using ExecutePreprocessor), stripping output from cells, or removing specific cells based on metadata tags 15.
- Most exporters, particularly those generating formats other than notebooks, use Templates based on the Jinja templating engine 6. Templates define the structure and styling of the output format and often operate at the level of individual notebook cells, accessing cell metadata and using filters to transform content.
- Filters are Python callables that take an input, typically text, and produce a transformed output 15. They are used within Jinja templates to transform specific parts of the notebook content, such as syntax highlighting of code or conversion of ANSI escape codes to HTML.
- Writers take the final rendered output and write it to a destination 15. nbconvert provides a FilesWriter for writing to the filesystem and a StdoutWriter for writing to the standard output.
- Finally, Postprocessors are components that run after the output has been written 15.
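The flow of preprocess, render, and write can be sketched with a toy pipeline. Everything below is a simplified stand-in written for illustration: the names drop_tagged_cells, indent_filter, render_markdown, export, and MemoryWriter are hypothetical and are not nbconvert classes, and the "template" is a plain Python function rather than a Jinja template.

```python
from typing import Any, Callable, Dict, List, Tuple

Notebook = Dict[str, Any]
Preprocessor = Callable[[Notebook], Notebook]


def drop_tagged_cells(tag: str) -> Preprocessor:
    """Preprocessor stand-in: remove every cell carrying the given tag."""
    def preprocess(nb: Notebook) -> Notebook:
        kept = [c for c in nb["cells"] if tag not in c.get("metadata", {}).get("tags", [])]
        return {**nb, "cells": kept}
    return preprocess


def indent_filter(text: str) -> str:
    """Filter stand-in: indent code so it renders as a Markdown code block."""
    return "\n".join("    " + line for line in text.splitlines())


def render_markdown(nb: Notebook, filters: Dict[str, Callable[[str], str]]) -> str:
    """Template stand-in: render the notebook cell by cell."""
    parts = []
    for cell in nb["cells"]:
        if cell["cell_type"] == "code":
            parts.append(filters["indent"](cell["source"]))
        else:
            parts.append(cell["source"])
    return "\n\n".join(parts) + "\n"


def export(nb: Notebook, preprocessors: List[Preprocessor]) -> Tuple[str, dict]:
    """Exporter stand-in: run preprocessors, render, return (body, resources)."""
    for preprocess in preprocessors:
        nb = preprocess(nb)
    body = render_markdown(nb, {"indent": indent_filter})
    return body, {"output_extension": ".md"}


class MemoryWriter:
    """Writer stand-in: keep output in a dict instead of touching the filesystem."""
    def __init__(self) -> None:
        self.files: Dict[str, str] = {}

    def write(self, body: str, resources: dict, notebook_name: str) -> str:
        path = notebook_name + resources["output_extension"]
        self.files[path] = body
        return path


notebook = {
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": "# Report"},
        {"cell_type": "code", "metadata": {}, "source": "x = 1\nprint(x)"},
        {"cell_type": "code", "metadata": {"tags": ["remove"]}, "source": "debug_dump()"},
    ]
}

body, resources = export(notebook, [drop_tagged_cells("remove")])
writer = MemoryWriter()
written_path = writer.write(body, resources, "report")
print(written_path)
print(body)
```

The (body, resources) pair returned by export mirrors the shape of what an exporter hands to a writer, which is why the writer only needs the rendered text and an output extension.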
- 2.4 Code Implementation Examples:
- Command-line Usage:
- To convert a Jupyter notebook named mynotebook.ipynb to an HTML file, the command is $ jupyter nbconvert --to html mynotebook.ipynb 2. This generates a file named mynotebook.html in the same directory as the original notebook.
- For converting to PDF, assuming the necessary TeX installation is in place, the command is $ jupyter nbconvert --to pdf mynotebook.ipynb 2. This produces a PDF file named mynotebook.pdf.
- To convert the notebook to a Markdown file, the command is $ jupyter nbconvert --to markdown mynotebook.ipynb 2, resulting in mynotebook.md.
- Generating an executable Python script from the notebook is done with the command $ jupyter nbconvert --to script mynotebook.ipynb 2, which creates mynotebook.py.
- To create a Reveal.js slideshow, you would use $ jupyter nbconvert --to slides mynotebook.ipynb --reveal-prefix reveal.js 6. The --reveal-prefix flag specifies the location of the Reveal.js library.
- Executing a notebook and saving the output back to the same file can be achieved using $ jupyter nbconvert --to notebook --execute mynotebook.ipynb --inplace 6.
- To use a specific template, such as the classic template for HTML conversion, the command would be $ jupyter nbconvert --to html --template classic mynotebook.ipynb 6.
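When these commands are launched from a script, it is convenient to assemble the argument list in one place. The helper below is a hypothetical sketch (nbconvert_command is not part of nbconvert); it only builds the argv list from the flags shown above and does not run anything.

```python
import shlex
from typing import List, Optional, Sequence


def nbconvert_command(
    notebook: str,
    to: str,
    template: Optional[str] = None,
    execute: bool = False,
    inplace: bool = False,
    extra: Sequence[str] = (),
) -> List[str]:
    """Build a `jupyter nbconvert` argument list from the options used in the text."""
    cmd = ["jupyter", "nbconvert", "--to", to]
    if template:
        cmd += ["--template", template]
    if execute:
        cmd.append("--execute")
    if inplace:
        cmd.append("--inplace")
    cmd += list(extra)       # e.g. ["--reveal-prefix", "reveal.js"]
    cmd.append(notebook)
    return cmd


html_cmd = nbconvert_command("mynotebook.ipynb", "html", template="classic")
run_cmd = nbconvert_command("mynotebook.ipynb", "notebook", execute=True, inplace=True)
print(shlex.join(html_cmd))
print(shlex.join(run_cmd))
```

To actually run a command, the list can be passed to subprocess.run(cmd, check=True); passing a list rather than a single string avoids shell quoting problems with notebook names that contain spaces.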
- Python API Usage:
- A basic example of using the Python API to convert a notebook to HTML imports the necessary modules, reads the notebook file, instantiates the HTMLExporter, processes the notebook, and writes the resulting HTML to a file:

    import nbformat
    from nbconvert import HTMLExporter

    # Read the notebook as a version 4 notebook node.
    with open("mynotebook.ipynb", encoding="utf-8") as f:
        notebook = nbformat.read(f, as_version=4)

    # Render the notebook to an HTML string plus a resources dict.
    html_exporter = HTMLExporter()
    (body, resources) = html_exporter.from_notebook_node(notebook)

    with open("mynotebook.html", "w", encoding="utf-8") as f:
        f.write(body)

  This illustrates how to control the conversion process programmatically, which offers more flexibility for integration into larger Python applications.
- Programmatic execution of a notebook using the API involves nbformat and ExecutePreprocessor:

    import nbformat
    from nbconvert.preprocessors import ExecutePreprocessor

    with open("mynotebook.ipynb", encoding="utf-8") as f:
        nb = nbformat.read(f, as_version=4)

    # Run every code cell with the python3 kernel, allowing 600 seconds per cell.
    ep = ExecutePreprocessor(timeout=600, kernel_name="python3")
    # The resources dict sets the working directory used during execution.
    ep.preprocess(nb, {"metadata": {"path": "."}})

    with open("executed_notebook.ipynb", "w", encoding="utf-8") as f:
        nbformat.write(nb, f)

  This demonstrates how to execute the code cells within a notebook programmatically and save the executed notebook.
- Customizing Templates:
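- To customize the appearance of the output, users can create their own templates. For example, to give all markdown cells a yellow background in the HTML output, a custom template file named make_it_pop.tpl could be created that extends the full template and overrides its markdown_cell block 13. The template begins:

    {% extends 'full.tpl' %}
    {% block markdown_cell %}

  Note that full.tpl and the .tpl extension belong to nbconvert releases before 6.0; later releases organize Jinja templates in template directories with the .j2 extension.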
- 2.5 Developer Insights and Best Practices:
- Developers often encounter issues with PDF conversion, primarily due to missing or improperly configured dependencies such as Pandoc and TeX (especially XeLaTeX) 10. The error message “xelatex not found on PATH” is a common indicator of such problems 21. Ensuring these tools are correctly installed and accessible is crucial for successful PDF generation.
- The WebPDF exporter, which relies on a headless browser, has been known to hang indefinitely, particularly on Linux systems 25. This is frequently attributed to sandbox restrictions within Chromium. The reported workaround is to run nbconvert with the flags --allow-chromium-download --no-sandbox 25. In nbconvert’s own command-line options the sandbox switch appears as --disable-chromium-sandbox, so it is worth checking jupyter nbconvert --help for the installed version. Users should be aware of the security implications of disabling the sandbox.
- Notebooks with intricate layouts or complex styling might not always translate perfectly to all output formats, potentially leading to rendering discrepancies 11. Users with highly stylized notebooks might need to experiment with different templates or simplify their notebook design to achieve the desired output.
- Exporting notebooks containing embedded images or SVG graphics can sometimes result in errors or incorrect rendering in formats like LaTeX and PDF 21. This can be due to format compatibility issues or limitations in the underlying conversion tools. For SVG images, ensuring the necessary LaTeX packages are installed or considering conversion to a more universally supported format like PNG might be necessary 26.
- Jupyter Widgets might not always render correctly when notebooks are executed programmatically using nbconvert --execute 27. This is often related to the handling of widget state during the execution process. Developers might need to explore specific preprocessors or template customizations to address this.
- The documentation for custom templates can become outdated, leading to potential issues when using templates created for older versions of nbconvert 21. It is advisable to consult the documentation for the specific version of nbconvert being used and to be prepared to update custom templates as needed.
- To mitigate these challenges, ensure all dependencies are correctly installed and configured according to the official documentation 10. Upgrading nbconvert itself can sometimes resolve issues 24. For WebPDF issues on Linux, disabling the Chromium sandbox (with caution) might help 25. Experimenting with different built-in templates or creating custom ones can address complex layout or styling needs 13. For image and SVG issues, verifying the LaTeX configuration or using alternative image formats might be necessary 26. When working with widgets, explore ways of saving and restoring widget state during programmatic execution 28. Finally, always refer to the documentation relevant to your nbconvert version when using or creating custom templates 21.
- Best practices for using nbconvert include keeping notebooks clean and well-structured so that they convert more easily, testing conversions to all intended output formats early in the development process, and keeping notebooks and custom templates under version control. When encountering issues, the official documentation and community forums can provide valuable assistance 2.
🌟 3. In-depth Exploration of papermill
- 3.1 Functionalities:
- Notebook Parameterization:
- The fundamental way to parameterize a Jupyter notebook with papermill is to designate a specific code cell with the tag parameters 3. This cell typically contains variable assignments that represent the default values for the parameters a user intends to override when executing the notebook with papermill. The tag can be added through the Jupyter Notebook or JupyterLab interface, which stores it in the cell’s metadata.
- papermill provides several ways to override these default values at execution time. One common method is command-line arguments. The -p or --parameters flag takes a parameter name and a value, and papermill attempts to infer the data type of the value (for example, whether it looks like a number or a boolean). Alternatively, the -r or --parameters_raw flag treats the provided value as a raw string, regardless of its content 3.
- For a larger number of parameters, or parameters with complex data structures, papermill supports reading parameters from external YAML files using the -f or --parameters_file flag. Users can also provide parameters directly as YAML-formatted strings via the -y or --parameters_yaml flag. Additionally, for situations where the data needs to be encoded, papermill accepts parameters as base64-encoded YAML strings using the -b or --parameters_base64 flag 4.
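The base64 variant is easiest to understand by producing the encoded value yourself. The sketch below uses only the standard library; encode_parameters_yaml is a hypothetical helper name, and the YAML text is written by hand rather than generated by a YAML library.

```python
import base64


def encode_parameters_yaml(yaml_text: str) -> str:
    """Base64-encode YAML text so it can be passed as a single shell-safe token."""
    return base64.b64encode(yaml_text.encode("utf-8")).decode("ascii")


# Two parameters, one per line, as they would appear in a YAML file.
yaml_text = "alpha: 0.6\nl1_ratio: 0.1\n"
encoded = encode_parameters_yaml(yaml_text)
print(encoded)

# Argument list for the -b flag described above (built here, not executed).
command = ["papermill", "input.ipynb", "output.ipynb", "-b", encoded]
print(" ".join(command))
```

Encoding sidesteps shell quoting and newline handling entirely, which is the main reason to prefer -b over -y when parameters are generated by another program.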
- Notebook Execution:
- papermill can execute Jupyter notebooks directly from the command line. The basic command consists of papermill followed by the path to the input notebook, the desired path for the output notebook, and any parameter overrides using the flags mentioned above 3. This command-line interface provides a straightforward way to automate notebook execution from scripts or other command-line tools.
- For deeper integration into Python-based workflows, papermill offers a Python API. The core function for programmatic execution is papermill.execute_notebook(), which takes the path to the input notebook, the path where the executed notebook should be saved, and an optional dictionary containing the parameters to be passed to the notebook 4. This API allows developers to embed notebook execution logic within their Python applications and scripts.
- papermill is designed to work with notebooks stored in various locations 4. It supports local file paths, HTTP and HTTPS URLs for notebooks hosted on web servers, and cloud storage services such as Amazon S3, Azure DataLake Store and Blob Store, and Google Cloud Storage. This broad support for different input and output path handlers makes papermill adaptable to different deployment environments and data storage strategies.
- Injected-parameters Cell: A key aspect of how papermill handles parameterization is the insertion of a new code cell tagged injected-parameters 4. This cell is added automatically, immediately after the cell tagged parameters. It contains Python code that assigns the parameter values provided at execution time to the corresponding variables. This mechanism provides a clear and auditable record within the executed notebook of the exact parameters used for that specific run.
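The effect of the injected-parameters cell can be imitated on plain notebook JSON. This is a simplified sketch for illustration, not papermill's own code: the function inject_parameters is hypothetical, values are rendered with repr(), and when no cell is tagged parameters the new cell is placed at the top of the notebook.

```python
import copy


def inject_parameters(notebook: dict, parameters: dict) -> dict:
    """Return a copy of the notebook with an injected-parameters cell added."""
    nb = copy.deepcopy(notebook)
    lines = ["# Parameters"] + [f"{name} = {value!r}" for name, value in parameters.items()]
    injected = {
        "cell_type": "code",
        "metadata": {"tags": ["injected-parameters"]},
        "execution_count": None,
        "outputs": [],
        "source": "\n".join(lines),
    }
    insert_at = 0  # fallback: top of the notebook
    for index, cell in enumerate(nb["cells"]):
        if "parameters" in cell.get("metadata", {}).get("tags", []):
            insert_at = index + 1  # directly after the defaults cell
            break
    nb["cells"].insert(insert_at, injected)
    return nb


template = {
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": "# Model run"},
        {"cell_type": "code", "metadata": {"tags": ["parameters"]},
         "source": "alpha = 0.5\nregion = 'US'", "outputs": [], "execution_count": None},
        {"cell_type": "code", "metadata": {},
         "source": "print(alpha, region)", "outputs": [], "execution_count": None},
    ]
}

executed_input = inject_parameters(template, {"alpha": 0.6, "region": "EU"})
print(executed_input["cells"][2]["source"])
```

Because the injected cell runs after the defaults cell, its assignments overwrite the defaults, and any parameter that is not overridden simply keeps its default value.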
- 3.2 Installation and Setup:
- Installing papermill is a simple process using pip, the standard Python package installer. The command pip install papermill downloads and installs the latest stable version of papermill and its core dependencies 3.
- papermill offers optional dependencies that provide support for various input/output operations and other features 4. These can be installed using “extras” with pip. For example, pip install papermill[all] installs all optional dependencies, including those for cloud storage (such as S3, Azure, and GCS).
- papermill has specific requirements regarding the Python version it supports 3. Depending on the papermill release, the minimum is Python 3.7 or Python 3.8. Users should check the requirement for the release they install to avoid compatibility issues.
- 3.3 Architecture and Internal Mechanisms:
- The core of papermill’s functionality lies in its parameter injection mechanism 4. When papermill executes a notebook, it first scans the notebook for a cell tagged parameters. This tag signals that the cell contains the default values for parameters that might be overridden at runtime. Once this cell is identified (or, if no such cell exists, the top of the notebook is used), papermill inserts a new code cell immediately after it. The inserted cell is automatically tagged injected-parameters and contains Python code that assigns the parameter values provided during execution (via the command line or the Python API) to variables with the same names as those in the parameters cell.
- papermill employs an execution engine that operates without a full Jupyter Notebook server or a graphical user interface 37. When papermill is invoked to execute a notebook, it launches a Jupyter kernel as a separate subprocess in the background. papermill then acts as a client, communicating with this kernel to execute the notebook’s cells sequentially.
- papermill is built upon a component-based architecture that promotes extensibility and flexibility 37. This design allows different parts of the library, such as the handlers for input and output operations (which manage how papermill interacts with various file systems and storage services), to be overridden or extended. Users can develop and register their own custom handlers to add support for new types of storage or to customize the way papermill interacts with existing ones. This registration mechanism, often facilitated through Python’s setuptools entry points, allows papermill to be adapted and expanded without modifications to its core code.
- 3.4 Code Implementation Examples:
- Command-line Parameterization and Execution:
- A basic example of executing a notebook named input.ipynb, saving the output to output.ipynb, and setting the parameters alpha to 0.6 and l1_ratio to 0.1 is: $ papermill input.ipynb output.ipynb -p alpha 0.6 -p l1_ratio 0.1 3.
- To pass a parameter version as a raw string with the value "1.0", the command would be: $ papermill input.ipynb output.ipynb -r version "1.0" 4.
- If parameters are defined in a YAML file named parameters.yaml, you can execute the notebook with these parameters using: $ papermill input.ipynb output.ipynb -f parameters.yaml 4.
- Parameters can also be passed directly as a YAML string on the command line: $ papermill input.ipynb output.ipynb -y "alpha: 0.6\nl1_ratio: 0.1" 4. The two entries must be separated by a real newline. Most shells do not expand \n inside ordinary double quotes, so in bash either write the string over two lines, use $'alpha: 0.6\nl1_ratio: 0.1', or pass a YAML flow mapping such as "{alpha: 0.6, l1_ratio: 0.1}".
- To execute a notebook located in an Amazon S3 bucket, saving the output back to S3, and setting a parameter param to value, the command might look like: $ papermill s3://my-bucket/input.ipynb s3://my-bucket/output.ipynb -p param value (ensure AWS credentials are properly configured) 4.
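The command lines above follow one pattern: input path, output path, then parameter flags. A hypothetical helper (papermill_command is not part of papermill) can assemble that pattern as an argument list; it covers only the -p, -r, and -f flags and does not execute anything.

```python
from typing import Dict, List, Optional


def papermill_command(
    input_path: str,
    output_path: str,
    parameters: Optional[Dict[str, object]] = None,
    raw_parameters: Optional[Dict[str, str]] = None,
    parameters_file: Optional[str] = None,
) -> List[str]:
    """Build a papermill argument list from the flags described in the text."""
    cmd = ["papermill", input_path, output_path]
    for name, value in (parameters or {}).items():
        cmd += ["-p", name, str(value)]   # type is inferred by papermill
    for name, value in (raw_parameters or {}).items():
        cmd += ["-r", name, value]        # always treated as a string
    if parameters_file:
        cmd += ["-f", parameters_file]
    return cmd


basic = papermill_command("input.ipynb", "output.ipynb",
                          parameters={"alpha": 0.6, "l1_ratio": 0.1})
raw = papermill_command("input.ipynb", "output.ipynb",
                        raw_parameters={"version": "1.0"})
print(" ".join(basic))
print(" ".join(raw))
```

Passing the resulting list to subprocess.run(cmd, check=True) would run the notebook; keeping the values as separate list items means no shell quoting is needed for parameter values.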
- Python API Usage:
- To execute a notebook programmatically with parameters, import the papermill library and use the execute_notebook function:

    import papermill as pm

    # Run input.ipynb and save the executed copy as output.ipynb.
    pm.execute_notebook(
        "input.ipynb",
        "output.ipynb",
        parameters={"alpha": 0.6, "ratio": 0.1},
    )

  This code snippet executes input.ipynb, saves the result to output.ipynb, and sets the parameters alpha and ratio to the specified values.
- You can also specify a different output path:

    import papermill as pm

    pm.execute_notebook(
        "input.ipynb",
        "outputs/executed_notebook.ipynb",
        parameters={"data_path": "/path/to/data.csv"},
    )

  This example saves the executed notebook to a subdirectory named outputs.
- Parameterizing a Notebook:
- To parameterize a notebook, open it in JupyterLab or Jupyter Notebook and select the cell that defines the default parameter values. In the classic Jupyter Notebook interface, navigate to View -> Cell Toolbar -> Tags and add the tag parameters to the cell; in JupyterLab, add the tag from the cell’s property inspector in the right sidebar 5. A typical parameters-tagged cell might contain code like:

    # Parameters
    region = "US"
    device_type = "PC"

  These are the default values that can be overridden when the notebook is executed using papermill.
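The tag can also be added without opening an editor, because it is stored in the cell metadata of the .ipynb JSON. The following sketch edits an in-memory notebook dict with the standard library; tag_parameters_cell is a hypothetical helper, and reading or writing the actual file with json.load and json.dump is left out.

```python
import json


def tag_parameters_cell(notebook: dict, cell_index: int) -> dict:
    """Add the 'parameters' tag to the code cell at cell_index (in place)."""
    cell = notebook["cells"][cell_index]
    if cell.get("cell_type") != "code":
        raise ValueError("only code cells can hold parameter defaults")
    tags = cell.setdefault("metadata", {}).setdefault("tags", [])
    if "parameters" not in tags:  # keep the operation idempotent
        tags.append("parameters")
    return notebook


# Hypothetical notebook whose first cell holds the defaults shown above.
nb = {
    "nbformat": 4,
    "nbformat_minor": 5,
    "metadata": {},
    "cells": [
        {"cell_type": "code", "metadata": {}, "execution_count": None, "outputs": [],
         "source": '# Parameters\nregion = "US"\ndevice_type = "PC"'},
    ],
}

tag_parameters_cell(nb, 0)
print(json.dumps(nb["cells"][0]["metadata"]))
```

Calling the helper a second time leaves the tag list unchanged, so it is safe to run as part of a repeated preparation step.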
- 3.5 Developer Insights and Best Practices:
- A common challenge developers face with papermill is ensuring that the correct Jupyter kernel is available and selected for notebook execution 39. Especially in environments with multiple kernels, specifying the right one is crucial to avoid errors like “No such kernel named python3” 39. It is advisable to set the kernel explicitly in the notebook’s metadata, to select it at run time with papermill’s -k / --kernel option (kernel_name in the Python API), or to ensure the execution environment has the expected kernel.
- Handling errors during automated notebook execution is also a key consideration 19. By default, papermill halts execution and raises an exception if any cell in the notebook fails. Implementing error handling within the notebook using try-except blocks, or using papermill’s logging capabilities, can help manage these situations gracefully.
- Managing sensitive information, such as API keys or passwords, as papermill parameters poses a security risk 41. Passing such secrets as plain text, especially via command-line arguments, is not recommended. Developers should use more secure methods such as environment variables or dedicated secret management solutions.
- While there have been discussions about conditionally skipping cell execution in papermill based on tags 42, this is not a built-in feature. Developers seeking this functionality might need to implement custom logic within their notebooks, perhaps using parameters to control the execution of specific code blocks.
- Ensuring that all necessary Python packages and dependencies are available in the execution environment is vital for papermill to run notebooks without issues 39. Virtual environments or containerization can help create consistent and reproducible environments.
- Parameter parsing in papermill might have limitations in certain cases, such as strings containing special characters 44. If unexpected behavior occurs, trying the --parameters_raw option or encoding parameters in YAML format might resolve the issue.
- To address these challenges, developers should explicitly manage the Jupyter kernel, implement error handling in their notebooks, avoid passing secrets as plain parameters, and carefully manage their execution environments. When parsing issues arise, experimenting with different parameter passing methods is recommended.
- papermill also integrates with Apache Airflow through the PapermillOperator 33.
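For the point about secrets, one option is to have the notebook read credentials from the environment instead of receiving them as parameters. The sketch below is an assumption-laden illustration: require_secret is a hypothetical helper, REPORT_API_KEY is a made-up variable name, and a plain dict stands in for the real environment so the example has no side effects.

```python
import os
from typing import Mapping, Optional


def require_secret(name: str, environ: Optional[Mapping[str, str]] = None) -> str:
    """Fetch a secret from the environment and fail loudly if it is absent."""
    env = os.environ if environ is None else environ
    value = env.get(name)
    if not value:
        raise RuntimeError(f"environment variable {name!r} is not set")
    return value


# Stand-in environment for the example; a real notebook would omit `environ`.
fake_environment = {"REPORT_API_KEY": "example-token"}
api_key = require_secret("REPORT_API_KEY", fake_environment)

# Only non-sensitive values go into the papermill parameters.
parameters = {"region": "US", "device_type": "PC"}
print(sorted(parameters))
```

With this split, the secret never appears in the command line, in the injected-parameters cell, or in the saved output notebook, since only the parameters dict is recorded there.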
🌟 4. Advanced Use Cases and Combined Implementation
- Automated Report Generation: An advanced application of papermill and nbconvert is automating the generation of reports. A template Jupyter notebook is created, containing the core logic for data retrieval, analysis, and visualization. Dynamic elements within the report, such as specific dates or product categories, are defined as parameters using the parameters tag. papermill then executes this template multiple times, each time with a different set of parameters, generating a series of executed notebooks. Finally, nbconvert transforms these executed notebooks into final report formats like PDF or HTML for distribution. For instance, a daily sales report template could be executed for each sales region using papermill, with the region and date as parameters, and then nbconvert could convert each executed notebook into a regional PDF report.
- Machine Learning Workflow Automation: In machine learning, tuning hyperparameters often requires running numerous experiments with different parameter values. papermill can automate this by executing an experiment notebook multiple times, each with a different hyperparameter configuration. After these executions, nbconvert can be used to generate a summary report of the experiment results. This might involve another Jupyter notebook that analyzes the output notebooks from papermill, extracts performance metrics, and is then converted with nbconvert into an HTML or PDF report that visualizes the results and highlights the best-performing configurations.
- Creating Dynamic Documentation: papermill and nbconvert can also be used to create dynamic documentation. Documentation content, such as code examples or API usage instructions, can be maintained in executable Jupyter notebooks. papermill can execute these notebooks, possibly with parameters controlling the version of the software being documented or specific features. The executed notebooks, containing up-to-date and verified content, can then be converted to the desired documentation format using nbconvert. For example, code examples for a software library could be stored in notebooks and executed with papermill for a specific library version.
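The report-generation workflow can be planned as pairs of commands: one papermill run per parameter combination, followed by one nbconvert conversion of the executed notebook. The sketch below only builds the command lists; plan_reports, the file naming scheme, and the parameter names region and report_date are hypothetical choices for illustration.

```python
import itertools
from pathlib import PurePosixPath
from typing import List, Sequence, Tuple

Job = Tuple[List[str], List[str]]


def plan_reports(
    template: str,
    regions: Sequence[str],
    dates: Sequence[str],
    output_dir: str = "reports",
    to: str = "html",
) -> List[Job]:
    """Return (papermill command, nbconvert command) pairs for every region/date."""
    jobs = []
    for region, date in itertools.product(regions, dates):
        executed = str(PurePosixPath(output_dir) / f"sales_{region}_{date}.ipynb")
        # -r keeps both values as plain strings, so no type inference is applied.
        run = ["papermill", template, executed,
               "-r", "region", region, "-r", "report_date", date]
        convert = ["jupyter", "nbconvert", "--to", to, executed]
        jobs.append((run, convert))
    return jobs


jobs = plan_reports("sales_template.ipynb", ["US", "EU"], ["2025-03-01", "2025-03-02"])
for run, convert in jobs:
    print(" ".join(run))
    print(" ".join(convert))
```

Each pair can be executed in order with subprocess.run(cmd, check=True), so a failed notebook run stops before its conversion step. Setting to="pdf" would produce PDF reports, provided the TeX dependencies described earlier are installed.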
🌟 5. Conclusion and Recommendations
nbconvert and papermill are powerful tools that significantly enhance the utility of Jupyter notebooks. nbconvert provides a versatile solution for converting notebooks into a wide array of static formats, making notebook content accessible for sharing, publishing, and presenting. Its flexible architecture and template system allow for extensive customization of the output. papermill, in turn, enables the programmatic execution and parameterization of Jupyter notebooks, which is crucial for automating workflows, conducting systematic experiments, and building robust data pipelines. Combined, the two tools support advanced use cases such as automated report generation, machine learning experiments, and dynamic documentation. By understanding the functionalities, architecture, and best practices of both tools, developers and data scientists can improve their productivity and the reproducibility of their work.
For developers using nbconvert, it is recommended to consider the requirements of the target output format carefully and to experiment with the available templates and options to achieve the desired presentation. When working with PDF conversion, ensuring that all necessary dependencies like Pandoc and TeX are correctly installed is vital. For WebPDF, be aware of potential platform-specific issues and disable the Chromium sandbox only with caution on Linux. When using papermill, use the parameterization features to create reusable notebook templates, and choose the method for passing parameters based on the complexity and security requirements of the task. For both tools, the official documentation and the community are invaluable resources for staying informed about best practices and troubleshooting common issues.
⚡ Key Tables:
1. nbconvert Supported Output Formats and Use Cases:
Output Format | Description | Common Use Cases | Source (Works cited no.) |
---|---|---|---|
HTML | Static web page | Web sharing, embedding in blogs/websites | 1 |
LaTeX | Typesetting format | Academic publications, reports with complex mathematics | 1 |
PDF | Portable Document Format | Final reports, sharing documents with consistent formatting | 1 |
Markdown | Lightweight markup language | Documentation, content for Markdown-supporting platforms | 1 |
reStructuredText (RST) | Markup language used by Sphinx | Technical documentation generation with Sphinx | 1 |
Reveal.js Slideshow | Interactive HTML presentations | Presentations directly from notebooks | 1 |
Executable Script | Python or other language script | Automation, running notebook code outside Jupyter | 1 |
Notebook | Allows running preprocessors or converting to different notebook versions | Programmatic manipulation of notebooks | 6 |
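On the command line, each format in the table is selected with the --to option of jupyter nbconvert. The lookup below is a small sketch of that mapping; the dictionary name TO_VALUES, the helper convert_command, and the file name talk.ipynb are illustrative assumptions, while the --to values themselves are the names nbconvert uses for these exporters.

```python
import shlex

# Table label -> value accepted by `jupyter nbconvert --to`.
TO_VALUES = {
    "HTML": "html",
    "LaTeX": "latex",
    "PDF": "pdf",
    "Markdown": "markdown",
    "reStructuredText (RST)": "rst",
    "Reveal.js Slideshow": "slides",
    "Executable Script": "script",
    "Notebook": "notebook",
}


def convert_command(label: str, notebook: str) -> str:
    """Return the shell command that converts `notebook` to the given format."""
    return shlex.join(["jupyter", "nbconvert", "--to", TO_VALUES[label], notebook])


print(convert_command("Reveal.js Slideshow", "talk.ipynb"))
# prints: jupyter nbconvert --to slides talk.ipynb
```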
2. papermill Parameterization Methods:
Method | Description | Syntax/Example | Source (Works cited no.) |
---|---|---|---|
parameters tag | Designating a cell with the "parameters" tag in the notebook | Add "tags": ["parameters"] to the cell metadata or use the JupyterLab tag interface. | 3 |
-p / --parameters | Passing parameters via command-line arguments (attempts type inference) | $ papermill input.ipynb output.ipynb -p alpha 0.6 -p l1_ratio 0.1 | 3 |
-r / --parameters_raw | Passing parameters as raw strings via command-line arguments | $ papermill input.ipynb output.ipynb -r version "1.0" | 4 |
-f / --parameters_file | Providing parameters through a YAML file | $ papermill input.ipynb output.ipynb -f parameters.yaml (where parameters.yaml contains the parameter definitions) | 4 |
-y / --parameters_yaml | Passing parameters directly as a YAML string via command-line arguments | $ papermill input.ipynb output.ipynb -y "{alpha: 0.6, l1_ratio: 0.1}" | 4 |
-b / --parameters_base64 | Passing base64-encoded YAML parameters via command-line arguments | $ papermill input.ipynb output.ipynb -b "YWxwaGE6IDAuNgpsMV9yYXRpbzogMC4xCg==" | 4 |
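The base64 string in the last row is simply the YAML text from the -y example, written on two lines and then encoded. The sketch below shows one way to produce and inspect such a value with the Python standard library; the helper names encode_parameters and decode_parameters are hypothetical and not part of papermill.

```python
import base64


def encode_parameters(yaml_text: str) -> str:
    """Encode YAML text as base64, the form expected by papermill's -b option."""
    return base64.b64encode(yaml_text.encode("utf-8")).decode("ascii")


def decode_parameters(encoded: str) -> str:
    """Decode a base64 parameter string back to YAML text for inspection."""
    return base64.b64decode(encoded).decode("utf-8")


encoded = encode_parameters("alpha: 0.6\nl1_ratio: 0.1\n")
print(encoded)
# prints: YWxwaGE6IDAuNgpsMV9yYXRpbzogMC4xCg==
```

Encoding avoids shell quoting problems with multi-line YAML, but base64 is not encryption, so it should not be treated as a way to protect secrets.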
🔧 Works cited
1. nbconvert: Convert Notebooks to other formats — nbconvert 7.16.6 documentation, accessed on March 23, 2025, https://nbconvert.readthedocs.io/
2. jupyter/nbconvert: Jupyter Notebook Conversion - GitHub, accessed on March 23, 2025, https://github.com/jupyter/nbconvert
3. Home - papermill 2.4.0 documentation, accessed on March 23, 2025, https://papermill.readthedocs.io/
4. nteract/papermill: Parameterize, execute, and analyze notebooks - GitHub, accessed on March 23, 2025, https://github.com/nteract/papermill
5. Papermill - NERSC Documentation, accessed on March 23, 2025, https://docs.nersc.gov/jobs/workflow/papermill/
6. Using as a command line tool — nbconvert 7.16.6 documentation, accessed on March 23, 2025, https://nbconvert.readthedocs.io/en/latest/usage.html
7. Using as a command line tool — nbconvert 5.2.1 documentation - Read the Docs, accessed on March 23, 2025, https://nbconvert.readthedocs.io/en/5.2.1/usage.html
8. Parameterizing and automating Jupyter notebooks with papermill …, accessed on March 23, 2025, https://www.wrighters.io/parameters-jupyter-notebooks-with-papermill/
9. Parameterize Notebooks in Azure Data Studio With Papermill …, accessed on March 23, 2025, https://learn.microsoft.com/en-us/azure-data-studio/notebooks/parameterize-papermill
10. Installation — nbconvert 7.16.6 documentation, accessed on March 23, 2025, https://nbconvert.readthedocs.io/en/latest/install.html
11. Converting Jupyter notebooks to PDF - Ploomber, accessed on March 23, 2025, https://ploomber.io/blog/jupyter-notebook-convert/
12. Using as a command line tool — nbconvert 5.2.1 documentation, accessed on March 23, 2025, https://nbconvert.readthedocs.io/en/5.2.1/usage.html?highlight=re
13. Thomas Kluyver & Min Ragan Kelley - Customising nbconvert how to turn Jupyter notebooks into anythi - YouTube, accessed on March 23, 2025, https://www.youtube.com/watch?v=9d7ZQWUIPyw
14. nbconvert - Jupyter Tutorial 24.1.0, accessed on March 23, 2025, https://jupyter-tutorial.readthedocs.io/en/24.1.0/nbconvert.html
15. Architecture of nbconvert - Read the Docs, accessed on March 23, 2025, https://nbconvert.readthedocs.io/en/latest/architecture.html
16. nbconvert/docs/source/nbconvert_library.ipynb at main · jupyter …, accessed on March 23, 2025, https://github.com/jupyter/nbconvert/blob/main/docs/source/nbconvert_library.ipynb
17. Architecture of nbconvert — nbconvert 4.3.0 documentation - Read the Docs, accessed on March 23, 2025, https://nbconvert.readthedocs.io/en/4.3.0/architecture.html
18. The templating system of nbconvert 6 | by Sylvain Corlay | Jupyter Blog, accessed on March 23, 2025, https://blog.jupyter.org/the-templating-system-of-nbconvert-6-47ea781eacd2
19. Executing notebooks — nbconvert 7.16.6 documentation, accessed on March 23, 2025, https://nbconvert.readthedocs.io/en/latest/execute_api.html
20. Converting Jupyter Notebooks to PDF with nbconvert: A Step-by-Step Guide, accessed on March 23, 2025, https://tariqueakhtar-39220.medium.com/converting-jupyter-notebooks-to-pdf-with-nbconvert-a-step-by-step-guide-2948c2792fbc
21. Issues · jupyter/nbconvert - GitHub, accessed on March 23, 2025, https://github.com/jupyter/nbconvert/issues
22. Frequent ‘nbconvert’ Questions - Stack Overflow, accessed on March 23, 2025, https://stackoverflow.com/questions/tagged/nbconvert?tab=Frequent
23. Newest ‘nbconvert’ Questions - Stack Overflow, accessed on March 23, 2025, https://stackoverflow.com/questions/tagged/nbconvert
24. 500: internal server error with Jupyter Notebook (nbconvert updated) - Stack Overflow, accessed on March 23, 2025, https://stackoverflow.com/questions/75614143/500-internal-server-error-with-jupyter-notebook-nbconvert-updated
25. nbconvert hangs indefinitely when trying to use webpdf export on Linux #1834 - GitHub, accessed on March 23, 2025, https://github.com/jupyter/nbconvert/issues/1834
26. SVG in jupyter conversion with nbconvert fails, accessed on March 23, 2025, https://discourse.jupyter.org/t/svg-in-jupyter-conversion-with-nbconvert-fails/2724
27. Widgets not displaying properly in HTML · Issue #923 · jupyter/nbconvert - GitHub, accessed on March 23, 2025, https://github.com/jupyter/nbconvert/issues/923
28. Some widgets do not work with nbconvert --execute · Issue #2004 - GitHub, accessed on March 23, 2025, https://github.com/jupyter/nbconvert/issues/2004
29. Issues · jupyter/nbconvert-examples - GitHub, accessed on March 23, 2025, https://github.com/jupyter/nbconvert-examples/issues
30. Jupyter, which nbconvert template does what? - Stack Overflow, accessed on March 23, 2025, https://stackoverflow.com/questions/46448873/jupyter-which-nbconvert-template-does-what
31. papermill · PyPI, accessed on March 23, 2025, https://pypi.org/project/papermill/0.11.5/
32. Parameterize - papermill 2.4.0 documentation, accessed on March 23, 2025, https://papermill.readthedocs.io/en/latest/usage-parameterize.html
33. Papermill — apache-airflow-providers-papermill Documentation, accessed on March 23, 2025, https://airflow.apache.org/docs/apache-airflow-providers-papermill/stable/operators.html
34. Papermill — Airflow Documentation, accessed on March 23, 2025, https://airflow.apache.org/docs/apache-airflow/1.10.8/howto/operator/papermill.html
35. Matthew Seal: Data and ETL with Notebooks in Papermill | PyData …, accessed on March 23, 2025, https://www.youtube.com/watch?v=7ER9tqiNack
36. Matthew Seal - Programmatic Notebooks with papermill - PyCon …, accessed on March 23, 2025, https://www.youtube.com/watch?v=vBEEL274sco
37. Notebooks as Functions with Papermill | Netflix - YouTube, accessed on March 23, 2025, https://www.youtube.com/watch?v=3FmBJ847_y8
38. esds/posts/2022/batch-processing-notebooks-with-papermill.md at main - GitHub, accessed on March 23, 2025, https://github.com/NCAR/esds/blob/main//posts/2022/batch-processing-notebooks-with-papermill.md
39. Newest ‘papermill’ Questions - Stack Overflow, accessed on March 23, 2025, https://stackoverflow.com/questions/tagged/papermill
40. Recently Active ‘papermill’ Questions - Stack Overflow, accessed on March 23, 2025, https://stackoverflow.com/questions/tagged/papermill?sort=active
41. Handling secrets · Issue #271 · nteract/papermill - GitHub, accessed on March 23, 2025, https://github.com/nteract/papermill/issues/271
42. Papermill skip cells execution · Issue #494 - GitHub, accessed on March 23, 2025, https://github.com/nteract/papermill/issues/494
43. Jupyter Notebook Manifesto: Best practices that can improve the life of any developer using … - Google Cloud, accessed on March 23, 2025, https://cloud.google.com/blog/products/ai-machine-learning/best-practices-that-can-improve-the-life-of-any-developer-using-jupyter-notebooks
44. Papermill fails when injecting parameters · Issue #553 - GitHub, accessed on March 23, 2025, https://github.com/nteract/papermill/issues/553
45. apache-airflow-providers-papermill package, accessed on March 23, 2025, https://airflow.apache.org/docs/apache-airflow-providers-papermill/stable/index.html