Add automated notebook testing with Papermill #602
Conversation
Introduces a GitHub Actions workflow to run and validate Jupyter notebooks in CI using a new runner script. Adds scripts to mock PyMC sampling for faster execution, updates test dependencies to include papermill, and documents the notebook runner usage. Also updates the interrogate badge to reflect new coverage.
Pull request overview
This PR introduces automated testing for Jupyter notebooks in CI using Papermill. The implementation includes a runner script that mocks PyMC's MCMC sampling with faster prior predictive sampling to validate notebooks execute without errors.
Key changes:
- New notebook runner script with filtering capabilities for different notebook types
- Mock PyMC sampling implementation that replaces expensive MCMC with prior predictive sampling (10 draws); see the sketch after this list
- GitHub Actions workflow that runs notebooks in parallel across three categories (PyMC, sklearn, and other notebooks)
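A minimal sketch of the mocking approach (hypothetical wiring; the PR's `injected.py` is only partially visible in this review, so the names and the patching step are illustrative):

```python
# Illustrative sketch: replace pm.sample so notebook cells that call it get a
# fast prior-predictive InferenceData instead of running full MCMC.
import pymc as pm


def mock_sample(*args, **kwargs):
    """Stand-in for pm.sample that draws from the prior instead of running NUTS."""
    idata = pm.sample_prior_predictive(
        model=kwargs.get("model"),
        random_seed=kwargs.get("random_seed"),
        draws=10,
    )
    # Downstream notebook code expects a posterior group, so alias the prior draws.
    idata.add_groups(posterior=idata.prior)
    return idata


# Patch before any notebook code runs (assumed to happen in the injected cell).
pm.sample = mock_sample
```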
Reviewed changes
Copilot reviewed 5 out of 6 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| scripts/run_notebooks/runner.py | Main script for executing notebooks with Papermill, includes filtering and logging |
| scripts/run_notebooks/injected.py | Mock implementation of pm.sample that uses prior predictive sampling |
| scripts/run_notebooks/README.md | Documentation for the notebook runner usage and CI integration |
| .github/workflows/test_notebook.yml | GitHub Actions workflow for parallel notebook testing |
| pyproject.toml | Adds papermill to test dependencies |
| docs/source/_static/interrogate_badge.svg | Updates documentation coverage badge from 96.3% to 96.0% |
```python
with NamedTemporaryFile(suffix=".ipynb", delete=False) as f:
    write_ipynb(nb, f.name)
    try:
        papermill.execute_notebook(
            input_path=f.name,
            output_path=None,  # Discard output
            kernel_name=KERNEL_NAME,
            progress_bar=True,
            cwd=notebook_path.parent,
        )
    except Exception as e:
        logging.error(f"Error running notebook: {notebook_path.name}")
        raise e
```
Copilot AI (Dec 20, 2025)
The temporary file created by NamedTemporaryFile is never deleted because delete=False is set, but there's no cleanup code to remove it after use. This will leave temporary files on the filesystem after each notebook run.
Suggested change:

```python
temp_path: Path | None = None
try:
    with NamedTemporaryFile(suffix=".ipynb", delete=False) as f:
        temp_path = Path(f.name)
        write_ipynb(nb, f.name)
        papermill.execute_notebook(
            input_path=str(temp_path),
            output_path=None,  # Discard output
            kernel_name=KERNEL_NAME,
            progress_bar=True,
            cwd=notebook_path.parent,
        )
except Exception as e:
    logging.error(f"Error running notebook: {notebook_path.name}")
    raise e
finally:
    if temp_path is not None:
        try:
            temp_path.unlink(missing_ok=True)
        except OSError as cleanup_error:
            logging.warning(
                "Failed to delete temporary notebook file %s: %s",
                temp_path,
                cleanup_error,
            )
```
```python
    except Exception as e:
        logging.error(f"Error running notebook: {notebook_path.name}")
        raise e
```
Copilot AI (Dec 20, 2025)
The error is caught, logged, and then re-raised, but the temporary file created above won't be cleaned up when an exception occurs. Consider using a try-finally block or pathlib's unlink() to ensure cleanup.
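As an alternative to manual unlink bookkeeping, the temporary notebook could live in a `TemporaryDirectory`, which cleans up after itself even on failure. This is only a sketch: `nb`, `notebook_path`, `write_ipynb`, and `KERNEL_NAME` are assumed to come from the runner script shown above.

```python
import logging
from pathlib import Path
from tempfile import TemporaryDirectory

import papermill

with TemporaryDirectory() as tmp_dir:
    # Write the modified notebook into a throwaway directory.
    temp_path = Path(tmp_dir) / notebook_path.name
    write_ipynb(nb, str(temp_path))
    try:
        papermill.execute_notebook(
            input_path=str(temp_path),
            output_path=None,  # Discard output
            kernel_name=KERNEL_NAME,
            progress_bar=True,
            cwd=notebook_path.parent,
        )
    except Exception:
        logging.error(f"Error running notebook: {notebook_path.name}")
        raise
# The directory and the copied notebook are removed automatically when the
# with block exits, whether or not papermill raised.
```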
```python
def mock_sample(*args, **kwargs):
    """Mock pm.sample using prior predictive sampling for speed."""
    random_seed = kwargs.get("random_seed")
    model = kwargs.get("model")
```
Copilot AI (Dec 20, 2025)
The mock_sample function doesn't handle the case when 'model' is not provided in kwargs. If pm.sample is called with a positional model argument or without an active model context, kwargs.get("model") returns None and the subsequent sampling call will fail.
Suggested change:

```python
model = kwargs.get("model")
# If no model is provided via kwargs, try to infer it from positional args
if model is None and args:
    first_arg = args[0]
    if isinstance(first_arg, pm.Model):
        model = first_arg
```
```python
    samples = 10

    idata = pm.sample_prior_predictive(
        model=model,
        random_seed=random_seed,
        draws=samples,
    )
    idata.add_groups(posterior=idata.prior)

    # Create mock sample stats with diverging data
    if "sample_stats" not in idata:
        n_chains = 1
        n_draws = samples
```
Copilot AI (Dec 20, 2025)
The variable name 'samples' is misleading. In MCMC terminology, this represents the number of 'draws', not 'samples' (which typically refers to chains × draws). Consider renaming to 'draws' or 'n_draws' for clarity.
Suggested change:

```python
    n_draws = 10

    idata = pm.sample_prior_predictive(
        model=model,
        random_seed=random_seed,
        draws=n_draws,
    )
    idata.add_groups(posterior=idata.prior)

    # Create mock sample stats with diverging data
    if "sample_stats" not in idata:
        n_chains = 1
```
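The excerpt cuts off before the sample-stats branch finishes. One plausible way it could continue, purely as an illustration (`idata` comes from the snippet above; the PR's actual implementation may differ):

```python
import numpy as np
import xarray as xr

n_chains, n_draws = 1, 10  # matches the mock's single chain of 10 prior draws

if "sample_stats" not in idata:
    # An all-False `diverging` array keeps ArviZ divergence checks and notebook
    # cells that inspect sample_stats working without real MCMC output.
    sample_stats = xr.Dataset(
        {"diverging": (("chain", "draw"), np.zeros((n_chains, n_draws), dtype=bool))},
        coords={"chain": np.arange(n_chains), "draw": np.arange(n_draws)},
    )
    idata.add_groups(sample_stats=sample_stats)
```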
```markdown
## How It Works

1. **Mocks `pm.sample()`** — Replaces MCMC sampling with prior predictive (10 draws) for speed
```
Copilot AI (Dec 20, 2025)
The documentation states the mock uses "10 draws" but doesn't explain that this happens with only 1 chain. For users debugging test failures, it would be helpful to mention both the number of chains and draws (e.g., "1 chain × 10 draws").
Suggested change:

```markdown
1. **Mocks `pm.sample()`** — Replaces MCMC sampling with prior predictive (1 chain × 10 draws) for speed
```
Codecov Report

✅ All modified and coverable lines are covered by tests.

```
@@           Coverage Diff           @@
##             main     #602   +/-   ##
=======================================
  Coverage   93.27%   93.27%
=======================================
  Files          37       37
  Lines        5632     5632
  Branches      367      367
=======================================
  Hits         5253     5253
  Misses        248      248
  Partials      131      131
```

☔ View full report in Codecov by Sentry.
📚 Documentation preview 📚: https://causalpy--602.org.readthedocs.build/en/602/