Add automated notebook testing with Papermill #602
Conversation
Introduces a GitHub Actions workflow to run and validate Jupyter notebooks in CI using a new runner script. Adds scripts to mock PyMC sampling for faster execution, updates test dependencies to include papermill, and documents the notebook runner usage. Also updates the interrogate badge to reflect new coverage.
Pull request overview
This PR introduces automated testing for Jupyter notebooks in CI using Papermill. The implementation includes a runner script that mocks PyMC's MCMC sampling with faster prior predictive sampling to validate notebooks execute without errors.
Key changes:
- New notebook runner script with filtering capabilities for different notebook types
- Mock PyMC sampling implementation that replaces expensive MCMC with prior predictive sampling (10 draws); see the sketch after this list
- GitHub Actions workflow that runs notebooks in parallel across three categories (PyMC, sklearn, and other notebooks)
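A minimal sketch of the mocking approach (hypothetical wiring; the PR's `injected.py` is only partially visible in this review, so the names and the patching step are illustrative):

```python
# Illustrative sketch: replace pm.sample so notebook cells that call it get a
# fast prior-predictive InferenceData instead of running full MCMC.
import pymc as pm


def mock_sample(*args, **kwargs):
    """Stand-in for pm.sample that draws from the prior instead of running NUTS."""
    idata = pm.sample_prior_predictive(
        model=kwargs.get("model"),
        random_seed=kwargs.get("random_seed"),
        draws=10,
    )
    # Downstream notebook code expects a posterior group, so alias the prior draws.
    idata.add_groups(posterior=idata.prior)
    return idata


# Patch before any notebook code runs (assumed to happen in the injected cell).
pm.sample = mock_sample
```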
Reviewed changes
Copilot reviewed 5 out of 6 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| scripts/run_notebooks/runner.py | Main script for executing notebooks with Papermill, includes filtering and logging |
| scripts/run_notebooks/injected.py | Mock implementation of pm.sample that uses prior predictive sampling |
| scripts/run_notebooks/README.md | Documentation for the notebook runner usage and CI integration |
| .github/workflows/test_notebook.yml | GitHub Actions workflow for parallel notebook testing |
| pyproject.toml | Adds papermill to test dependencies |
| docs/source/_static/interrogate_badge.svg | Updates documentation coverage badge from 96.3% to 96.0% |
```python
with NamedTemporaryFile(suffix=".ipynb", delete=False) as f:
    write_ipynb(nb, f.name)
    try:
        papermill.execute_notebook(
            input_path=f.name,
            output_path=None,  # Discard output
            kernel_name=KERNEL_NAME,
            progress_bar=True,
            cwd=notebook_path.parent,
        )
    except Exception as e:
        logging.error(f"Error running notebook: {notebook_path.name}")
        raise e
```
Copilot AI (Dec 20, 2025)
The temporary file created by NamedTemporaryFile is never deleted because delete=False is set, but there's no cleanup code to remove it after use. This will leave temporary files on the filesystem after each notebook run.
Suggested change:

```python
temp_path: Path | None = None
try:
    with NamedTemporaryFile(suffix=".ipynb", delete=False) as f:
        temp_path = Path(f.name)
        write_ipynb(nb, f.name)
        papermill.execute_notebook(
            input_path=str(temp_path),
            output_path=None,  # Discard output
            kernel_name=KERNEL_NAME,
            progress_bar=True,
            cwd=notebook_path.parent,
        )
except Exception as e:
    logging.error(f"Error running notebook: {notebook_path.name}")
    raise e
finally:
    if temp_path is not None:
        try:
            temp_path.unlink(missing_ok=True)
        except OSError as cleanup_error:
            logging.warning(
                "Failed to delete temporary notebook file %s: %s",
                temp_path,
                cleanup_error,
            )
```
```python
    except Exception as e:
        logging.error(f"Error running notebook: {notebook_path.name}")
        raise e
```
Copilot AI (Dec 20, 2025)
The error is caught, logged, and then re-raised, but the temporary file created above won't be cleaned up when an exception occurs. Consider using a try-finally block or pathlib's unlink() to ensure cleanup.
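As an alternative to manual unlink bookkeeping, the temporary notebook could live in a `TemporaryDirectory`, which cleans up after itself even on failure. This is only a sketch: `nb`, `notebook_path`, `write_ipynb`, and `KERNEL_NAME` are assumed to come from the runner script shown above.

```python
import logging
from pathlib import Path
from tempfile import TemporaryDirectory

import papermill

with TemporaryDirectory() as tmp_dir:
    # Write the modified notebook into a throwaway directory.
    temp_path = Path(tmp_dir) / notebook_path.name
    write_ipynb(nb, str(temp_path))
    try:
        papermill.execute_notebook(
            input_path=str(temp_path),
            output_path=None,  # Discard output
            kernel_name=KERNEL_NAME,
            progress_bar=True,
            cwd=notebook_path.parent,
        )
    except Exception:
        logging.error(f"Error running notebook: {notebook_path.name}")
        raise
# The directory and the copied notebook are removed automatically when the
# with block exits, whether or not papermill raised.
```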
```python
def mock_sample(*args, **kwargs):
    """Mock pm.sample using prior predictive sampling for speed."""
    random_seed = kwargs.get("random_seed")
    model = kwargs.get("model")
```
Copilot AI (Dec 20, 2025)
The mock_sample function doesn't handle the case when 'model' is not provided in kwargs. If pm.sample is called with a positional model argument or without an active model context, kwargs.get("model") returns None and the subsequent sampling call will fail.
Suggested change:

```python
model = kwargs.get("model")
# If no model is provided via kwargs, try to infer it from positional args
if model is None and args:
    first_arg = args[0]
    if isinstance(first_arg, pm.Model):
        model = first_arg
```
```python
    samples = 10

    idata = pm.sample_prior_predictive(
        model=model,
        random_seed=random_seed,
        draws=samples,
    )
    idata.add_groups(posterior=idata.prior)

    # Create mock sample stats with diverging data
    if "sample_stats" not in idata:
        n_chains = 1
        n_draws = samples
```
Copilot AI (Dec 20, 2025)
The variable name 'samples' is misleading. In MCMC terminology, this represents the number of 'draws', not 'samples' (which typically refers to chains × draws). Consider renaming to 'draws' or 'n_draws' for clarity.
Suggested change:

```python
    n_draws = 10

    idata = pm.sample_prior_predictive(
        model=model,
        random_seed=random_seed,
        draws=n_draws,
    )
    idata.add_groups(posterior=idata.prior)

    # Create mock sample stats with diverging data
    if "sample_stats" not in idata:
        n_chains = 1
```
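The excerpt cuts off before the sample-stats branch finishes. One plausible way it could continue, purely as an illustration (`idata` comes from the snippet above; the PR's actual implementation may differ):

```python
import numpy as np
import xarray as xr

n_chains, n_draws = 1, 10  # matches the mock's single chain of 10 prior draws

if "sample_stats" not in idata:
    # An all-False `diverging` array keeps ArviZ divergence checks and notebook
    # cells that inspect sample_stats working without real MCMC output.
    sample_stats = xr.Dataset(
        {"diverging": (("chain", "draw"), np.zeros((n_chains, n_draws), dtype=bool))},
        coords={"chain": np.arange(n_chains), "draw": np.arange(n_draws)},
    )
    idata.add_groups(sample_stats=sample_stats)
```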
```markdown
## How It Works

1. **Mocks `pm.sample()`** — Replaces MCMC sampling with prior predictive (10 draws) for speed
```
Copilot AI (Dec 20, 2025)
The documentation states the mock uses "10 draws" but doesn't explain that this happens with only 1 chain. For users debugging test failures, it would be helpful to mention both the number of chains and draws (e.g., "1 chain × 10 draws").
Suggested change:

```markdown
1. **Mocks `pm.sample()`** — Replaces MCMC sampling with prior predictive (1 chain × 10 draws) for speed
```
Codecov Report

✅ All modified and coverable lines are covered by tests.

```
@@           Coverage Diff           @@
##             main     #602   +/-   ##
=======================================
  Coverage   93.27%   93.27%
=======================================
  Files          37       37
  Lines        5632     5632
  Branches      367      367
=======================================
  Hits         5253     5253
  Misses        248      248
  Partials      131      131
```

☔ View full report in Codecov by Sentry.
📚 Documentation preview 📚: https://causalpy--602.org.readthedocs.build/en/602/