Skip to content

Conversation

@maresb
Copy link
Contributor

@maresb maresb commented Dec 15, 2025

Description

Disclosure: With assistance from claude-4.5-opus-high via Cursor.

This PR adds a _logccdf (log complementary CDF / log survival function) dispatcher to fix numerical instability when computing log-probabilities for censored distributions at extreme tail values.

The Problem:
For right-censored distributions, the log-probability at the upper bound requires computing log(1 - CDF). The existing implementation uses log1mexp(logcdf), which breaks down when CDF ≈ 1 (far right tail). For example, a censored Normal(0, 1) at 40 standard deviations returns -inf instead of the correct ≈ -804.6.

The Solution:
Add a _logccdf dispatcher that allows distributions to provide a numerically stable log survival function. For Normal, this uses the existing normal_lccdf function (based on erfcx) which is stable across the entire domain.

Changes:

  • pymc/logprob/abstract.py: Add _logccdf singledispatch and _logccdf_helper
  • pymc/distributions/distribution.py: Register logccdf methods via metaclass
  • pymc/distributions/continuous.py: Add logccdf to Normal using stable normal_lccdf
  • pymc/logprob/censoring.py: Use _logccdf for right-censored distributions when available
  • pymc/logprob/binary.py: Use _logccdf for comparison operations
  • pymc/logprob/transforms.py: Use _logccdf_helper for monotonic transforms
  • pymc/logprob/basic.py: Add public logccdf() function
  • pymc/logprob/__init__.py: Export logccdf

Related Issue

Checklist

Type of change

  • New feature / enhancement
  • Bug fix
  • Documentation
  • Maintenance
  • Other (please specify):

Test that pm.Censored computes log-probabilities stably at the bounds:
- Right censoring (upper bound): log(1 - CDF) when CDF ≈ 1
- Left censoring (lower bound): log(CDF) when CDF ≈ 0

Uses pm.Censored with Normal(0, 1) at ±40 standard deviations.
Add _logccdf (log complementary CDF / log survival function) support:

- pymc/logprob/abstract.py: Add _logccdf singledispatch and _logccdf_helper
- pymc/distributions/distribution.py: Register logccdf methods via metaclass
- pymc/distributions/continuous.py: Add logccdf to Normal using stable normal_lccdf
- pymc/logprob/censoring.py: Use _logccdf for right-censored distributions
- pymc/logprob/binary.py: Use _logccdf for comparison operations
- pymc/logprob/transforms.py: Use _logccdf_helper for monotonic transforms
- pymc/logprob/basic.py: Add public logccdf() function
- pymc/logprob/__init__.py: Export logccdf

This fixes numerical instability when computing log-probabilities for
censored Normal distributions at extreme tail values (e.g., 10+ sigma).
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR adds a _logccdf dispatcher to provide numerically stable log survival function (log complementary CDF) computations for censored distributions and other operations that require computing log(1 - CDF). The implementation fixes numerical instability issues when evaluating at extreme tail values, such as computing the log-probability of a censored Normal(0,1) distribution at 40 standard deviations, which previously returned -inf instead of the correct value.

Key Changes

  • Added _logccdf singledispatch function and _logccdf_helper for numerically stable log survival function computation
  • Implemented logccdf method for the Normal distribution using the existing normal_lccdf function
  • Updated censoring, binary comparison, and transform modules to use _logccdf when available, with graceful fallback to pt.log1mexp(logcdf)

Reviewed changes

Copilot reviewed 10 out of 10 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
pymc/logprob/abstract.py Added _logccdf dispatcher and _logccdf_helper following the same pattern as _logcdf
pymc/distributions/distribution.py Added metaclass registration for logccdf methods to automatically dispatch to distribution-specific implementations
pymc/distributions/continuous.py Implemented logccdf for Normal distribution using stable normal_lccdf function
pymc/logprob/censoring.py Updated right-censored distribution logic to use _logccdf with fallback for numerical stability
pymc/logprob/binary.py Updated comparison operations to use _logccdf for improved numerical stability
pymc/logprob/transforms.py Updated monotonic transforms to use _logccdf_helper for continuous distributions
pymc/logprob/basic.py Added public logccdf() function with comprehensive documentation and examples
pymc/logprob/__init__.py Exported logccdf as part of the public API
tests/logprob/test_censoring.py Added parametrized tests for numerical stability of censored distributions at extreme tail values
tests/logprob/test_abstract.py Added comprehensive tests for _logccdf_helper, public logccdf function, and numerical stability

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

This test uses pm.Censored which is the high-level API for censored distributions.
"""
import pymc as pm
Copy link

Copilot AI Dec 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Module 'pymc' is imported with both 'import' and 'import from'.

Copilot uses AI. Check for mistakes.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed. Moved import pymc as pm to top-level imports.

19b9979

Copy link
Member

@ricardoV94 ricardoV94 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah this would always be needed eventually. I left a request. Also can we use this in the Truncated, or it doesn't show up there?

raise NotImplementedError(f"LogCCDF method not implemented for {op}")


def _logccdf_helper(rv, value, **kwargs):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make this method do the try except fallback to log1mexp? So users/devs don't need to do it all the time

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done! The helper now tries _logccdf first and automatically falls back to log1mexp(logcdf) if not implemented. Callers no longer need to handle the exception.

15e5f64

@codecov
Copy link

codecov bot commented Dec 15, 2025

Codecov Report

❌ Patch coverage is 97.95918% with 1 line in your changes missing coverage. Please review.
✅ Project coverage is 91.51%. Comparing base (cadb97a) to head (36b8672).

Files with missing lines Patch % Lines
pymc/logprob/basic.py 92.30% 1 Missing ⚠️
Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #7996      +/-   ##
==========================================
+ Coverage   90.22%   91.51%   +1.28%     
==========================================
  Files         116      116              
  Lines       18972    19013      +41     
==========================================
+ Hits        17117    17399     +282     
+ Misses       1855     1614     -241     
Files with missing lines Coverage Δ
pymc/distributions/continuous.py 98.24% <100.00%> (+0.35%) ⬆️
pymc/distributions/distribution.py 94.50% <100.00%> (+0.13%) ⬆️
pymc/distributions/truncated.py 99.46% <100.00%> (+0.01%) ⬆️
pymc/logprob/__init__.py 100.00% <ø> (ø)
pymc/logprob/abstract.py 97.50% <100.00%> (+0.44%) ⬆️
pymc/logprob/binary.py 96.34% <100.00%> (ø)
pymc/logprob/censoring.py 97.67% <100.00%> (ø)
pymc/logprob/transforms.py 95.43% <100.00%> (ø)
pymc/logprob/basic.py 99.24% <92.30%> (-0.76%) ⬇️

... and 4 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

maresb added a commit to maresb/celeri that referenced this pull request Dec 15, 2025
Import censored_distribution_stability_hotfix when PyMC is available.
This fixes numerical instability in pm.Censored for Normal distributions
when values are in the extreme tails.

Addresses: brendanjmeade#341
Upstream fix: pymc-devs/pymc#7996
Centralizes the fallback logic so callers don't need to handle it.
The helper now tries the stable _logccdf first and automatically
falls back to log1mexp(logcdf) if not implemented.
Uses stable logccdf for computing log(1 - CDF(lower)) in truncated_logprob
and truncated_logcdf instead of the potentially unstable log1mexp(logcdf).
The construct_ir_fgraph returns a single FunctionGraph, not a tuple.
Extract ir_valued_rv from outputs and unpack ir_rv and ir_value from inputs.
Move 'import pymc as pm' to top of file with other imports
instead of inside the test function.
@maresb
Copy link
Contributor Author

maresb commented Dec 15, 2025

Yeah this would always be needed eventually. I left a request. Also can we use this in the Truncated, or it doesn't show up there?

Yes! Truncated computes log(1 - CDF(lower)) via pt.log1mexp(lower_logcdf). Added _create_lower_logccdf_expr helper to TruncatedRV and updated both truncated_logprob and truncated_logcdf to use stable logccdf.

15806c0

@maresb
Copy link
Contributor Author

maresb commented Dec 15, 2025

Working on test coverage...

@ricardoV94
Copy link
Member

I'll only be able to review later, but the general picture seems great. How urgent is this to get merged?

@maresb
Copy link
Contributor Author

maresb commented Dec 15, 2025

Not super urgent since we have a workaround: brendanjmeade/celeri#343

maresb added a commit to maresb/celeri that referenced this pull request Dec 15, 2025
…tributions"

This reverts commit 3edaae2.

This will be safe once celeri requires a release of PyMC where
pymc-devs/pymc#7996 has been merged.
The test_logccdf_numerical_stability test already covers this functionality
at the public API level. The helper test was redundant since it tested the
same numerical stability property through _logccdf_helper.
Verifies that distributions without a registered _logccdf method
(e.g., Uniform) use the log1mexp(logcdf) fallback, while distributions
with _logccdf (e.g., Normal) use their specialized implementation.

The test inspects the computation graph structure rather than just
numerical results to ensure the correct code path is exercised.
Increase test bounds from 10/40 to 100 sigma to future-proof against
any potential improvements in naive computation methods. At 100 sigma,
CDF(100) is truly indistinguishable from 1.0 in float64.

Also enhances test docstrings with What/Why/How documentation.
Add detailed docstrings to logccdf tests explaining:
- What: what the test verifies
- Why: motivation and edge cases being tested
- How: the testing methodology
Tests that pm.logccdf works when the random variable depends on
transformed parameters, triggering the construct_ir_fgraph fallback
path in the public logccdf function.
Tests that logccdf registration works correctly for custom Distribution
subclasses using SymbolicRandomVariable with extended_signature, which
exercises the params_idxs code path in DistributionMeta.
@maresb maresb force-pushed the log-ccdf-stability branch from b770a67 to 36b8672 Compare December 16, 2025 10:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants