In cases where multiple sets produce the same minimum pvalue, all of those should be included in the empirical dist #13

@cmatKhan

When performing the bootstrap, multiple combinations of the hypergeometric parameters can produce the same minimum p-value within a single iteration. In that case, all of those tied minimum p-values should be returned and added to the empirical distribution.
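
A minimal sketch of the requested behavior, assuming each bootstrap iteration yields a numeric vector of p-values. The helper name `all_min_pvalues`, the tolerance `tol`, and the three example iterations are hypothetical and not taken from the DTO code:

```r
# Hypothetical helper: return every p-value tied for the minimum
# within one bootstrap iteration. `tol` is an assumed tolerance so that
# values that differ only by floating-point noise still count as ties.
all_min_pvalues <- function(pvals, tol = 1e-12) {
  m <- min(pvals)
  pvals[abs(pvals - m) <= tol]
}

# Made-up p-values for three bootstrap iterations (illustration only).
iterations <- list(
  c(0.01, 0.01, 0.20),
  c(0.03, 0.05, 0.03),
  c(0.02, 0.10, 0.40)
)

# One minimum per iteration: ties within an iteration are dropped.
one_per_iteration <- vapply(iterations, min, numeric(1))

# All tied minima from every iteration are kept.
all_ties <- unlist(lapply(iterations, all_min_pvalues))
```

Here `one_per_iteration` has three entries while `all_ties` has five, because the first two iterations each contain a tied minimum.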

Example

This is our empirical distribution. Note that duplicate values are included: in a single bootstrap iteration there can be multiple 'minimum' p-values, because different combinations of the hypergeometric parameters can produce the same p-value.

> # Simulated minimum p-values from bootstrap samples
> empirical_distribution <- c(0.01, 0.01, 0.01, 0.02, 0.03, 0.03, 0.04, 0.05, 0.10, 0.15)

We set an observed p-value:

> # Observed p-value from the original data
> observed_p <- 0.01

In case 1, we calculate the empirical p-value by comparing observed_p to the empirical distribution, including the duplicate minimum p-values:

> # Case 1: Including all ties
> empirical_all <- mean(empirical_distribution <= observed_p)

In case 2, we mimic what DTO currently does, which is to keep just one of those tied p-values from each iteration:

> # Case 2: Keep only one copy of each tied value (simulates duplicate exclusion)
> # Note: unique() collapses every repeated value, so the repeated 0.03 is dropped along with the extra 0.01s
> empirical_unique <- mean(unique(empirical_distribution) <= observed_p)

The two approaches give different empirical p-values:

> cat("Observed p-value: ", observed_p, "\n")
Observed p-value:  0.01 
> cat("Empirical p-value (all ties):   ", empirical_all, "\n")
Empirical p-value (all ties):    0.3 
> cat("Empirical p-value (one per tie):", empirical_unique, "\n")
Empirical p-value (one per tie): 0.1428571
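
To make the source of the difference explicit, the sketch below recomputes both values as a count over a denominator, using the same example vector as above. The variable names `n_hits_all`, `n_all`, `deduplicated`, `n_hits_unique`, and `n_unique` are introduced here for illustration only:

```r
empirical_distribution <- c(0.01, 0.01, 0.01, 0.02, 0.03, 0.03, 0.04, 0.05, 0.10, 0.15)
observed_p <- 0.01

# All ties kept: 3 of the 10 values are <= observed_p.
n_hits_all <- sum(empirical_distribution <= observed_p)
n_all <- length(empirical_distribution)

# unique() drops every repeated value (two extra 0.01s and one extra 0.03),
# leaving 7 values, only 1 of which is <= observed_p.
deduplicated <- unique(empirical_distribution)
n_hits_unique <- sum(deduplicated <= observed_p)
n_unique <- length(deduplicated)

cat(n_hits_all, "/", n_all, "=", n_hits_all / n_all, "\n")
cat(n_hits_unique, "/", n_unique, "=", n_hits_unique / n_unique, "\n")
```

Dropping ties shrinks the numerator from 3 to 1 but the denominator only from 10 to 7, which is why the empirical p-value falls from 3/10 to 1/7.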
