Adding stage1 model output #84

cmatKhan · 2025-12-01T21:42:54Z

This closes #82.

It also adds documentation of the output files to the docs.

…pytest to CI

…rentLab#31) * intermediate * propogating bootstrappedmodelinginputdata changes to __main__ * removed unweighted bootstrap options * removing top_n as argparse option from step3 sigmoid parser * adding center_scale to argparse * setting drop_intercept to True permanently for sigmoid worker * the sigmoid parameters must have args.drop_intercept still * handling intercepts * fixing typo in center_scale logging * changing the way the formula is logged * removing truncation from formula logging * adding logging on random_state in bootstrappedmodelinput * removing sample weight cv log * removing sample weight cv logging

…rentLab#37) * loop exits if no variable selected within the loop, fixes issue 34 * fixing linter issues * linter issues --------- Co-authored-by: Zolboo Erdenebaatar <[email protected]> Co-authored-by: zolboo e <[email protected]>

…e minimal test case size (BrentLab#54) * saving changes for remove_bin_by_binding * Add check on number of features and increase minimal test case size

* create interface.py to separate main * add tests to verify important logging statements * move the logging configuration back into main * create interface.py to separate main * add tests to verify important logging statements * move the logging configuration back into main * fixing interface * removing bin_by_binding_only from test arguments to interface * adding name == main to main script * create interface.py to separate main * move the logging configuration back into main * removing bin_by_binding_only from test arguments to interface * rebasing refactor onto dev * adding calling to main * adding a feature column in test_interface * separate the functions and objects in lasso_modeling.py * Refactor main by adding interface.py (BrentLab#56) * create interface.py to separate main * add tests to verify important logging statements * move the logging configuration back into main * create interface.py to separate main * add tests to verify important logging statements * move the logging configuration back into main * fixing interface * removing bin_by_binding_only from test arguments to interface * adding name == main to main script * create interface.py to separate main * move the logging configuration back into main * removing bin_by_binding_only from test arguments to interface * rebasing refactor onto dev * adding calling to main * adding a feature column in test_interface --------- Co-authored-by: chasem <[email protected]> * create interface.py to separate main move the logging configuration back into main fixing interface create interface.py to separate main move the logging configuration back into main removing bin_by_binding_only from test arguments to interface rebasing refactor onto dev create interface.py to separate main move the logging configuration back into main removing bin_by_binding_only from test arguments to interface adding name == main to main script separate the functions and objects in lasso_modeling.py * renaming loop_modeling * separate 
the tests out into files * separate the tests out into files * fixing evaluate_interactor_significance_lassocv --------- Co-authored-by: chasem <[email protected]>

codecov · 2025-12-01T21:45:38Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 72.19%. Comparing base (d49887e) to head (380320b).
⚠️ Report is 18 commits behind head on dev.

Additional details and impacted files

@@            Coverage Diff             @@
##              dev      #84      +/-   ##
==========================================
+ Coverage   71.79%   72.19%   +0.40%     
==========================================
  Files          13       13              
  Lines         826      838      +12     
  Branches      116      116              
==========================================
+ Hits          593      605      +12     
  Misses        174      174              
  Partials       59       59


Copilot

Pull request overview

This PR adds the ability to save a "best all data model" (Stage 3) after extracting significant coefficients from bootstrap modeling. It addresses issue #82 by creating a fitted model that can be used for predictions, and includes comprehensive documentation of all output files.

Key changes:

- Adds new Stage 3 that creates and saves a best-fit model (best_all_data_model.pkl) using stratified cross-validation on significant coefficients
- Updates step numbering in logs to reflect the new stage (Steps 3-5 renumbered to Steps 3-6)
- Adds comprehensive output documentation (docs/output.md) describing all output files, their formats, and usage examples
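
For readers who want to see the shape of the new stage, here is a minimal sketch of fitting a model with stratified cross-validation, saving it with joblib, and reloading it for prediction. It is not the code from tfbpmodeling/interface.py: the synthetic data, the quartile-based strata, and the choice of `LassoCV` as the estimator are all assumptions made for illustration. Only the file name best_all_data_model.pkl and the use of joblib come from this PR.

```python
import tempfile
from pathlib import Path

import joblib
import numpy as np
from sklearn.linear_model import LassoCV
from sklearn.model_selection import StratifiedKFold

# Synthetic stand-in data (assumption): 60 samples, 3 predictors.
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 3))
y = X @ np.array([1.5, 0.0, -2.0]) + rng.normal(scale=0.1, size=60)

# Hypothetical stratification: quartile bins of the first predictor.
# The real pipeline defines its own strata; this is only illustrative.
strata = np.digitize(X[:, 0], np.quantile(X[:, 0], [0.25, 0.5, 0.75]))

# Build stratified folds once, then hand them to the estimator as explicit splits.
cv = StratifiedKFold(n_splits=4, shuffle=True, random_state=0)
folds = list(cv.split(X, strata))

# LassoCV is an assumed stand-in for whatever estimator the stage actually fits.
model = LassoCV(cv=folds, random_state=0).fit(X, y)

# Save and reload the fitted model, mirroring the best_all_data_model.pkl output.
with tempfile.TemporaryDirectory() as tmp:
    model_path = Path(tmp) / "best_all_data_model.pkl"
    joblib.dump(model, model_path)
    reloaded = joblib.load(model_path)

# The reloaded model can be used directly for prediction.
predictions = reloaded.predict(X)
```

When loading a saved model in practice, the prediction matrix needs the same columns, in the same order, as the design matrix used for fitting.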

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 6 comments.

| File | Description |
| --- | --- |
| tfbpmodeling/interface.py | Adds Stage 3 model training/saving logic with joblib, updates step numbering in logs |
| tfbpmodeling/tests/test_interface.py | Updates test fixtures to support new Stage 3 functionality (DummyModel, joblib stub, updated log assertions) |
| docs/output.md | New comprehensive documentation describing all output files, formats, and usage examples for the modeling pipeline |



Co-authored-by: Copilot <[email protected]>

* updating the tmp/readme * updating precommit * fixing cmd line interface in __main__ * updating typing on sigmoid fit * adding logging re how CV is performed * initial topn modeling implementation. This will store the results in a separate output directory to differentiate from all data modeling * initial attempt of stage 3 of modeling. This seems to run without issues based on limited testing. * stepwise modelling * this separates the sigmoid step 3 into its own function; reoganizes cmd line input into reusable groups; sets the evaluate_interactor_significance estimator to LinearRegression by default * removing windows 2019 from CI * Update tfbpmodeling/loop_modeling.py Co-authored-by: Chase Mateusiak <[email protected]> * Update tfbpmodeling/loop_modeling.py Co-authored-by: Chase Mateusiak <[email protected]> * Update tfbpmodeling/loop_modeling.py Co-authored-by: Chase Mateusiak <[email protected]> * with pre-commit * removing islice from bootstrap loop * removing fixtures to conftest; fixing spacing in loop module; adding pytest to CI * only running tests on ubuntu * trying to configure codecov * debugging codecov * debugging codecov * removing codecov badge and updating the pytest badge * adding the codecov badge back in -- secret corrected in the repo * debugging codecov * still debugging codecov * Add cubic ptf and standardization (#29) * init * After installing pre-commit * add center and scaling * changing the names for center scaling * Update tfbpmodeling/__main__.py Co-authored-by: Chase Mateusiak <[email protected]> * Update tfbpmodeling/tests/test_lasso_modeling.py Co-authored-by: Chase Mateusiak <[email protected]> * Update tfbpmodeling/tests/test_lasso_modeling.py Co-authored-by: Chase Mateusiak <[email protected]> * Update tfbpmodeling/tests/test_lasso_modeling.py Co-authored-by: Chase Mateusiak <[email protected]> * Update tfbpmodeling/lasso_modeling.py Co-authored-by: Chase Mateusiak <[email protected]> * Update tfbpmodeling/lasso_modeling.py 
Co-authored-by: Chase Mateusiak <[email protected]> * debugging the names for scale and center * editing annotatioins --------- Co-authored-by: Chase Mateusiak <[email protected]> * Set random state on bootstraps; Remove unweighted bootstrap option (#31) * intermediate * propogating bootstrappedmodelinginputdata changes to __main__ * removed unweighted bootstrap options * removing top_n as argparse option from step3 sigmoid parser * adding center_scale to argparse * setting drop_intercept to True permanently for sigmoid worker * the sigmoid parameters must have args.drop_intercept still * handling intercepts * fixing typo in center_scale logging * changing the way the formula is logged * removing truncation from formula logging * adding logging on random_state in bootstrappedmodelinput * removing sample weight cv log * removing sample weight cv logging * Add function to exclude all model variables (#33) * adding function to exclude all predictors. this can be used to exclude all and then use args.add_model_variables to customize the formula * casting predictor_variables to list * fixing centering and scaling (#36) * loop exits if no variable selected within the loop, fixes issue 34 (#37) * loop exits if no variable selected within the loop, fixes issue 34 * fixing linter issues * linter issues --------- Co-authored-by: Zolboo Erdenebaatar <[email protected]> Co-authored-by: zolboo e <[email protected]> * setting evaluate_interactor_significance ci_level to the parse args argument * Tommy new stage3 (#41) * WIP * fixing imported but not used * adding argument to include stage4_lasso * changing formatting problems * adding logger infomation for stage 4 method * changing location of logger info for stage 4 method * modifying logger.info to evaluate_interactor_significance * adding f string to Writing the final interactor significance results to {output_significance_file} * fixing logging --------- Co-authored-by: Zolboo Erdenebaatar <[email protected]> Co-authored-by: 
chasem <[email protected]> * fixing error in stratification_classification that reversed the bin_by_binding_only param (#44) * adding feature stage4_topn (#43) * adding feature stage4_topn * modifying argument parser * Align response_df with predictors from get_modeling_data to ensure consistency when top_n_masked is enabled * Align response_df with predictors from get_modeling_data to ensure consistency when top_n_masked is enabled * aligning stratified_cv_r2 * restoring changes to evaluate_interactor_significance_linear * commit after fixing error in stratification_classification * attempt to fixing inconsistent numbers of samples * fixing pr --------- Co-authored-by: chasem <[email protected]> * Add row max in interactor significance (#48) * setting row max depending on model variables * setting row max depending on model variables * adding testing on log for evaluate_interactor_significance * removing ptf from all data formula by default (#51) * Remove bin by binding and Add check on number of features and increase minimal test case size (#54) * saving changes for remove_bin_by_binding * Add check on number of features and increase minimal test case size * Refactor main by adding interface.py (#56) * create interface.py to separate main * add tests to verify important logging statements * move the logging configuration back into main * create interface.py to separate main * add tests to verify important logging statements * move the logging configuration back into main * fixing interface * removing bin_by_binding_only from test arguments to interface * adding name == main to main script * create interface.py to separate main * move the logging configuration back into main * removing bin_by_binding_only from test arguments to interface * rebasing refactor onto dev * adding calling to main * adding a feature column in test_interface --------- Co-authored-by: chasem <[email protected]> * Separate the functions and objects in lasso_modeling.py (#58) * create 
interface.py to separate main * add tests to verify important logging statements * move the logging configuration back into main * create interface.py to separate main * add tests to verify important logging statements * move the logging configuration back into main * fixing interface * removing bin_by_binding_only from test arguments to interface * adding name == main to main script * create interface.py to separate main * move the logging configuration back into main * removing bin_by_binding_only from test arguments to interface * rebasing refactor onto dev * adding calling to main * adding a feature column in test_interface * separate the functions and objects in lasso_modeling.py * Refactor main by adding interface.py (#56) * create interface.py to separate main * add tests to verify important logging statements * move the logging configuration back into main * create interface.py to separate main * add tests to verify important logging statements * move the logging configuration back into main * fixing interface * removing bin_by_binding_only from test arguments to interface * adding name == main to main script * create interface.py to separate main * move the logging configuration back into main * removing bin_by_binding_only from test arguments to interface * rebasing refactor onto dev * adding calling to main * adding a feature column in test_interface --------- Co-authored-by: chasem <[email protected]> * create interface.py to separate main move the logging configuration back into main fixing interface create interface.py to separate main move the logging configuration back into main removing bin_by_binding_only from test arguments to interface rebasing refactor onto dev create interface.py to separate main move the logging configuration back into main removing bin_by_binding_only from test arguments to interface adding name == main to main script separate the functions and objects in lasso_modeling.py * renaming loop_modeling * separate the tests out 
into files * separate the tests out into files * fixing evaluate_interactor_significance_lassocv --------- Co-authored-by: chasem <[email protected]> * removing scale_center from interface (#60) * Adding documentation (#78) * preparing for paper release (#77) * updating the tmp/readme * updating precommit * fixing cmd line interface in __main__ * updating typing on sigmoid fit * adding logging re how CV is performed * initial topn modeling implementation. This will store the results in a separate output directory to differentiate from all data modeling * initial attempt of stage 3 of modeling. This seems to run without issues based on limited testing. * stepwise modelling * this separates the sigmoid step 3 into its own function; reoganizes cmd line input into reusable groups; sets the evaluate_interactor_significance estimator to LinearRegression by default * removing windows 2019 from CI * Update tfbpmodeling/loop_modeling.py Co-authored-by: Chase Mateusiak <[email protected]> * Update tfbpmodeling/loop_modeling.py Co-authored-by: Chase Mateusiak <[email protected]> * Update tfbpmodeling/loop_modeling.py Co-authored-by: Chase Mateusiak <[email protected]> * with pre-commit * removing islice from bootstrap loop * removing fixtures to conftest; fixing spacing in loop module; adding pytest to CI * only running tests on ubuntu * trying to configure codecov * debugging codecov * debugging codecov * removing codecov badge and updating the pytest badge * adding the codecov badge back in -- secret corrected in the repo * debugging codecov * still debugging codecov * Add cubic ptf and standardization (#29) * init * After installing pre-commit * add center and scaling * changing the names for center scaling * Update tfbpmodeling/__main__.py Co-authored-by: Chase Mateusiak <[email protected]> * Update tfbpmodeling/tests/test_lasso_modeling.py Co-authored-by: Chase Mateusiak <[email protected]> * Update tfbpmodeling/tests/test_lasso_modeling.py Co-authored-by: Chase 
Mateusiak <[email protected]> * Update tfbpmodeling/tests/test_lasso_modeling.py Co-authored-by: Chase Mateusiak <[email protected]> * Update tfbpmodeling/lasso_modeling.py Co-authored-by: Chase Mateusiak <[email protected]> * Update tfbpmodeling/lasso_modeling.py Co-authored-by: Chase Mateusiak <[email protected]> * debugging the names for scale and center * editing annotatioins --------- Co-authored-by: Chase Mateusiak <[email protected]> * Set random state on bootstraps; Remove unweighted bootstrap option (#31) * intermediate * propogating bootstrappedmodelinginputdata changes to __main__ * removed unweighted bootstrap options * removing top_n as argparse option from step3 sigmoid parser * adding center_scale to argparse * setting drop_intercept to True permanently for sigmoid worker * the sigmoid parameters must have args.drop_intercept still * handling intercepts * fixing typo in center_scale logging * changing the way the formula is logged * removing truncation from formula logging * adding logging on random_state in bootstrappedmodelinput * removing sample weight cv log * removing sample weight cv logging * Add function to exclude all model variables (#33) * adding function to exclude all predictors. 
this can be used to exclude all and then use args.add_model_variables to customize the formula * casting predictor_variables to list * fixing centering and scaling (#36) * loop exits if no variable selected within the loop, fixes issue 34 (#37) * loop exits if no variable selected within the loop, fixes issue 34 * fixing linter issues * linter issues --------- Co-authored-by: Zolboo Erdenebaatar <[email protected]> Co-authored-by: zolboo e <[email protected]> * setting evaluate_interactor_significance ci_level to the parse args argument * Tommy new stage3 (#41) * WIP * fixing imported but not used * adding argument to include stage4_lasso * changing formatting problems * adding logger infomation for stage 4 method * changing location of logger info for stage 4 method * modifying logger.info to evaluate_interactor_significance * adding f string to Writing the final interactor significance results to {output_significance_file} * fixing logging --------- Co-authored-by: Zolboo Erdenebaatar <[email protected]> Co-authored-by: chasem <[email protected]> * fixing error in stratification_classification that reversed the bin_by_binding_only param (#44) * adding feature stage4_topn (#43) * adding feature stage4_topn * modifying argument parser * Align response_df with predictors from get_modeling_data to ensure consistency when top_n_masked is enabled * Align response_df with predictors from get_modeling_data to ensure consistency when top_n_masked is enabled * aligning stratified_cv_r2 * restoring changes to evaluate_interactor_significance_linear * commit after fixing error in stratification_classification * attempt to fixing inconsistent numbers of samples * fixing pr --------- Co-authored-by: chasem <[email protected]> * Add row max in interactor significance (#48) * setting row max depending on model variables * setting row max depending on model variables * adding testing on log for evaluate_interactor_significance * removing ptf from all data formula by default (#51) 
* Remove bin by binding and Add check on number of features and increase minimal test case size (#54) * saving changes for remove_bin_by_binding * Add check on number of features and increase minimal test case size * Refactor main by adding interface.py (#56) * create interface.py to separate main * add tests to verify important logging statements * move the logging configuration back into main * create interface.py to separate main * add tests to verify important logging statements * move the logging configuration back into main * fixing interface * removing bin_by_binding_only from test arguments to interface * adding name == main to main script * create interface.py to separate main * move the logging configuration back into main * removing bin_by_binding_only from test arguments to interface * rebasing refactor onto dev * adding calling to main * adding a feature column in test_interface --------- Co-authored-by: chasem <[email protected]> * Separate the functions and objects in lasso_modeling.py (#58) * create interface.py to separate main * add tests to verify important logging statements * move the logging configuration back into main * create interface.py to separate main * add tests to verify important logging statements * move the logging configuration back into main * fixing interface * removing bin_by_binding_only from test arguments to interface * adding name == main to main script * create interface.py to separate main * move the logging configuration back into main * removing bin_by_binding_only from test arguments to interface * rebasing refactor onto dev * adding calling to main * adding a feature column in test_interface * separate the functions and objects in lasso_modeling.py * Refactor main by adding interface.py (#56) * create interface.py to separate main * add tests to verify important logging statements * move the logging configuration back into main * create interface.py to separate main * add tests to verify important logging statements * 
move the logging configuration back into main * fixing interface * removing bin_by_binding_only from test arguments to interface * adding name == main to main script * create interface.py to separate main * move the logging configuration back into main * removing bin_by_binding_only from test arguments to interface * rebasing refactor onto dev * adding calling to main * adding a feature column in test_interface --------- Co-authored-by: chasem <[email protected]> * create interface.py to separate main move the logging configuration back into main fixing interface create interface.py to separate main move the logging configuration back into main removing bin_by_binding_only from test arguments to interface rebasing refactor onto dev create interface.py to separate main move the logging configuration back into main removing bin_by_binding_only from test arguments to interface adding name == main to main script separate the functions and objects in lasso_modeling.py * renaming loop_modeling * separate the tests out into files * separate the tests out into files * fixing evaluate_interactor_significance_lassocv --------- Co-authored-by: chasem <[email protected]> * removing scale_center from interface (#60) --------- Co-authored-by: ejiawustl <[email protected]> Co-authored-by: Zolboo Erdenebaatar <[email protected]> Co-authored-by: ezolbooe <[email protected]> Co-authored-by: zolboo e <[email protected]> Co-authored-by: 17TML <[email protected]> Co-authored-by: Zolboo Erdenebaatar <[email protected]> Co-authored-by: zolboo e <[email protected]> * claude developed docs --------- Co-authored-by: ejiawustl <[email protected]> Co-authored-by: Zolboo Erdenebaatar <[email protected]> Co-authored-by: ezolbooe <[email protected]> Co-authored-by: zolboo e <[email protected]> Co-authored-by: 17TML <[email protected]> Co-authored-by: Zolboo Erdenebaatar <[email protected]> Co-authored-by: zolboo e <[email protected]> * Fix estimator comment in interface (#83) * preparing for 
paper release (#77) * updating the tmp/readme * updating precommit * fixing cmd line interface in __main__ * updating typing on sigmoid fit * adding logging re how CV is performed * initial topn modeling implementation. This will store the results in a separate output directory to differentiate from all data modeling * initial attempt of stage 3 of modeling. This seems to run without issues based on limited testing. * stepwise modelling * this separates the sigmoid step 3 into its own function; reoganizes cmd line input into reusable groups; sets the evaluate_interactor_significance estimator to LinearRegression by default * removing windows 2019 from CI * Update tfbpmodeling/loop_modeling.py Co-authored-by: Chase Mateusiak <[email protected]> * Update tfbpmodeling/loop_modeling.py Co-authored-by: Chase Mateusiak <[email protected]> * Update tfbpmodeling/loop_modeling.py Co-authored-by: Chase Mateusiak <[email protected]> * with pre-commit * removing islice from bootstrap loop * removing fixtures to conftest; fixing spacing in loop module; adding pytest to CI * only running tests on ubuntu * trying to configure codecov * debugging codecov * debugging codecov * removing codecov badge and updating the pytest badge * adding the codecov badge back in -- secret corrected in the repo * debugging codecov * still debugging codecov * Add cubic ptf and standardization (#29) * init * After installing pre-commit * add center and scaling * changing the names for center scaling * Update tfbpmodeling/__main__.py Co-authored-by: Chase Mateusiak <[email protected]> * Update tfbpmodeling/tests/test_lasso_modeling.py Co-authored-by: Chase Mateusiak <[email protected]> * Update tfbpmodeling/tests/test_lasso_modeling.py Co-authored-by: Chase Mateusiak <[email protected]> * Update tfbpmodeling/tests/test_lasso_modeling.py Co-authored-by: Chase Mateusiak <[email protected]> * Update tfbpmodeling/lasso_modeling.py Co-authored-by: Chase Mateusiak <[email protected]> * Update 
tfbpmodeling/lasso_modeling.py Co-authored-by: Chase Mateusiak <[email protected]> * debugging the names for scale and center * editing annotatioins --------- Co-authored-by: Chase Mateusiak <[email protected]> * Set random state on bootstraps; Remove unweighted bootstrap option (#31) * intermediate * propogating bootstrappedmodelinginputdata changes to __main__ * removed unweighted bootstrap options * removing top_n as argparse option from step3 sigmoid parser * adding center_scale to argparse * setting drop_intercept to True permanently for sigmoid worker * the sigmoid parameters must have args.drop_intercept still * handling intercepts * fixing typo in center_scale logging * changing the way the formula is logged * removing truncation from formula logging * adding logging on random_state in bootstrappedmodelinput * removing sample weight cv log * removing sample weight cv logging * Add function to exclude all model variables (#33) * adding function to exclude all predictors. this can be used to exclude all and then use args.add_model_variables to customize the formula * casting predictor_variables to list * fixing centering and scaling (#36) * loop exits if no variable selected within the loop, fixes issue 34 (#37) * loop exits if no variable selected within the loop, fixes issue 34 * fixing linter issues * linter issues --------- Co-authored-by: Zolboo Erdenebaatar <[email protected]> Co-authored-by: zolboo e <[email protected]> * setting evaluate_interactor_significance ci_level to the parse args argument * Tommy new stage3 (#41) * WIP * fixing imported but not used * adding argument to include stage4_lasso * changing formatting problems * adding logger infomation for stage 4 method * changing location of logger info for stage 4 method * modifying logger.info to evaluate_interactor_significance * adding f string to Writing the final interactor significance results to {output_significance_file} * fixing logging --------- Co-authored-by: Zolboo Erdenebaatar 
<[email protected]> Co-authored-by: chasem <[email protected]> * fixing error in stratification_classification that reversed the bin_by_binding_only param (#44) * adding feature stage4_topn (#43) * adding feature stage4_topn * modifying argument parser * Align response_df with predictors from get_modeling_data to ensure consistency when top_n_masked is enabled * Align response_df with predictors from get_modeling_data to ensure consistency when top_n_masked is enabled * aligning stratified_cv_r2 * restoring changes to evaluate_interactor_significance_linear * commit after fixing error in stratification_classification * attempt to fixing inconsistent numbers of samples * fixing pr --------- Co-authored-by: chasem <[email protected]> * Add row max in interactor significance (#48) * setting row max depending on model variables * setting row max depending on model variables * adding testing on log for evaluate_interactor_significance * removing ptf from all data formula by default (#51) * Remove bin by binding and Add check on number of features and increase minimal test case size (#54) * saving changes for remove_bin_by_binding * Add check on number of features and increase minimal test case size * Refactor main by adding interface.py (#56) * create interface.py to separate main * add tests to verify important logging statements * move the logging configuration back into main * create interface.py to separate main * add tests to verify important logging statements * move the logging configuration back into main * fixing interface * removing bin_by_binding_only from test arguments to interface * adding name == main to main script * create interface.py to separate main * move the logging configuration back into main * removing bin_by_binding_only from test arguments to interface * rebasing refactor onto dev * adding calling to main * adding a feature column in test_interface --------- Co-authored-by: chasem <[email protected]> * Separate the functions and objects in 
lasso_modeling.py (#58) * create interface.py to separate main * add tests to verify important logging statements * move the logging configuration back into main * create interface.py to separate main * add tests to verify important logging statements * move the logging configuration back into main * fixing interface * removing bin_by_binding_only from test arguments to interface * adding name == main to main script * create interface.py to separate main * move the logging configuration back into main * removing bin_by_binding_only from test arguments to interface * rebasing refactor onto dev * adding calling to main * adding a feature column in test_interface * separate the functions and objects in lasso_modeling.py * Refactor main by adding interface.py (#56) * create interface.py to separate main * add tests to verify important logging statements * move the logging configuration back into main * create interface.py to separate main * add tests to verify important logging statements * move the logging configuration back into main * fixing interface * removing bin_by_binding_only from test arguments to interface * adding name == main to main script * create interface.py to separate main * move the logging configuration back into main * removing bin_by_binding_only from test arguments to interface * rebasing refactor onto dev * adding calling to main * adding a feature column in test_interface --------- Co-authored-by: chasem <[email protected]> * create interface.py to separate main move the logging configuration back into main fixing interface create interface.py to separate main move the logging configuration back into main removing bin_by_binding_only from test arguments to interface rebasing refactor onto dev create interface.py to separate main move the logging configuration back into main removing bin_by_binding_only from test arguments to interface adding name == main to main script separate the functions and objects in lasso_modeling.py * renaming 
loop_modeling * separate the tests out into files * separate the tests out into files * fixing evaluate_interactor_significance_lassocv --------- Co-authored-by: chasem <[email protected]> * removing scale_center from interface (#60) --------- Co-authored-by: ejiawustl <[email protected]> Co-authored-by: Zolboo Erdenebaatar <[email protected]> Co-authored-by: ezolbooe <[email protected]> Co-authored-by: zolboo e <[email protected]> Co-authored-by: 17TML <[email protected]> Co-authored-by: Zolboo Erdenebaatar <[email protected]> Co-authored-by: zolboo e <[email protected]> * claude developed docs * removing a old comment * updating documentation to reflect that the centering option is removed * removing a old comment * updating documentation to reflect that the centering option is removed --------- Co-authored-by: ejiawustl <[email protected]> Co-authored-by: Zolboo Erdenebaatar <[email protected]> Co-authored-by: ezolbooe <[email protected]> Co-authored-by: zolboo e <[email protected]> Co-authored-by: 17TML <[email protected]> Co-authored-by: Zolboo Erdenebaatar <[email protected]> Co-authored-by: zolboo e <[email protected]> * stepwise modelling * Update tfbpmodeling/loop_modeling.py Co-authored-by: Chase Mateusiak <[email protected]> * Update tfbpmodeling/loop_modeling.py Co-authored-by: Chase Mateusiak <[email protected]> * Update tfbpmodeling/loop_modeling.py Co-authored-by: Chase Mateusiak <[email protected]> * with pre-commit * removing islice from bootstrap loop * removing fixtures to conftest; fixing spacing in loop module; adding pytest to CI * Set random state on bootstraps; Remove unweighted bootstrap option (#31) * intermediate * propogating bootstrappedmodelinginputdata changes to __main__ * removed unweighted bootstrap options * removing top_n as argparse option from step3 sigmoid parser * adding center_scale to argparse * setting drop_intercept to True permanently for sigmoid worker * the sigmoid parameters must have args.drop_intercept still * 
handling intercepts * fixing typo in center_scale logging * changing the way the formula is logged * removing truncation from formula logging * adding logging on random_state in bootstrappedmodelinput * removing sample weight cv log * removing sample weight cv logging * loop exits if no variable selected within the loop, fixes issue 34 (#37) * loop exits if no variable selected within the loop, fixes issue 34 * fixing linter issues * linter issues --------- Co-authored-by: Zolboo Erdenebaatar <[email protected]> Co-authored-by: zolboo e <[email protected]> * adding feature stage4_topn (#43) * adding feature stage4_topn * modifying argument parser * Align response_df with predictors from get_modeling_data to ensure consistency when top_n_masked is enabled * Align response_df with predictors from get_modeling_data to ensure consistency when top_n_masked is enabled * aligning stratified_cv_r2 * restoring changes to evaluate_interactor_significance_linear * commit after fixing error in stratification_classification * attempt to fixing inconsistent numbers of samples * fixing pr --------- Co-authored-by: chasem <[email protected]> * Remove bin by binding and Add check on number of features and increase minimal test case size (#54) * saving changes for remove_bin_by_binding * Add check on number of features and increase minimal test case size * Separate the functions and objects in lasso_modeling.py (#58) * create interface.py to separate main * add tests to verify important logging statements * move the logging configuration back into main * create interface.py to separate main * add tests to verify important logging statements * move the logging configuration back into main * fixing interface * removing bin_by_binding_only from test arguments to interface * adding name == main to main script * create interface.py to separate main * move the logging configuration back into main * removing bin_by_binding_only from test arguments to interface * rebasing refactor onto dev 
* adding calling to main * adding a feature column in test_interface * separate the functions and objects in lasso_modeling.py * Refactor main by adding interface.py (#56) * create interface.py to separate main * add tests to verify important logging statements * move the logging configuration back into main * create interface.py to separate main * add tests to verify important logging statements * move the logging configuration back into main * fixing interface * removing bin_by_binding_only from test arguments to interface * adding name == main to main script * create interface.py to separate main * move the logging configuration back into main * removing bin_by_binding_only from test arguments to interface * rebasing refactor onto dev * adding calling to main * adding a feature column in test_interface --------- Co-authored-by: chasem <[email protected]> * create interface.py to separate main move the logging configuration back into main fixing interface create interface.py to separate main move the logging configuration back into main removing bin_by_binding_only from test arguments to interface rebasing refactor onto dev create interface.py to separate main move the logging configuration back into main removing bin_by_binding_only from test arguments to interface adding name == main to main script separate the functions and objects in lasso_modeling.py * renaming loop_modeling * separate the tests out into files * separate the tests out into files * fixing evaluate_interactor_significance_lassocv --------- Co-authored-by: chasem <[email protected]> * Fix estimator comment in interface (#83) * preparing for paper release (#77) * updating the tmp/readme * updating precommit * fixing cmd line interface in __main__ * updating typing on sigmoid fit * adding logging re how CV is performed * initial topn modeling implementation. This will store the results in a separate output directory to differentiate from all data modeling * initial attempt of stage 3 of modeling. 
This seems to run without issues based on limited testing. * stepwise modelling * this separates the sigmoid step 3 into its own function; reoganizes cmd line input into reusable groups; sets the evaluate_interactor_significance estimator to LinearRegression by default * removing windows 2019 from CI * Update tfbpmodeling/loop_modeling.py Co-authored-by: Chase Mateusiak <[email protected]> * Update tfbpmodeling/loop_modeling.py Co-authored-by: Chase Mateusiak <[email protected]> * Update tfbpmodeling/loop_modeling.py Co-authored-by: Chase Mateusiak <[email protected]> * with pre-commit * removing islice from bootstrap loop * removing fixtures to conftest; fixing spacing in loop module; adding pytest to CI * only running tests on ubuntu * trying to configure codecov * debugging codecov * debugging codecov * removing codecov badge and updating the pytest badge * adding the codecov badge back in -- secret corrected in the repo * debugging codecov * still debugging codecov * Add cubic ptf and standardization (#29) * init * After installing pre-commit * add center and scaling * changing the names for center scaling * Update tfbpmodeling/__main__.py Co-authored-by: Chase Mateusiak <[email protected]> * Update tfbpmodeling/tests/test_lasso_modeling.py Co-authored-by: Chase Mateusiak <[email protected]> * Update tfbpmodeling/tests/test_lasso_modeling.py Co-authored-by: Chase Mateusiak <[email protected]> * Update tfbpmodeling/tests/test_lasso_modeling.py Co-authored-by: Chase Mateusiak <[email protected]> * Update tfbpmodeling/lasso_modeling.py Co-authored-by: Chase Mateusiak <[email protected]> * Update tfbpmodeling/lasso_modeling.py Co-authored-by: Chase Mateusiak <[email protected]> * debugging the names for scale and center * editing annotatioins --------- Co-authored-by: Chase Mateusiak <[email protected]> * Set random state on bootstraps; Remove unweighted bootstrap option (#31) * intermediate * propogating bootstrappedmodelinginputdata changes to __main__ * removed 
unweighted bootstrap options * removing top_n as argparse option from step3 sigmoid parser * adding center_scale to argparse * setting drop_intercept to True permanently for sigmoid worker * the sigmoid parameters must have args.drop_intercept still * handling intercepts * fixing typo in center_scale logging * changing the way the formula is logged * removing truncation from formula logging * adding logging on random_state in bootstrappedmodelinput * removing sample weight cv log * removing sample weight cv logging * Add function to exclude all model variables (#33) * adding function to exclude all predictors. this can be used to exclude all and then use args.add_model_variables to customize the formula * casting predictor_variables to list * fixing centering and scaling (#36) * loop exits if no variable selected within the loop, fixes issue 34 (#37) * loop exits if no variable selected within the loop, fixes issue 34 * fixing linter issues * linter issues --------- Co-authored-by: Zolboo Erdenebaatar <[email protected]> Co-authored-by: zolboo e <[email protected]> * setting evaluate_interactor_significance ci_level to the parse args argument * Tommy new stage3 (#41) * WIP * fixing imported but not used * adding argument to include stage4_lasso * changing formatting problems * adding logger infomation for stage 4 method * changing location of logger info for stage 4 method * modifying logger.info to evaluate_interactor_significance * adding f string to Writing the final interactor significance results to {output_significance_file} * fixing logging --------- Co-authored-by: Zolboo Erdenebaatar <[email protected]> Co-authored-by: chasem <[email protected]> * fixing error in stratification_classification that reversed the bin_by_binding_only param (#44) * adding feature stage4_topn (#43) * adding feature stage4_topn * modifying argument parser * Align response_df with predictors from get_modeling_data to ensure consistency when top_n_masked is enabled * Align 
response_df with predictors from get_modeling_data to ensure consistency when top_n_masked is enabled * aligning stratified_cv_r2 * restoring changes to evaluate_interactor_significance_linear * commit after fixing error in stratification_classification * attempt to fixing inconsistent numbers of samples * fixing pr --------- Co-authored-by: chasem <[email protected]> * Add row max in interactor significance (#48) * setting row max depending on model variables * setting row max depending on model variables * adding testing on log for evaluate_interactor_significance * removing ptf from all data formula by default (#51) * Remove bin by binding and Add check on number of features and increase minimal test case size (#54) * saving changes for remove_bin_by_binding * Add check on number of features and increase minimal test case size * Refactor main by adding interface.py (#56) * create interface.py to separate main * add tests to verify important logging statements * move the logging configuration back into main * create interface.py to separate main * add tests to verify important logging statements * move the logging configuration back into main * fixing interface * removing bin_by_binding_only from test arguments to interface * adding name == main to main script * create interface.py to separate main * move the logging configuration back into main * removing bin_by_binding_only from test arguments to interface * rebasing refactor onto dev * adding calling to main * adding a feature column in test_interface --------- Co-authored-by: chasem <[email protected]> * Separate the functions and objects in lasso_modeling.py (#58) * create interface.py to separate main * add tests to verify important logging statements * move the logging configuration back into main * create interface.py to separate main * add tests to verify important logging statements * move the logging configuration back into main * fixing interface * removing bin_by_binding_only from test arguments to 
interface * adding name == main to main script * create interface.py to separate main * move the logging configuration back into main * removing bin_by_binding_only from test arguments to interface * rebasing refactor onto dev * adding calling to main * adding a feature column in test_interface * separate the functions and objects in lasso_modeling.py * Refactor main by adding interface.py (#56) * create interface.py to separate main * add tests to verify important logging statements * move the logging configuration back into main * create interface.py to separate main * add tests to verify important logging statements * move the logging configuration back into main * fixing interface * removing bin_by_binding_only from test arguments to interface * adding name == main to main script * create interface.py to separate main * move the logging configuration back into main * removing bin_by_binding_only from test arguments to interface * rebasing refactor onto dev * adding calling to main * adding a feature column in test_interface --------- Co-authored-by: chasem <[email protected]> * create interface.py to separate main move the logging configuration back into main fixing interface create interface.py to separate main move the logging configuration back into main removing bin_by_binding_only from test arguments to interface rebasing refactor onto dev create interface.py to separate main move the logging configuration back into main removing bin_by_binding_only from test arguments to interface adding name == main to main script separate the functions and objects in lasso_modeling.py * renaming loop_modeling * separate the tests out into files * separate the tests out into files * fixing evaluate_interactor_significance_lassocv --------- Co-authored-by: chasem <[email protected]> * removing scale_center from interface (#60) --------- Co-authored-by: ejiawustl <[email protected]> Co-authored-by: Zolboo Erdenebaatar <[email protected]> Co-authored-by: ezolbooe <[email 
protected]> Co-authored-by: zolboo e <[email protected]> Co-authored-by: 17TML <[email protected]> Co-authored-by: Zolboo Erdenebaatar <[email protected]> Co-authored-by: zolboo e <[email protected]> * claude developed docs * removing a old comment * updating documentation to reflect that the centering option is removed * removing a old comment * updating documentation to reflect that the centering option is removed --------- Co-authored-by: ejiawustl <[email protected]> Co-authored-by: Zolboo Erdenebaatar <[email protected]> Co-authored-by: ezolbooe <[email protected]> Co-authored-by: zolboo e <[email protected]> Co-authored-by: 17TML <[email protected]> Co-authored-by: Zolboo Erdenebaatar <[email protected]> Co-authored-by: zolboo e <[email protected]> * removing a rebase error * Adding stage1 model output (#84) * stepwise modelling * with pre-commit * removing fixtures to conftest; fixing spacing in loop module; adding pytest to CI * Set random state on bootstraps; Remove unweighted bootstrap option (#31) * intermediate * propogating bootstrappedmodelinginputdata changes to __main__ * removed unweighted bootstrap options * removing top_n as argparse option from step3 sigmoid parser * adding center_scale to argparse * setting drop_intercept to True permanently for sigmoid worker * the sigmoid parameters must have args.drop_intercept still * handling intercepts * fixing typo in center_scale logging * changing the way the formula is logged * removing truncation from formula logging * adding logging on random_state in bootstrappedmodelinput * removing sample weight cv log * removing sample weight cv logging * loop exits if no variable selected within the loop, fixes issue 34 (#37) * loop exits if no variable selected within the loop, fixes issue 34 * fixing linter issues * linter issues --------- Co-authored-by: Zolboo Erdenebaatar <[email protected]> Co-authored-by: zolboo e <[email protected]> * Remove bin by binding and Add check on number of features and 
increase minimal test case size (#54) * saving changes for remove_bin_by_binding * Add check on number of features and increase minimal test case size * Separate the functions and objects in lasso_modeling.py (#58) * create interface.py to separate main * add tests to verify important logging statements * move the logging configuration back into main * create interface.py to separate main * add tests to verify important logging statements * move the logging configuration back into main * fixing interface * removing bin_by_binding_only from test arguments to interface * adding name == main to main script * create interface.py to separate main * move the logging configuration back into main * removing bin_by_binding_only from test arguments to interface * rebasing refactor onto dev * adding calling to main * adding a feature column in test_interface * separate the functions and objects in lasso_modeling.py * Refactor main by adding interface.py (#56) * create interface.py to separate main * add tests to verify important logging statements * move the logging configuration back into main * create interface.py to separate main * add tests to verify important logging statements * move the logging configuration back into main * fixing interface * removing bin_by_binding_only from test arguments to interface * adding name == main to main script * create interface.py to separate main * move the logging configuration back into main * removing bin_by_binding_only from test arguments to interface * rebasing refactor onto dev * adding calling to main * adding a feature column in test_interface --------- Co-authored-by: chasem <[email protected]> * create interface.py to separate main move the logging configuration back into main fixing interface create interface.py to separate main move the logging configuration back into main removing bin_by_binding_only from test arguments to interface rebasing refactor onto dev create interface.py to separate main move the logging 
configuration back into main removing bin_by_binding_only from test arguments to interface adding name == main to main script separate the functions and objects in lasso_modeling.py * renaming loop_modeling * separate the tests out into files * separate the tests out into files * fixing evaluate_interactor_significance_lassocv --------- Co-authored-by: chasem <[email protected]> * adding stage to output best prediction model and output documentation * removing erroneous parameter group after merge * adding output.md to mkdocs.yml * Update tfbpmodeling/interface.py Co-authored-by: Copilot <[email protected]> * Update docs/output.md Co-authored-by: Copilot <[email protected]> * Update docs/output.md Co-authored-by: Copilot <[email protected]> --------- Co-authored-by: Zolboo Erdenebaatar <[email protected]> Co-authored-by: zolboo e <[email protected]> Co-authored-by: ezolbooe <[email protected]> Co-authored-by: Zolboo Erdenebaatar <[email protected]> Co-authored-by: zolboo e <[email protected]> Co-authored-by: 17TML <[email protected]> Co-authored-by: Copilot <[email protected]> --------- Co-authored-by: ejiawustl <[email protected]> Co-authored-by: Zolboo Erdenebaatar <[email protected]> Co-authored-by: ezolbooe <[email protected]> Co-authored-by: zolboo e <[email protected]> Co-authored-by: 17TML <[email protected]> Co-authored-by: Zolboo Erdenebaatar <[email protected]> Co-authored-by: zolboo e <[email protected]> Co-authored-by: Copilot <[email protected]>

Zolboo Erdenebaatar and others added 9 commits December 1, 2025 15:37

stepwise modelling

23ce965

with pre-commit

2ed0337

removing fixtures to conftest; fixing spacing in loop module; adding pytest to CI

768e6ec

Remove bin by binding and Add check on number of features and increase minimal test case size (BrentLab#54)

c5c46c2

* saving changes for remove_bin_by_binding
* Add check on number of features and increase minimal test case size

adding stage to output best prediction model and output documentation

746a7f8

removing erroneous parameter group after merge

cb21116

cmatKhan requested a review from Copilot December 1, 2025 21:43

Copilot started reviewing on behalf of cmatKhan December 1, 2025 21:43

Copilot finished reviewing on behalf of cmatKhan December 1, 2025 21:45

Copilot AI reviewed Dec 1, 2025

docs/output.md (outdated, resolved)

tfbpmodeling/interface.py (outdated, resolved)

docs/output.md (outdated, resolved)

docs/output.md (resolved)

docs/output.md (resolved)

docs/output.md (resolved)

cmatKhan and others added 4 commits December 1, 2025 16:00

adding output.md to mkdocs.yml

a74143a

Update tfbpmodeling/interface.py

b7e0cac

Co-authored-by: Copilot <[email protected]>

Update docs/output.md

bde23c2

Co-authored-by: Copilot <[email protected]>

Update docs/output.md

380320b

Co-authored-by: Copilot <[email protected]>

cmatKhan merged commit b1e94d2 into BrentLab:dev Dec 1, 2025
7 checks passed

cmatKhan deleted the adding_stage1_model_output branch December 1, 2025 23:47

cmatKhan mentioned this pull request Dec 15, 2025

add a step after stage 1 which runs a single CV lasso on the surviving terms and outputs #82

Closed
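For readers unfamiliar with the step requested in #82, the following is a minimal sketch of "a single CV lasso on the surviving terms". It is not the code merged in this PR: it swaps in a small pure-Python coordinate-descent lasso so the example has no dependencies, and the term names (`TF1`, `TF1:TF2`, `TF1:TF3`), the alpha grid, and the synthetic data are all hypothetical.

```python
import random


def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold (the lasso proximal step)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def fit_lasso(X, y, alpha, n_iter=200):
    """Coordinate descent for (1/2n)||y - Xb||^2 + alpha*||b||_1, no intercept."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    residual = list(y)  # equals y - X @ beta while beta is all zeros
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Correlation of column j with the residual that excludes term j.
            rho = sum(X[i][j] * (residual[i] + X[i][j] * beta[j]) for i in range(n)) / n
            new_beta = soft_threshold(rho, alpha) / col_sq[j]
            delta = new_beta - beta[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                beta[j] = new_beta
    return beta


def mse(X, y, beta):
    """Mean squared prediction error of beta on (X, y)."""
    return sum((yi - sum(x * b for x, b in zip(row, beta))) ** 2
               for row, yi in zip(X, y)) / len(y)


def lasso_cv(X, y, alphas, n_folds=4):
    """Pick alpha by k-fold CV, then refit once on all rows."""
    n = len(X)
    folds = [list(range(k, n, n_folds)) for k in range(n_folds)]
    best_alpha, best_err = None, float("inf")
    for alpha in alphas:
        err = 0.0
        for held_out in folds:
            held = set(held_out)
            train = [i for i in range(n) if i not in held]
            beta = fit_lasso([X[i] for i in train], [y[i] for i in train], alpha)
            err += mse([X[i] for i in held_out], [y[i] for i in held_out], beta)
        err /= n_folds
        if err < best_err:
            best_alpha, best_err = alpha, err
    return best_alpha, fit_lasso(X, y, best_alpha)


# Hypothetical "surviving terms" from stage 1, with synthetic data.
rng = random.Random(0)
terms = ["TF1", "TF1:TF2", "TF1:TF3"]
X = [[rng.gauss(0, 1) for _ in terms] for _ in range(80)]
y = [2.0 * row[0] - 1.5 * row[1] + rng.gauss(0, 0.1) for row in X]

alphas = [0.001, 0.01, 0.1, 1.0]
best_alpha, coefs = lasso_cv(X, y, alphas)
model = dict(zip(terms, coefs))  # term -> fitted coefficient
```

The `model` mapping stands in for whatever the real stage writes out; the actual output files and their fields are described in docs/output.md, not here.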

codecov bot commented Dec 1, 2025 (edited)
