The OAQA Biomedical Question Answering (BioASQ) System aims to identify relevant documents, concepts and passages (snippets) and automatically generate exact answer texts for arbitrary biomedical questions (factoid, list, yes/no). It was the best-performing system in the factoid and list categories of the BioASQ QA Challenges for two consecutive years, 2015 and 2016 (see official results).
The system description papers give the most detail on the design and implementation of the architecture and the algorithms:
- Zi Yang, Niloy Gupta, Xiangyu Sun, Di Xu, Chi Zhang, and Eric Nyberg. Learning to Answer Biomedical Factoid & List Questions: OAQA at BioASQ 3B. In Proceedings of CLEF 2015 Evaluation Labs and Workshop, 2015. [pdf]
- Zi Yang, Yue Zhou, and Eric Nyberg. Learning to Answer Biomedical Questions: OAQA at BioASQ 4B. In Proceedings of Workshop on Biomedical Natural Language Processing, 2016. [pdf]
Please contact Zi Yang if you have any questions or comments.
This system uses the ECD/CSE framework (an extension to the Apache UIMA framework that supports formal, declarative YAML-based descriptors for the space of system and component configurations to be explored during system optimization), the BaseQA type system, and various natural language processing and information retrieval algorithms and tools.
The system employs a three-layered design for both the Java source code and the YAML descriptors:
| Layer | Description |
|---|---|
| `baseqa` | Domain independent QA components, and the basic input/output definition of a QA pipeline, intermediate data objects, QA evaluation components, and data processing components. [source] [descriptor] |
| `bioqa` | Biomedical resources that can be used in any biomedical QA task (outside the context of BioASQ). [source] [descriptor] |
| `bioasq` | BioASQ-specific components, e.g. GoPubMed services. [source] [descriptor] |
Each layer contains packages for each processing step, e.g. preprocess, question analysis, abstract query generation, document retrieval and reranking, concept retrieval and reranking, passage retrieval, answer type prediction, evidence gathering, answer generation and ranking. Please refer to the architecture diagrams in the system description papers.
We define the following workflow descriptors (i.e. entry points) under bioasq for preprocessing, training, evaluating, and testing Phase A (retrieval tasks) and Phase B (factoid, list and yes/no answer generation).
| Descriptor | Description |
|---|---|
| `preprocess-kb-cache` | Cache the requests and responses of concept and concept search services |
| `preprocess-answer-type-gslabel` | Label gold-standard answer types |
| `phase-a-train-concept-document` | Train document and concept reranking models |
| `phase-a-train-snippet` | Train snippet reranking models |
| `phase-a-evaluate`, `phase-a-test` | Evaluate (using development subset) and test (using test set) retrieval performance |
| `phase-b-train-answer-type` | Train answer type prediction model for factoid and list questions |
| `phase-b-train-answer-score` | Train answer scoring model for factoid and list questions |
| `phase-b-train-answer-collective-score` | Train answer collective scoring model for list questions |
| `phase-b-train-yesno` | Train yes/no prediction model |
| `phase-b-evaluate-factoid-list`, `phase-b-test-factoid-list` | Evaluate (using development subset) and test (using test set) factoid and list QA |
| `phase-b-evaluate-yesno`, `phase-b-test-yesno` | Evaluate (using development subset) and test (using test set) yes/no QA |
A workflow descriptor is executed by the ECDDriver, which is configured as the main class of the Maven exec goal. It can therefore be run from the command line, with the `config` property set to the dotted path of the descriptor (path.to.the.descriptor).
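As a rough illustration of this command-line convention, the following Python 3 sketch assembles the Maven invocation for a given workflow descriptor. The helper name `maven_command` and its `recompile` flag are hypothetical; only the `mvn exec:exec -Dconfig=...` form and the `bioasq.` descriptor prefix are taken from the commands used elsewhere in this document.

```python
def maven_command(descriptor, recompile=False):
    """Return the argument list that runs one workflow descriptor via Maven.

    descriptor: dotted descriptor path, e.g. "bioasq.phase-a-test".
    recompile:  hypothetical convenience flag that prepends "clean compile",
                as done in the training commands of this document.
    """
    if not descriptor or " " in descriptor:
        raise ValueError("descriptor must be a non-empty dotted path")
    goals = ["clean", "compile"] if recompile else []
    return ["mvn"] + goals + ["exec:exec", "-Dconfig=" + descriptor]


# Only print the command here; hand the list to subprocess.call() to run it
# once Maven and the project have been set up.
print(" ".join(maven_command("bioasq.phase-a-test")))
```

Returning a list rather than a single string avoids shell quoting issues if the command is later passed to `subprocess`.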
The system also depends on other types of resources, including dictionaries, pretrained machine learning models, and service related properties.
- Update Lucene from version 5.5.1 to 6.2.1, which changes the default similarity.
- Update skr-webapi from version 0.0.4 to 0.0.6, due to an upstream API update to version 2.3.
- Update uts-api from version 0.0.2 to 0.0.3, due to an upstream API update.
- Update the TmTool URL to HTTPS (https://www.ncbi.nlm.nih.gov/CBBresearch/Lu/Demo/tmTools/#RESTfulAPIs).
- Bug fixes, including stability improvements that avoid ConcurrentModificationException in LuceneDocumentScorer and ShapeDistanceCollectiveAnswerScorer, a fix for a possible DuplicateKey in LuceneInMemorySentenceRetrievalExecutor, and retrying when the UTS service fails to obtain a service ticket.
This system needs to access external structured and unstructured resources for question answering and files for evaluating the system. Due to licensing issues, you may have to obtain these resources or credentials on your own. If you are a CMU OAQA person, please read the internal resource preparation instruction instead.
- Pre-prerequisites. Java 8, Maven 3, Python 2.
- (Recommended) UMLS license/account. The system needs to access the online UMLS services (UTS and MetaMap), which require a UMLS license/account (username, password, email). You can request them from https://uts.nlm.nih.gov//license.html. Otherwise, you need to remove all the `*-uts-*` and `*-metamap-*` steps from the descriptors, which will substantially hurt performance.

  If you want to increase the system's throughput, consider downloading and installing local instances of the UMLS and MetaMap services. Currently, only the Web services are integrated.
- (Recommended) Medline corpus and Lucene index. The system can use a local Medline index or the GoPubMed Web API to search PubMed. However, we recommend a local index because the reranking component may send up to hundreds of search requests per question, so processing a single question through a Web API can be extremely slow.
  - Download `.xml.gz` or `.xml` files from https://www.nlm.nih.gov/databases/download/pubmed_medline.html.
  - (Optional) Check out the `medline-indexer` project.
  - Create a Lucene index using the `StandardAnalyzer`. The index should contain three mandatory fields: `pmid`, `abstractText`, and `articleTitle`. We include example Java code, `MedlineCitationIndexer.java`, that indexes `.xml.gz` or `.xml` files inside a directory.
  - Create a sqlite database that has a `pmid2abstract` table with two fields, `pmid` and `abstract`, which is used to fix the section label errors in the provided development set. We include example Java code, `MedlineAbstractStoreBuilder.java`, that builds the sqlite file.
- Biomedical ontology dumps and Lucene index. You can skip this step if you don't need relevant concept retrieval, but please also remove the `concept-retrieval` and `concept-rerank` steps from the descriptors if you do so. If you prefer using a local biomedical ontology index (recommended) to the official GoPubMed services, you need to obtain the ontology dumps and create a Lucene index.
  - Download the ontology dumps from all the sources according to the official resources guideline document.
  - (Optional) Check out the `biomedical-concept-indexer` project.
  - Create a Lucene index. The index should contain four mandatory fields: `id`, `name`, `definition`, and `source`. Ontologies from different sources need to be adapted to the same schema, with `source` and `id` identifying the concept in its original ontology. The `definition` and `name` fields are intended to be used for retrieval. We include example Java code, `BiomedicalConceptIndexer.java`, that indexes multiple ontologies.
- BioASQ development and test files. You will need the test files for `*-test-*` workflows and the development files for `*-evaluate-*` and `*-train-*` workflows. However, the official development file has various errors. We created a Python script, `bioasq-dev-fixer.py`, to fix them; its fixes include `update_year`, `fix_go_url`, `normalize_yesno_answer`, `listify_ideal_answer`, `listify_exact_answer`, `split_parenthesis_answer`, `fix_section_label`, etc.
  - Obtain the test set and development set (containing the gold-standard answers) from the BioASQ website.
  - Install the Python `editdistance` package.
  - Use the provided script to fix the formatting errors in the development file: `python bioasq-dev-fixer.py path_to_orig_4b_dev_set path_to_pmid2abstract_db 4b-dev.json.auto.fulltext`
  - The resulting file should have an MD5 checksum of `8751b3a962eafb5c2aa8f09d5998fcd4`.
- (Optional) PubMed Central corpus and document service. Since the PubMed Central full text is no longer used in the evaluation as of BioASQ 2016, it is not integrated into the predefined workflow descriptors. If you plan to use the PubMed Central corpus for passage retrieval (see below), you also need to download the PMC corpus and set up a document server.
  - Download the PMC open access subset: https://www.ncbi.nlm.nih.gov/pmc/tools/ftp/
  - Use the BioASQTasks.jar (provided in the official preparation package prior to 2015) to convert the XML files to a single JSON array file: `java -jar BioASQTasks.jar`
  - Create a directory `pmc` and split the JSON array file into individual JSON files, each containing a single document and named by its PMID, and put them into the `pmc` directory.
  - Set up an HTTP document server whose resource root is the directory that contains the `pmc` directory. Make sure you can access each document using the URL `http://HOST:PORT/pmc/DOC_ID`.
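To make the `pmid2abstract` store from the Medline step more concrete, here is a minimal Python 3 sketch of a table with the two required fields. It is not the bundled `MedlineAbstractStoreBuilder.java`; the column types, the primary key, and the function names are assumptions made for illustration.

```python
import sqlite3


def build_abstract_store(conn, records):
    """Create the pmid2abstract table and insert (pmid, abstract) pairs.

    The TEXT column types and the primary key on pmid are assumptions;
    the document only specifies the table name and the two field names.
    """
    with conn:  # commit on success, roll back on error
        conn.execute(
            "CREATE TABLE IF NOT EXISTS pmid2abstract ("
            "pmid TEXT PRIMARY KEY, abstract TEXT)"
        )
        conn.executemany(
            "INSERT OR REPLACE INTO pmid2abstract (pmid, abstract) VALUES (?, ?)",
            records,
        )


def lookup_abstract(conn, pmid):
    """Return the stored abstract for a PMID, or None if it is absent."""
    row = conn.execute(
        "SELECT abstract FROM pmid2abstract WHERE pmid = ?", (pmid,)
    ).fetchone()
    return row[0] if row else None
```

For a real store, open the connection with `sqlite3.connect("path/to/file.db")` and feed it the PMID/abstract pairs extracted from the downloaded Medline files.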
- Clone the project into a local directory.
- Put the test json files into the `input` directory, and rename them to `dryrun-a.json`, `dryrun-b.json`, `1b-1-a.json`, ..., `4b-5-b.json`. (Read the `collection-reader.file` parameter value in each descriptor to understand what the system will look for.) If you use a customized input directory and/or json file names, please change the `collection-reader.file` parameter in the workflow descriptor.
- Create the `result` directory under the project folder, which is used for the system's final output. If you use a customized output directory, you can change the descriptors accordingly.
- Create the `persistence` directory and download the `oaqa-cse.db3` file into the `persistence` folder. As this project uses the CSE framework, the sqlite database persists the experiment metadata, the intermediate data objects (optionally) and the evaluation results for debugging and reporting purposes. If you use a customized persistence directory and/or file name, you can create your own `persistence-provider` descriptor and update the `persistence-provider` parameters wherever used.
- Create the `concept-search-cache`, `metamap-cache`, `synonym-cache`, and `tmtool-cache` directories under the `src/main/resources/` directory. If you don't need the cache, you can replace the `*-cached` descriptors with the non-cached versions (direct access). If you use customized cache directories, you need to update the `db-file` parameter in the `*-cached` descriptors, including
  - `resources/bioqa/providers/kb/concept-search-uts-cached.yaml.template`
  - `resources/bioqa/providers/kb/metamap-cached.yaml.template`
  - `resources/bioqa/providers/kb/synonym-uts-cached.yaml.template`
  - `resources/bioqa/providers/kb/tmtool-cached.yaml.template`
(Checkpoint) At this point, the project structure should look like this unless you have customized it.

    |-- bioasq/
    |   |-- input/
    |   |   |-- 1b-1-a.json
    |   |   |-- .
    |   |   |-- .
    |   |   |-- .
    |   |   |-- 4b-5-b.json
    |   |   |-- 4b-dev.json.auto.fulltext
    |   |   |-- dryrun-a.json
    |   |   |-- dryrun-b.json
    |   |   |-- one-question.json
    |   |-- persistence/
    |   |   |-- oaqa-cse.db3
    |   |-- result/
    |   |-- src/
    |   |   |-- main/
    |   |   |   |-- java/
    |   |   |   |-- resources/
    |   |   |   |   |-- baseqa/
    |   |   |   |   |-- bioasq/
    |   |   |   |   |-- bioqa/
    |   |   |   |   |-- concept-search-cache/
    |   |   |   |   |-- dictionaries/
    |   |   |   |   |-- metamap-cache/
    |   |   |   |   |-- models/
    |   |   |   |   |-- synonym-cache/
    |   |   |   |   |-- tmtool-cache/
    |   |   |   |-- script/

- Update the `index` parameter in the `lucene-bioconcept` descriptors with the path to the Lucene index for the biomedical ontology. Also, you need to change other parameters if you use customized field names. Remove the `.template` suffix from the file names.
- Update the `index` parameter in the `lucene-medline` descriptors with the path to the Lucene Medline index. Also, you need to change other parameters if you use customized field names. Remove the `.template` suffix from the file names.
- Update the `version`, `username`, `password`, and `email` parameters in the `uts` and `metamap` related providers, and remove the `.template` suffix from the file names, including
  - `resources/bioqa/providers/kb/concept-search-uts.yaml.template`
  - `resources/bioqa/providers/kb/concept-search-uts-cached.yaml.template`
  - `resources/bioqa/providers/kb/metamap.yaml.template`
  - `resources/bioqa/providers/kb/metamap-cached.yaml.template`
  - `resources/bioqa/providers/kb/synonym-uts.yaml.template`
  - `resources/bioqa/providers/kb/synonym-uts-cached.yaml.template`

  Note that the `version` parameter takes a string value, which means you have to add single or double quotes around the MetaMap version number, e.g. `'1516'`. Otherwise, YAML would interpret `1516` as an integer.
- Install the dependencies and compile the resources via Maven: `mvn clean compile`

  When you see `BUILD SUCCESS`, the installation is done.
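The directory preparation in the steps above can be scripted. The following Python 3 sketch creates the default directories from the checkpoint layout and lists descriptor files that still carry the `.template` suffix. The function names are hypothetical, and the sketch assumes the default (non-customized) layout; it does not download `oaqa-cse.db3` or edit any descriptor.

```python
from pathlib import Path

# Default directories taken from the checkpoint layout; adjust if customized.
REQUIRED_DIRS = [
    "input",
    "result",
    "persistence",
    "src/main/resources/concept-search-cache",
    "src/main/resources/metamap-cache",
    "src/main/resources/synonym-cache",
    "src/main/resources/tmtool-cache",
]


def prepare_layout(project_root):
    """Create any missing directories and return the ones that were created."""
    root = Path(project_root)
    created = []
    for rel in REQUIRED_DIRS:
        path = root / rel
        if not path.is_dir():
            path.mkdir(parents=True)
            created.append(rel)
    return created


def pending_templates(project_root):
    """List descriptors under src/main/resources still named *.yaml.template."""
    resources = Path(project_root) / "src" / "main" / "resources"
    return sorted(
        p.relative_to(resources).as_posix()
        for p in resources.rglob("*.yaml.template")
    )
```

Calling `prepare_layout(".")` from the project folder is idempotent, and an empty result from `pending_templates(".")` suggests that every template has been renamed.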
- (Optional, Recommended) Execute the `preprocess-kb-cache` workflow if you haven't done so yet: `mvn exec:exec -Dconfig=bioasq.preprocess-kb-cache`

  At the end of the execution, you should see mapdb files generated in the `*-cache` directories. This step could be extremely slow (> 10 hours) depending on the workload on the UTS/MetaMap servers.
- Execute any `*-test-*` workflow descriptor to test the pipeline:
  - `mvn exec:exec -Dconfig=bioasq.phase-a-test`
  - `mvn exec:exec -Dconfig=bioasq.phase-b-test-factoid-list`
  - `mvn exec:exec -Dconfig=bioasq.phase-b-test-yesno`
- You should see the output in the `result` directory at the end of each execution.
The common evaluation metrics are defined in the BaseQA project's eval package. The system extends the evaluation metrics for the BioASQ task in the eval package. All the *-evaluate-* descriptors add additional post-processing steps to generate the evaluation results automatically.
- Put the `4b-dev.json.auto.fulltext` file under the `input` directory if you haven't done so yet. If you use a customized directory and/or file name, you need to change the `resources/bioasq/gs/bioasq-qa-decorator.yaml` descriptor content accordingly.
- (Optional, Recommended) Execute the `preprocess-kb-cache` workflow if you haven't done so yet. At the end of the execution, you should see mapdb files generated in the `*-cache` directories.
- Execute any `*-evaluate-*` workflow descriptor to evaluate the pipeline:
  - `mvn exec:exec -Dconfig=bioasq.phase-a-evaluate`
  - `mvn exec:exec -Dconfig=bioasq.phase-b-evaluate-factoid-list`
  - `mvn exec:exec -Dconfig=bioasq.phase-b-evaluate-yesno`
- You should see the evaluation results in the console at the end of the execution.
For example,
Experiment: 8f5876cc-7dcf-41c2-9da3-7fe841ae92d9:1 traceId,Answer/Answer/YESNO_COUNT,Answer/Answer/YESNO_MEAN_ACCURACY,Answer/Answer/YESNO_MEAN_NEG_ACCURACY,Answer/Answer/YESNO_MEAN_POS_ACCURACY 1|QuestionParser[inherit:bioqa.question.parse.clearnlp-bioinformatics#parser-provider:inherit: bioqa.providers.parser.clearnlp-bioinformatics]>2|QuestionConceptRecognizer[inherit:bioqa.question.concept.metamap-cached#concept-provider:inherit: bioqa.providers.kb.metamap-cached]>3|QuestionConceptRecognizer[inherit:bioqa.question.concept.tmtool-cached#concept-provider:inherit: bioqa.providers.kb.tmtool-cached]>4|QuestionConceptRecognizer[inherit:bioqa.question.concept.lingpipe-genia#concept-provider:inherit: bioqa.providers.kb.lingpipe-genia]>5|PassageToViewCopier[inherit:baseqa.evidence.passage-to-view#view-name-prefix:ptv]>6|PassageParser[inherit:bioqa.evidence.parse.clearnlp-bioinformatics#parser-provider:inherit: bioqa.providers.parser.clearnlp-bioinformatics#view-name-prefix:ptv]>7|PassageConceptRecognizer[inherit:bioqa.evidence.concept.metamap-cached#allowed-concept-types:/dictionaries/allowed-umls-types.txt#concept-provider:inherit: bioqa.providers.kb.metamap-cached#view-name-prefix:ptv]>8|PassageConceptRecognizer[inherit:bioqa.evidence.concept.tmtool-cached#concept-provider:inherit: bioqa.providers.kb.tmtool-cached#view-name-prefix:ptv]>9|PassageConceptRecognizer[inherit:bioqa.evidence.concept.lingpipe-genia#concept-provider:inherit: bioqa.providers.kb.lingpipe-genia#view-name-prefix:ptv]>10|PassageConceptRecognizer[inherit:baseqa.evidence.concept.frequent-phrase#concept-provider:inherit: baseqa.providers.kb.frequent-phrase#view-name-prefix:ptv]>11|ConceptSearcher[inherit:bioqa.evidence.concept.search-uts-cached#concept-search-provider:inherit: bioqa.providers.kb.concept-search-uts-cached#synonym-expansion-provider:inherit: 
bioqa.providers.kb.synonym-uts-cached]>12|ConceptMerger[inherit:baseqa.evidence.concept.merge#include-default-view:true#view-name-prefix:ptv#use-name:true]>13|YesNoAnswerPredictor[inherit:bioqa.answer.yesno.liblinear-predict#classifier:inherit: bioqa.answer.yesno.liblinear#feature-file:result/answer-yesno-predict-liblinear.tsv#scorers:- inherit: baseqa.answer.yesno.scorers.concept-overlap - inherit: bioqa.answer.yesno.scorers.token-overlap - inherit: baseqa.answer.yesno.scorers.expected-answer-overlap - inherit: baseqa.answer.yesno.scorers.sentiment - inherit: baseqa.answer.yesno.scorers.negation - inherit: bioqa.answer.yesno.scorers.alternate-answer ],28.0000,0.5714,0.3333,0.6842 1|QuestionParser[inherit:bioqa.question.parse.clearnlp-bioinformatics#parser-provider:inherit: bioqa.providers.parser.clearnlp-bioinformatics]>2|QuestionConceptRecognizer[inherit:bioqa.question.concept.metamap-cached#concept-provider:inherit: bioqa.providers.kb.metamap-cached]>3|QuestionConceptRecognizer[inherit:bioqa.question.concept.tmtool-cached#concept-provider:inherit: bioqa.providers.kb.tmtool-cached]>4|QuestionConceptRecognizer[inherit:bioqa.question.concept.lingpipe-genia#concept-provider:inherit: bioqa.providers.kb.lingpipe-genia]>5|PassageToViewCopier[inherit:baseqa.evidence.passage-to-view#view-name-prefix:ptv]>6|PassageParser[inherit:bioqa.evidence.parse.clearnlp-bioinformatics#parser-provider:inherit: bioqa.providers.parser.clearnlp-bioinformatics#view-name-prefix:ptv]>7|PassageConceptRecognizer[inherit:bioqa.evidence.concept.metamap-cached#allowed-concept-types:/dictionaries/allowed-umls-types.txt#concept-provider:inherit: bioqa.providers.kb.metamap-cached#view-name-prefix:ptv]>8|PassageConceptRecognizer[inherit:bioqa.evidence.concept.tmtool-cached#concept-provider:inherit: bioqa.providers.kb.tmtool-cached#view-name-prefix:ptv]>9|PassageConceptRecognizer[inherit:bioqa.evidence.concept.lingpipe-genia#concept-provider:inherit: 
bioqa.providers.kb.lingpipe-genia#view-name-prefix:ptv]>10|PassageConceptRecognizer[inherit:baseqa.evidence.concept.frequent-phrase#concept-provider:inherit: baseqa.providers.kb.frequent-phrase#view-name-prefix:ptv]>11|ConceptSearcher[inherit:bioqa.evidence.concept.search-uts-cached#concept-search-provider:inherit: bioqa.providers.kb.concept-search-uts-cached#synonym-expansion-provider:inherit: bioqa.providers.kb.synonym-uts-cached]>12|ConceptMerger[inherit:baseqa.evidence.concept.merge#include-default-view:true#view-name-prefix:ptv#use-name:true]>13|AllYesYesNoAnswerPredictor[inherit:baseqa.answer.yesno.all-yes],28.0000,0.6786,0.0000,1.0000 1|QuestionParser[inherit:bioqa.question.parse.clearnlp-bioinformatics#parser-provider:inherit: bioqa.providers.parser.clearnlp-bioinformatics]>2|QuestionConceptRecognizer[inherit:bioqa.question.concept.metamap-cached#concept-provider:inherit: bioqa.providers.kb.metamap-cached]>3|QuestionConceptRecognizer[inherit:bioqa.question.concept.tmtool-cached#concept-provider:inherit: bioqa.providers.kb.tmtool-cached]>4|QuestionConceptRecognizer[inherit:bioqa.question.concept.lingpipe-genia#concept-provider:inherit: bioqa.providers.kb.lingpipe-genia]>5|PassageToViewCopier[inherit:baseqa.evidence.passage-to-view#view-name-prefix:ptv]>6|PassageParser[inherit:bioqa.evidence.parse.clearnlp-bioinformatics#parser-provider:inherit: bioqa.providers.parser.clearnlp-bioinformatics#view-name-prefix:ptv]>7|PassageConceptRecognizer[inherit:bioqa.evidence.concept.metamap-cached#allowed-concept-types:/dictionaries/allowed-umls-types.txt#concept-provider:inherit: bioqa.providers.kb.metamap-cached#view-name-prefix:ptv]>8|PassageConceptRecognizer[inherit:bioqa.evidence.concept.tmtool-cached#concept-provider:inherit: bioqa.providers.kb.tmtool-cached#view-name-prefix:ptv]>9|PassageConceptRecognizer[inherit:bioqa.evidence.concept.lingpipe-genia#concept-provider:inherit: 
bioqa.providers.kb.lingpipe-genia#view-name-prefix:ptv]>10|PassageConceptRecognizer[inherit:baseqa.evidence.concept.frequent-phrase#concept-provider:inherit: baseqa.providers.kb.frequent-phrase#view-name-prefix:ptv]>11|ConceptSearcher[inherit:bioqa.evidence.concept.search-uts-cached#concept-search-provider:inherit: bioqa.providers.kb.concept-search-uts-cached#synonym-expansion-provider:inherit: bioqa.providers.kb.synonym-uts-cached]>12|ConceptMerger[inherit:baseqa.evidence.concept.merge#include-default-view:true#view-name-prefix:ptv#use-name:true]>13|YesNoAnswerPredictor[inherit:bioqa.answer.yesno.weka-logistic-predict#classifier:inherit: bioqa.answer.yesno.weka-logistic#feature-file:result/answer-yesno-predict-weka-logistic.tsv#scorers:- inherit: baseqa.answer.yesno.scorers.concept-overlap - inherit: bioqa.answer.yesno.scorers.token-overlap - inherit: baseqa.answer.yesno.scorers.expected-answer-overlap - inherit: baseqa.answer.yesno.scorers.sentiment - inherit: baseqa.answer.yesno.scorers.negation - inherit: bioqa.answer.yesno.scorers.alternate-answer ],28.0000,0.6429,0.2222,0.8421 1|QuestionParser[inherit:bioqa.question.parse.clearnlp-bioinformatics#parser-provider:inherit: bioqa.providers.parser.clearnlp-bioinformatics]>2|QuestionConceptRecognizer[inherit:bioqa.question.concept.metamap-cached#concept-provider:inherit: bioqa.providers.kb.metamap-cached]>3|QuestionConceptRecognizer[inherit:bioqa.question.concept.tmtool-cached#concept-provider:inherit: bioqa.providers.kb.tmtool-cached]>4|QuestionConceptRecognizer[inherit:bioqa.question.concept.lingpipe-genia#concept-provider:inherit: bioqa.providers.kb.lingpipe-genia]>5|PassageToViewCopier[inherit:baseqa.evidence.passage-to-view#view-name-prefix:ptv]>6|PassageParser[inherit:bioqa.evidence.parse.clearnlp-bioinformatics#parser-provider:inherit: 
bioqa.providers.parser.clearnlp-bioinformatics#view-name-prefix:ptv]>7|PassageConceptRecognizer[inherit:bioqa.evidence.concept.metamap-cached#allowed-concept-types:/dictionaries/allowed-umls-types.txt#concept-provider:inherit: bioqa.providers.kb.metamap-cached#view-name-prefix:ptv]>8|PassageConceptRecognizer[inherit:bioqa.evidence.concept.tmtool-cached#concept-provider:inherit: bioqa.providers.kb.tmtool-cached#view-name-prefix:ptv]>9|PassageConceptRecognizer[inherit:bioqa.evidence.concept.lingpipe-genia#concept-provider:inherit: bioqa.providers.kb.lingpipe-genia#view-name-prefix:ptv]>10|PassageConceptRecognizer[inherit:baseqa.evidence.concept.frequent-phrase#concept-provider:inherit: baseqa.providers.kb.frequent-phrase#view-name-prefix:ptv]>11|ConceptSearcher[inherit:bioqa.evidence.concept.search-uts-cached#concept-search-provider:inherit: bioqa.providers.kb.concept-search-uts-cached#synonym-expansion-provider:inherit: bioqa.providers.kb.synonym-uts-cached]>12|ConceptMerger[inherit:baseqa.evidence.concept.merge#include-default-view:true#view-name-prefix:ptv#use-name:true]>13|YesNoAnswerPredictor[inherit:bioqa.answer.yesno.weka-cvr-predict#classifier:inherit: bioqa.answer.yesno.weka-cvr#feature-file:result/answer-yesno-predict-weka-cvr.tsv#scorers:- inherit: baseqa.answer.yesno.scorers.concept-overlap - inherit: bioqa.answer.yesno.scorers.token-overlap - inherit: baseqa.answer.yesno.scorers.expected-answer-overlap - inherit: baseqa.answer.yesno.scorers.sentiment - inherit: baseqa.answer.yesno.scorers.negation - inherit: bioqa.answer.yesno.scorers.alternate-answer ],28.0000,0.6071,0.6667,0.5789

For better visualization, you can split the lines into cells using the comma separators, like this:

| traceId | COUNT | ACCURACY | NEG_ACCURACY | POS_ACCURACY |
|---|---|---|---|---|
| `...>13 YesNoAnswerPredictor[inherit:bioqa.answer.yesno.liblinear-predict#classifier:inherit: bioqa.answer.yesno.liblinear#feature-file:result/answer-yesno-predict-liblinear.tsv#scorers:- inherit: baseqa.answer.yesno.scorers.concept-overlap - inherit: bioqa.answer.yesno.scorers.token-overlap - inherit: baseqa.answer.yesno.scorers.expected-answer-overlap - inherit: baseqa.answer.yesno.scorers.sentiment - inherit: baseqa.answer.yesno.scorers.negation - inherit: bioqa.answer.yesno.scorers.alternate-answer ]` | 28.0000 | 0.5714 | 0.3333 | 0.6842 |
| `...>13 AllYesYesNoAnswerPredictor[inherit:baseqa.answer.yesno.all-yes]` | 28.0000 | 0.6786 | 0.0000 | 1.0000 |
| `...>13 YesNoAnswerPredictor[inherit:bioqa.answer.yesno.weka-logistic-predict#classifier:inherit: bioqa.answer.yesno.weka-logistic#feature-file:result/answer-yesno-predict-weka-logistic.tsv#scorers:- inherit: baseqa.answer.yesno.scorers.concept-overlap - inherit: bioqa.answer.yesno.scorers.token-overlap - inherit: baseqa.answer.yesno.scorers.expected-answer-overlap - inherit: baseqa.answer.yesno.scorers.sentiment - inherit: baseqa.answer.yesno.scorers.negation - inherit: bioqa.answer.yesno.scorers.alternate-answer ]` | 28.0000 | 0.6429 | 0.2222 | 0.8421 |
| `...>13 YesNoAnswerPredictor[inherit:bioqa.answer.yesno.weka-cvr-predict#classifier:inherit: bioqa.answer.yesno.weka-cvr#feature-file:result/answer-yesno-predict-weka-cvr.tsv#scorers:- inherit: baseqa.answer.yesno.scorers.concept-overlap - inherit: bioqa.answer.yesno.scorers.token-overlap - inherit: baseqa.answer.yesno.scorers.expected-answer-overlap - inherit: baseqa.answer.yesno.scorers.sentiment - inherit: baseqa.answer.yesno.scorers.negation - inherit: bioqa.answer.yesno.scorers.alternate-answer ]` | 28.0000 | 0.5714 | 0.6667 | 0.5789 |
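To help read the yes/no columns, the sketch below shows one plausible way such numbers arise. It assumes that ACCURACY is the overall fraction of correct answers, and that NEG_ACCURACY and POS_ACCURACY are the accuracies restricted to questions whose gold answer is "no" and "yes" respectively. This reading is an assumption and not taken from the BaseQA `eval` package. The gold split of 19 "yes" and 9 "no" questions is likewise hypothetical, chosen because it is consistent with the all-yes baseline row in the example output.

```python
def yesno_accuracies(gold, predicted):
    """Return (accuracy, neg_accuracy, pos_accuracy) for yes/no answers."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")

    def accuracy(label=None):
        # Keep all pairs, or only those whose gold answer equals `label`.
        pairs = [(g, p) for g, p in zip(gold, predicted)
                 if label is None or g == label]
        if not pairs:
            return float("nan")
        return sum(g == p for g, p in pairs) / len(pairs)

    return accuracy(), accuracy("no"), accuracy("yes")


# Hypothetical gold labels: 19 "yes" and 9 "no" questions (28 in total).
gold = ["yes"] * 19 + ["no"] * 9
all_yes = ["yes"] * 28  # the trivial baseline that always answers "yes"
print(["%.4f" % value for value in yesno_accuracies(gold, all_yes)])
# prints ['0.6786', '0.0000', '1.0000']
```

Under this reading, a predictor is only useful if it improves the negative accuracy without giving up too much of the positive accuracy that the all-yes baseline gets for free.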
The system includes models pretrained with the predefined `*-train-*` descriptors (i.e. on the 4b-dev set minus the 3b-5 test set). If you plan to retrain the models, you can follow these steps. Please be aware that the models are saved under `resources/models` and loaded directly from the classpath. You should therefore recompile the project with `mvn clean compile` between training runs, which copies the newly generated models into the `target` directory so that the next training step can use the models from the previous one.
- Put the `4b-dev.json.auto.fulltext` file under the `input` directory if you haven't done so. If you use a customized directory and/or gold-standard file, you need to change the `resources/bioasq/gs/bioasq-qa-decorator.yaml` descriptor content accordingly.
- (Optional, Recommended) Execute the `preprocess-kb-cache` workflow if you haven't done so yet. At the end of the execution, you should see mapdb files generated in the `*-cache` directories.
- Execute the `preprocess-answer-type-gslabel` workflow if you haven't done so yet: `mvn clean compile exec:exec -Dconfig=bioasq.preprocess-answer-type-gslabel`

  At the end of the execution, you should see the `4b-dev-gslabel-tmtool.json` and `4b-dev-gslabel-uts.json` files generated in the `resources/models/bioqa/answer_type` directory. This step could take about 30 minutes.
- Training Phase A requires execution of `phase-a-train-concept-document` before `phase-a-train-snippet`:
  - `mvn clean compile exec:exec -Dconfig=bioasq.phase-a-train-concept-document`
  - `mvn clean compile exec:exec -Dconfig=bioasq.phase-a-train-snippet`

  Executing `phase-a-train-concept-document` could take 3-4 hours, and executing `phase-a-train-snippet` could take 80 minutes.

  Training Phase B factoid and list QA requires execution of `phase-b-train-answer-type` first, then `phase-b-train-answer-score`, and finally `phase-b-train-answer-collective-score`:
  - `mvn clean compile exec:exec -Dconfig=bioasq.phase-b-train-answer-type`
  - `mvn clean compile exec:exec -Dconfig=bioasq.phase-b-train-answer-score`
  - `mvn clean compile exec:exec -Dconfig=bioasq.phase-b-train-answer-collective-score`

  Executing `phase-b-train-answer-type` or `phase-b-train-answer-score` could take 30 minutes each. Executing `phase-b-train-answer-collective-score` could take 10 minutes.

  Training Phase B yes/no QA requires execution of `phase-b-train-yesno`: `mvn clean compile exec:exec -Dconfig=bioasq.phase-b-train-yesno`

  Executing `phase-b-train-yesno` could take about 10 minutes.
- You should see cross-validation results at the end of each training.
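Because the order of the training workflows matters, it can help to encode it once. The Python 3 sketch below records the required order per group and runs the corresponding Maven commands one after another, stopping at the first failure. The group names, `TRAINING_ORDER`, and both functions are hypothetical helpers; the descriptor names and the `mvn clean compile exec:exec -Dconfig=...` form come from the steps above.

```python
# Required execution order within each training group, as described above.
TRAINING_ORDER = {
    "phase-a": [
        "phase-a-train-concept-document",
        "phase-a-train-snippet",
    ],
    "phase-b-factoid-list": [
        "phase-b-train-answer-type",
        "phase-b-train-answer-score",
        "phase-b-train-answer-collective-score",
    ],
    "phase-b-yesno": ["phase-b-train-yesno"],
}


def training_commands(group):
    """Return the Maven commands for one training group, in required order."""
    return [
        ["mvn", "clean", "compile", "exec:exec", "-Dconfig=bioasq." + name]
        for name in TRAINING_ORDER[group]
    ]


def run_group(group, runner):
    """Run the commands in order; return the first failing command or None.

    `runner` takes an argument list and returns an exit code, for example
    subprocess.call.
    """
    for command in training_commands(group):
        if runner(command) != 0:
            return command
    return None
```

With `import subprocess`, calling `run_group("phase-a", subprocess.call)` would execute both Phase A training workflows, recompiling before each one so that the snippet training sees the freshly trained document and concept models.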
You can use your own biomedical questions to test the system in either Phase A or Phase B, similar to testing on the BioASQ test set.
- You can refer to the `input/one-question.json` file, and update the question: `{ "questions": [ { "body": "What is the role of MMP-1 in breast cancer?", "type": "factoid", "id": "0" } ] }`
- You need to change the `collection-reader.file` parameter to `input/one-question.json` in the `phase-a-test` descriptor to test Phase A.
- You need to manually add relevant snippets to the `input/one-question.json` file, similar to the Phase B test file (i.e. `*b-*-b.json`).
- You need to change the `collection-reader.file` parameter to `input/one-question.json` in the `phase-b-test-factoid-list` descriptor to test Phase B.
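If you prefer to generate the question file rather than edit it by hand, a minimal Python 3 sketch could look like the following. It reproduces only the fields shown in the `input/one-question.json` example (`body`, `type`, `id`); the function names are hypothetical, and the relevant snippets required for Phase B are not added here.

```python
import json


def question_document(body, qtype="factoid", qid="0"):
    """Build the single-question structure used by input/one-question.json."""
    return {"questions": [{"body": body, "type": qtype, "id": qid}]}


def write_question_file(path, body, qtype="factoid", qid="0"):
    """Write the question structure to `path` as UTF-8 encoded JSON."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(question_document(body, qtype, qid), handle, indent=2)


# Show the JSON for the example question from this document.
print(json.dumps(question_document("What is the role of MMP-1 in breast cancer?")))
```

Calling `write_question_file("input/one-question.json", "...")` then replaces the sample question with your own.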
We are working on testing an end-to-end QA system that combines Phase A and Phase B workflows. You may also creatively combine the steps from both descriptors on your own.
Since the PubMed Central full text is no longer used in the evaluation as of BioASQ 2016, it is not integrated into the predefined workflow descriptors. However, you can still use it for relevant passage retrieval.
- Make sure you have the PubMed Central full text and document server.
- Update the `url-format` parameter in `resources/bioasq/passage/pmc-content.yaml.template` with the PubMed Central document server URL, and remove the `.template` suffix from the file name.
- Add the `pmc-content` step after the `document-retrieval`/`document-rerank` step, but before the `passage-retrieval` step, in the descriptor.
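For the document server prerequisite, the splitting of the converted JSON array into per-document files can be sketched in Python 3 as follows. The key that holds each document's identifier is not specified in this document, so the `id_key` default of `"pmid"` is purely an assumption, as is the function name. The sketch also loads the whole array into memory, which may not be practical for the full corpus.

```python
import json
from pathlib import Path


def split_json_array(array_file, output_root, id_key="pmid"):
    """Write each element of a JSON array to <output_root>/pmc/<id>.

    id_key is a hypothetical field name; check the converted file for the
    key that actually holds the document identifier.
    """
    pmc_dir = Path(output_root) / "pmc"
    pmc_dir.mkdir(parents=True, exist_ok=True)
    with open(array_file, encoding="utf-8") as handle:
        documents = json.load(handle)
    written = []
    for document in documents:
        doc_id = str(document[id_key])
        # No file extension, so the file is reachable as /pmc/DOC_ID.
        with open(pmc_dir / doc_id, "w", encoding="utf-8") as out:
            json.dump(document, out)
        written.append(doc_id)
    return written
```

One simple way to serve the result for local experiments is to run `python3 -m http.server PORT` from `output_root`, which exposes each file at `http://HOST:PORT/pmc/DOC_ID`.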
The official GoPubMed server is sometimes slow. If you use a local or proxy GoPubMed server instead of the official one specified in the properties folder, and you plan to use the GoPubMed components (which are not used in the predefined workflow descriptors), you can change the `conf` parameter in the GoPubMed-related descriptors, including
- `resources/bioasq/concept/retrieval/gopubmed.yaml`
- `resources/bioasq/concept/retrieval/gopubmed-separate.yaml`
- `resources/bioasq/concept/rerank/scorers/gopubmed.yaml`
- `resources/bioasq/document/retrieval/gopubmed.yaml`
- `resources/bioasq/triple/retrieval/gopubmed.yaml`
The system is far from perfect, and it needs tuning and component development. In addition to the system description papers, you may also read the UIMA and OAQA Tutorial to get familiar with the UIMA/ECD/CSE frameworks used by this system.
We thank Ying Li, Xing Yang, Venus So, James Cai and the other team members at Roche Innovation Center New York for their support of OAQA and biomedical question answering research and development.
This project is licensed under the Apache License ver 2.0 - see the LICENSE.txt file for details. However, please note that some third-party dependencies may be licensed differently.


