Skip to content

Conversation

@kshiarl1
Copy link

@kshiarl1 kshiarl1 commented Dec 5, 2025

Description

Closes

Red Teaming functionality using function middleware.

This PR introduces the necessary ingredients to perform AI red teaming on an agent workflow.
These are:

  • Intercepting function and adding attack payloads (prompt injections)
  • Evaluating the effect of these payloads on other parts of the workflow.
  • Attack scenario definition and orchestration.

Key Features

Implements RedTeamingMiddleware.

  • Special middleware that can replace inputs and output strings and floats to functions.
  • Configurable payloads to replace i/o with.
  • Can search within function schema to find the right fields to introduce the payload.

Implements RedTeamingEvaluationRunner.

  • Allows the user to run red teaming using different threat scenarios.
  • Threat scenarios are implemented via the RedTeamingEvaluationRunnerConfig.
  • Processes, summarizes and saves results for ease of use post evaluation.

Implements a RedTeamingEvaluator.

  • Can filter intermediate steps to find specific function inputs / outputs.
  • Uses LLM as a judge to determine whether an attack is successful.
  • Accepts scenario specific instructions for evaluation.

By Submitting this PR I confirm:

  • I am familiar with the Contributing Guidelines.
  • We require that all contributors "sign-off" on their commits. This certifies that the contribution is your original work, or you have rights to submit it under the same license, or a compatible license.
    • Any contribution which contains commits that are not Signed-Off will not be accepted.
  • When the PR is ready for review, new or existing tests cover these changes.
  • When the PR is ready for review, the documentation is up to date with these changes.

@kshiarl1 kshiarl1 requested a review from a team as a code owner December 5, 2025 14:08
@copy-pr-bot
Copy link

copy-pr-bot bot commented Dec 5, 2025

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@coderabbitai
Copy link

coderabbitai bot commented Dec 5, 2025

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

🗂️ Base branches to auto review (2)
  • develop
  • release/.*

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

✨ Finishing touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

lvojtku and others added 2 commits December 5, 2025 17:07
Closes
- Brought out agents to emphasize on what they offer and how users can use them
- Removed any repeated details in their overview sections / features sections
- Rephrased verbiage for clarity
- Ran docs through cursor to ensure it meets style guide requirements

## By Submitting this PR I confirm:
- I am familiar with the [Contributing Guidelines](https://github.com/NVIDIA/NeMo-Agent-Toolkit/blob/develop/docs/source/resources/contributing.md).
- We require that all contributors "sign-off" on their commits. This certifies that the contribution is your original work, or you have rights to submit it under the same license, or a compatible license.
  - Any contribution which contains commits that are not Signed-Off will not be accepted.
- When the PR is ready for review, new or existing tests cover these changes.
- When the PR is ready for review, the documentation is up to date with these changes.



## Summary by CodeRabbit

* **Documentation**
  * Added docs for new agents: ReAct, Reasoning, ReWOO, Router, Tool Calling, and Responses API & Agent
  * Added Sequential Executor documentation
  * Consolidated workflow navigation to a single About page and updated related links across docs and examples
  * Rewrote many workflow pages to emphasize configuration-first guides with YAML examples and clearer installation instructions
  * Updated quick-start link labels and multiple example README links for consistency

<sub>✏️ Tip: You can customize this high-level summary in your review settings.</sub>

Authors:
  - https://github.com/lvojtku
  - David Gardner (https://github.com/dagardner-nv)

Approvers:
  - David Gardner (https://github.com/dagardner-nv)

URL: NVIDIA#1173
This PR:
1. Adds an a2a subpackage with support for a2a client and servers. This is base support which models the remote-agent as a tool.
2. Adds examples demonstrating NAT workflows with NAT A2A servers (math assistant) and external A2A servers (currency agent)
3. Adds a2a CLI commands for troubleshooting
4. Adds docs
5. Adds unit and end2end tests

Pending items that will be addressed in a separate PR:
1. Data part and  file part support
2. Auth
3. Unified Telemetry and Logging

**Sever CLI** 
1. Start server using the NAT a2a frontend:
```
 nat a2a serve --config_file examples/getting_started/simple_calculator/configs/config.yml
```

**Client CLI:**
1. agent card discovery
```
nat a2a client discover --url http://localhost:10000
```
<img width="1114" height="611" alt="image" src="https://github.com/user-attachments/assets/d7edca10-5bf9-4804-b1b0-41a337580b2c" />

2. high level agent call
```
nat a2a client call --url http://localhost:10000 --message "Is the product of 2 and 4 greater than the hour of the day"
```
```
Query: Is the product of 2 and 4 greater than the hour of the day

No, the product of 2 and 4 is less than the hour of the day.
```

## By Submitting this PR I confirm:
- I am familiar with the [Contributing Guidelines](https://github.com/NVIDIA/NeMo-Agent-Toolkit/blob/develop/docs/source/resources/contributing.md).
- We require that all contributors "sign-off" on their commits. This certifies that the contribution is your original work, or you have rights to submit it under the same license, or a compatible license.
  - Any contribution which contains commits that are not Signed-Off will not be accepted.
- When the PR is ready for review, new or existing tests cover these changes.
- When the PR is ready for review, the documentation is up to date with these changes.



## Summary by CodeRabbit

* **New Features**
  * Agent-to-Agent (A2A) support: client & server integrations, workflow publishing, and CLI commands (serve, discover, info, skills, call).

* **Documentation**
  * Comprehensive A2A docs: protocol overview, client/server/CLI guides, installation, configuration, examples, usage samples, and troubleshooting.

* **Examples**
  * Two end-to-end A2A examples with READMEs, configs, sample queries, and project setup.

* **Tests**
  * Client, server, agent-card generation, and end-to-end workflow tests.

* **Chores**
  * Packaging, license, CLI wiring, and project configuration for new A2A package.

<sub>✏️ Tip: You can customize this high-level summary in your review settings.</sub>

Authors:
  - Anuradha Karuppiah (https://github.com/AnuradhaKaruppiah)

Approvers:
  - David Gardner (https://github.com/dagardner-nv)
  - Will Killian (https://github.com/willkill07)
  - Yuchen Zhang (https://github.com/yczhang-nv)

URL: NVIDIA#1147
score: typing.Any # float or any serializable type
reasoning: typing.Any

EvaluatorTemplateItem = TypeVar('EvaluatorTemplateItem', bound=EvalOutputItem)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please revert the TypeVar and Generic[EvaluatorTemplateItem] changes to EvalOutput. This added complexity is unused as the simpler non-generic class works fine since subclasses like RedTeamingEvalOutputItem already inherit from EvalOutputItem.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This does not work. Here is the problem. I am happy to discuss a different solution but the problem is there and it prevent extensibility of the whole evaluation framework so it needs to be solved:

If EvalOutput is constructed with RedTeamingEvalOutputItem instead of EvalOutputItem and then model_dump_json is called on EvalOutput (in order to save the result) it will dump only the fields in EvalOutputItem. This model_dump_json operation happens in EvaluationRun.

# create json content using the evaluation results

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The simplest solution is to change the typing inside EvalOutput to be list[Any] instead of list[EvalOutputItem]

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you would like to verify yourself that there is a problem try running this standalone script:

from pydantic import BaseModel
from typing import List

class EvalOutputItem(BaseModel):
base_field: str

class NewEvalOutputItem(EvalOutputItem):
new_field: str # This gets lost during serialization

class EvalOutput(BaseModel):
items: List[EvalOutputItem] # Type annotation limits serialization

#This works fine
new_item = NewEvalOutputItem(base_field="test", new_field="extra")
eval_output = EvalOutput(items=[new_item])

#But this loses the new_field
json_str = eval_output.model_dump_json()
#Only serializes EvalOutputItem fields, not NewEvalOutputItem fields
print(json_str)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think what we are looking for here is:

eval_output_items: list[SerializeAsAny[EvalOutputItem]] 

Section: Serializing with duck typing 🦆

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think what we are looking for here is:

eval_output_items: list[SerializeAsAny[EvalOutputItem]] 

Section: Serializing with duck typing 🦆

'''Register red teaming evaluator'''
from .evaluate import RedTeamingEvaluator

llm = await builder.get_llm(config.llm_name, wrapper_type=LLMFrameworkEnum.LANGCHAIN)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The hard-coded wrapper_type=LLMFrameworkEnum.LANGCHAIN is fragile, the register function shouldn't need to know what framework the evaluator uses internally. Consider following the pattern from the DynamicFunctionMiddleware implementation, which passes the builder instance and patches get_llm to discover and wrap components at runtime. This decouples the registration from the implementation details and allows the evaluator to request whatever framework it needs without the register function making assumptions.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I took this from here:

async def register_tunable_rag_evaluator(config: TunableRagEvaluatorConfig, builder: EvalBuilder):

and it is also the case here:

llm = await builder.get_llm(config.llm_name, wrapper_type=LLMFrameworkEnum.LANGCHAIN)

so basically all LLMJudge evaluators in NAT use the same wrapper.

could you point me out to what you mean exactly?

logger = logging.getLogger(__name__)

# Fixed LLM name for red teaming evaluator
RED_TEAMING_EVALUATOR_LLM_NAME = "red_teaming_evaluator_llm"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Avoid injecting hardcoded names into already defined components. It's acceptable to add a new component with its full configuration (e.g., adding a new evaluator or LLM), but if a component is already configured by the user, the runner should only modify its parameters—not override its name or type.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not yet in the full configuration. The RedTeamingEvaluationRunner constructs the full configuration it just forces the name.

return self

@classmethod
def rebuild_annotations(cls) -> bool:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The base workflow file should remain the source of truth for type information. The current RedTeamingRunnerConfig creates a parallel type system by defining fields like evaluator_llm: LLMBaseConfig and requiring rebuild_annotations() to stay in sync with the workflow's type registry—this is fragile and overcomplicated. Instead, the scenarios schema should work with the base workflow through mutations (path-value changes) rather than duplicating its type definitions. If a user has an LLM configured in their workflow, the scenario should reference it by name instead of injecting a new one with a hardcoded key. This would eliminate the need for rebuild_annotations(), remove type coupling between scenarios and workflows, respect user-configured values, and avoid duplicating type registration efforts. While this approach works for the red teaming evaluator since you own that component, we should not couple concerns this way for other components. Let's discuss a better pattern for converting scenarios to workflow configurations.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Happy do discuss a different solution. I feel like the current solution is best for the user as it makes the components of red teaming much more explicit and explicitly ties the evaluator with the scenarios (as is the case). I don't see how this overcomplicates things as these operations are strictly performed for the red teaming workflow that can only be done using the red-team command.

logger = logging.getLogger(__name__)


class RedTeamingRunner:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

RedTeamingRunner should inherit from MultiEvaluationRunner rather than being a standalone class that wraps it internally. The current design instantiates MultiEvaluationRunner inside run() and delegates to run_all(), which is essentially inheritance without the benefits. By extending MultiEvaluationRunner, the red teaming runner can override init to handle scenario-based config transformation, call super().run_all() for execution, and add pre/post-processing in an overridden run_all() method. This makes the relationship explicit, reduces indirection, and follows standard patterns for specialized runners.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I could make the inheritance for the RedTeamingEvaluationRunner work. But because injeriting the config is not recommended in my opinion it might be better to not do this. See below.


def __init__(
self,
config: RedTeamingRunnerConfig | None,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similarly, RedTeamingRunnerConfig should extend MultiEvaluationRunConfig. Currently, the runner manually transforms RedTeamingRunnerConfig into a MultiEvaluationRunConfig before passing it to MultiEvaluationRunner. If the config class extended MultiEvaluationRunConfig, the inheritance hierarchy would be consistent between the configs and runners, and the transformation could be simplified or handled through inheritance rather than manual conversion.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you sure about this? The whole point of RedTeamingRunnerConfig is that it can be loaded and instantiated from yaml / json...If i inherit then I will also inherit the mandatory configs field that does not exist in the stored config. The two configs have nothing to do with each other.

- Add RedTeamingEvaluator class with LLM judge support
- Add data models (ConditionEvaluationResult, RedTeamingEvalOutputItem)
- Add filter conditions for selecting intermediate steps
- Add evaluator registration and configuration
- Add comprehensive README documentation
- Add RedTeamingEvaluationRunner for orchestrating red team scenarios
- Add RedTeamingEvaluationConfig and RedTeamScenarioEntry data models
- Add support for loading scenarios from JSON and applying middleware
- Export runner classes in runners/__init__.py
- Add complete red teaming example for simple calculator workflow
- Add calculator middleware for demonstrating middleware pattern
- Add workflow configuration and test datasets
- Add red team scenarios JSON with middleware test cases
- Add run_redteam_eval.py script for executing evaluations
- Add comprehensive README with usage instructions and examples
@kshiarl1 kshiarl1 requested a review from a team as a code owner December 8, 2025 13:05
@ericevans-nv ericevans-nv self-assigned this Dec 9, 2025
@ericevans-nv ericevans-nv added feature request New feature or request non-breaking Non-breaking change labels Dec 9, 2025
Copy link
Contributor

@ericevans-nv ericevans-nv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please make the update I mentioned to use SerializeAsAny instead of the generics, and we can address the LLM-wrapper related changes later. The current changes are fine to merge just don’t pull dev in yet. Once everyone gets their first PR in, we can update the epic branch. Comment /merge no squash to retain your commits.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

feature request New feature or request non-breaking Non-breaking change

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants