
Conversation

@Warashi (Contributor) commented Dec 19, 2025

This PR adds the ability to use Amazon Bedrock models to create embeddings.
Additionally, this PR allows passing the AWS_SESSION_TOKEN environment variable to the boto3 client, and adds documentation about the environment variables needed to use AWS Bedrock.
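
For context, a minimal sketch of how these environment variables might be wired into a boto3 Bedrock runtime client; the actual BedrockClient constructor in api/bedrock_client.py may differ, and the us-east-1 default is an assumption:

import os
import boto3

# Credentials are read from the environment; AWS_SESSION_TOKEN is optional
# and only needed for temporary credentials (e.g. assumed roles or SSO).
client = boto3.client(
    "bedrock-runtime",
    region_name=os.environ.get("AWS_REGION", "us-east-1"),
    aws_access_key_id=os.environ.get("AWS_ACCESS_KEY_ID"),
    aws_secret_access_key=os.environ.get("AWS_SECRET_ACCESS_KEY"),
    aws_session_token=os.environ.get("AWS_SESSION_TOKEN"),
)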

@gemini-code-assist
Contributor

Summary of Changes

Hello @Warashi, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly expands the application's capabilities by integrating AWS Bedrock models as a new embedding provider. This enhancement offers users greater flexibility in choosing their preferred AI infrastructure for generating embeddings. The changes involve updating core configuration, enhancing the Bedrock client to handle embedding-specific requests, and seamlessly integrating this new provider into the existing data processing and real-time chat functionalities.

Highlights

  • AWS Bedrock Integration: Introduced comprehensive support for AWS Bedrock models, allowing them to be used as an embedding provider within the application.
  • New Environment Variables: Added new environment variables (AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_SESSION_TOKEN, AWS_REGION, AWS_ROLE_ARN) to configure AWS Bedrock authentication and region settings.
  • Bedrock Client Enhancements: The BedrockClient now supports embedding calls, including proper parsing of Bedrock embedding responses and formatting request bodies for different Bedrock providers like Amazon Titan and Cohere. It also includes serialization/deserialization methods for client objects.
  • Configuration Updates: Updated the embedder.json configuration file and api/config.py to recognize 'bedrock' as a valid embedder type and manage its specific configuration settings.
  • Token Counting and WebSocket Integration: Modified the token counting utility to support Bedrock embedders and integrated Bedrock models into the WebSocket chat functionality for chat completions, including robust error handling.
  • Unit Test Coverage: Expanded unit tests to cover the new Bedrock embedder configuration, type detection, and request formatting, ensuring the stability and correctness of the integration.


Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page; here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

  • Code Review (/gemini review): Performs a code review for the current pull request in its current state.
  • Pull Request Summary (/gemini summary): Provides a summary of the current pull request in its current state.
  • Comment (@gemini-code-assist): Responds in comments when explicitly tagged, both in pull request comments and review comments.
  • Help (/gemini help): Displays a list of available commands.

Customization

To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.


@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request introduces support for AWS Bedrock as a new embedder and LLM provider. The changes update documentation (README.md) to include Bedrock-related environment variables and DEEPWIKI_EMBEDDER_TYPE options. The core BedrockClient (api/bedrock_client.py) is enhanced to handle AWS session tokens, support serialization, parse embedding responses, and format request bodies for Amazon Titan and Cohere embedding models. It also extends the call method to process embedding requests and convert_inputs_to_api_kwargs for both LLM and embedder types. Configuration files (api/config.py, api/config/embedder.json) are updated to load and prioritize Bedrock embedder settings. The token counting logic (api/data_pipeline.py) and embedder retrieval functions (api/tools/embedder.py) are adjusted to recognize the new 'bedrock' type. Finally, the WebSocket API (api/websocket_wiki.py) is modified to integrate Bedrock as a chat completion provider, including client initialization, credential checks, and error handling.

Review comments highlight critical issues:

  • The body parameter for boto3.invoke_model in api/bedrock_client.py needs json.dumps() for proper serialization.
  • An if statement in api/websocket_wiki.py should be an elif for correct provider chaining.
  • A block of code for the cohere provider in api/bedrock_client.py is identified as dead code and should be removed.
  • The Bedrock LLM implementation in api/websocket_wiki.py lacks streaming support, which is available via invoke_model_with_response_stream and should be implemented for a better user experience.
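
On the first point, the fix is a one-line serialization change; a minimal sketch, where invoke_bedrock is an illustrative helper rather than the client's actual API:

import json

def invoke_bedrock(client, model_id: str, request_body: dict) -> dict:
    # boto3's invoke_model expects `body` as bytes or a JSON string;
    # passing a raw dict raises a parameter validation error, hence json.dumps.
    response = client.invoke_model(modelId=model_id, body=json.dumps(request_body))
    return json.loads(response["body"].read())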

Comment on lines +623 to +640
elif request.provider == "bedrock":
    try:
        logger.info("Making AWS Bedrock API call")
        response = await model.acall(api_kwargs=api_kwargs, model_type=ModelType.LLM)
        if isinstance(response, str):
            await websocket.send_text(response)
        else:
            await websocket.send_text(str(response))
        await websocket.close()
    except Exception as e_bedrock:
        logger.error(f"Error with AWS Bedrock API: {str(e_bedrock)}")
        error_msg = (
            f"\nError with AWS Bedrock API: {str(e_bedrock)}\n\n"
            "Please check that you have set the AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY "
            "environment variables with valid credentials."
        )
        await websocket.send_text(error_msg)
        await websocket.close()
Contributor


medium

The implementation for the Bedrock provider does not support streaming responses, while other providers in this file do. This results in a degraded user experience as the user has to wait for the entire response to be generated. AWS Bedrock supports streaming via invoke_model_with_response_stream. You should update the BedrockClient to support streaming and use it here to provide a better, more interactive experience.
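
A minimal sketch of the streaming approach this comment suggests, using boto3's invoke_model_with_response_stream; the chunk payload schema varies by model family, so the field names below are assumptions:

import json

def stream_bedrock_text(client, model_id: str, request_body: dict):
    """Yield text fragments from a Bedrock streaming response."""
    response = client.invoke_model_with_response_stream(
        modelId=model_id,
        body=json.dumps(request_body),
    )
    # The response body is an event stream; each event carries a JSON chunk.
    for event in response["body"]:
        chunk = event.get("chunk")
        if not chunk:
            continue
        payload = json.loads(chunk["bytes"])
        # Field name depends on the model family (e.g. Titan uses
        # "outputText", legacy Anthropic models use "completion").
        text = payload.get("outputText") or payload.get("completion") or ""
        if text:
            yield text

Each yielded fragment could then be forwarded over the WebSocket as it arrives, mirroring the streaming paths of the other providers.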

Contributor Author


This is out of scope for this PR, so it should be addressed in a separate PR.

Warashi and others added 2 commits December 20, 2025 05:08
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
@Warashi Warashi changed the title feat: using AWS Bedrock models as embedding provider feat: using Amazon Bedrock models as embedding provider Dec 19, 2025
@sashimikun

/gemini review


@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request introduces comprehensive support for AWS Bedrock as both an LLM provider and an embedder. The changes update the README.md to document new AWS environment variables (AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_SESSION_TOKEN, AWS_REGION, AWS_ROLE_ARN) and the DEEPWIKI_EMBEDDER_TYPE option. The api/bedrock_client.py file is significantly extended to implement the Bedrock client, including handling session tokens, adding serialization methods, parsing embedding responses, and extending the call and convert_inputs_to_api_kwargs methods to support both LLM and embedding functionalities, differentiating between batch and single-text embedding for various Bedrock models. Configuration files (api/config.py and api/config/embedder.json) are updated to integrate Bedrock into the embedder type detection and loading mechanisms. The api/data_pipeline.py and api/tools/embedder.py files are modified to include Bedrock in token counting and embedder selection logic, respectively. Finally, api/websocket_wiki.py is updated to enable Bedrock LLM calls via the WebSocket API, including credential checks and error handling. Unit tests in tests/unit/test_all_embedders.py are expanded to cover Bedrock configuration loading, type detection, explicit embedder creation, and embedding request body formatting.

Review comments highlight the following issues:

  • a missing model_type=ModelType.LLM argument in a convert_inputs_to_api_kwargs call within api/websocket_wiki.py for Bedrock LLM requests,
  • a non-existent method call in test_bedrock_embedder_request_format in tests/unit/test_all_embedders.py that requires refactoring, and
  • redundant input handling logic in api/bedrock_client.py's call method for embedders.
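
For context, the batch-versus-single distinction mentioned above comes down to the request body each model family accepts. Illustrative shapes follow; the field names match the public Bedrock model documentation, though the client's exact construction may differ:

# Amazon Titan v2: one text per invoke_model call.
titan_body = {
    "inputText": "hello world",
    "dimensions": 256,   # optional output dimensionality
    "normalize": True,   # optional L2 normalization
}

# Cohere on Bedrock: a batch of texts in a single call.
cohere_body = {
    "texts": ["hello", "world"],
    "input_type": "search_document",
}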

Comment on lines 522 to 524
api_kwargs = model.convert_inputs_to_api_kwargs(
    input=prompt,
    model_kwargs=model_kwargs,
Contributor


critical

The model_type argument is missing in this call to convert_inputs_to_api_kwargs. The BedrockClient requires this argument to correctly format the API request for either an LLM or an embedder model. Without it, the call will fail with a ValueError. Please add model_type=ModelType.LLM to the call.

            api_kwargs = model.convert_inputs_to_api_kwargs(input=prompt, model_kwargs=model_kwargs, model_type=ModelType.LLM)

Contributor Author


model_type=ModelType.LLM is written on the next line, R525.

Comment on lines 214 to 218
body = client._format_embedding_body(
    provider="amazon",
    text="hello",
    model_kwargs={"dimensions": 256},
)
Contributor


critical

This test attempts to call client._format_embedding_body, but this method does not exist on the BedrockClient class, which will cause the test to fail. The logic for formatting the request body is currently located inside the call method. You should either refactor this test to mock sync_client.invoke_model and verify the body argument passed to it, or extract the request body formatting logic from the call method into a new _format_embedding_body method that this test can invoke.
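
A minimal sketch of the mocking approach suggested here, assuming the client exposes its boto3 client as sync_client; how the embedding call is then driven depends on BedrockClient's actual public API, so that part is left out:

import json
from unittest.mock import MagicMock

def make_mocked_bedrock_client(client):
    """Replace sync_client.invoke_model with a stub that records each
    serialized request body and returns a canned embedding response."""
    captured = []

    def fake_invoke_model(modelId, body, **kwargs):
        captured.append(json.loads(body))
        stream = MagicMock()
        stream.read.return_value = json.dumps({"embedding": [0.0, 0.1]}).encode()
        return {"body": stream}

    client.sync_client = MagicMock(invoke_model=fake_invoke_model)
    return client, captured

The test can then trigger an embedding call through the client's public API and assert on captured[0], for example that it contains the expected inputText and dimensions fields.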

Contributor Author


Fixed in this commit: bbe5a41

I deleted client._format_embedding_body as a refactoring, but forgot to remove this test at the time; that's fixed in a968afe.

Warashi and others added 2 commits December 20, 2025 09:29
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
@Warashi (Contributor, Author) commented Dec 20, 2025

/gemini review


@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request introduces comprehensive support for AWS Bedrock as both an LLM and an embedder provider. Key changes include adding new environment variables for AWS credentials (AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_SESSION_TOKEN, AWS_REGION, AWS_ROLE_ARN) and updating documentation (README.md) to reflect these. The BedrockClient was significantly extended to handle embedding calls, including parsing responses, managing different Bedrock embedding model providers (like Cohere and Amazon Titan), and implementing serialization methods. Configuration files (api/config.py, api/config/embedder.json) were updated to integrate Bedrock into the embedder type detection and loading mechanisms. The token counting logic (api/data_pipeline.py) and the embedder selection utility (api/tools/embedder.py) were also updated to recognize 'bedrock'. Finally, the WebSocket API (api/websocket_wiki.py) was modified to allow users to select Bedrock as an LLM provider, including client initialization, API call handling, and error management with credential warnings.

Unit tests (tests/unit/test_all_embedders.py) were expanded to cover the new Bedrock embedder functionality.

A review comment suggested making the Bedrock client's embedding logic more robust by explicitly handling the 'amazon' provider and raising an error for unsupported embedding providers within Bedrock, rather than using a catch-all 'else' block, to improve maintainability and prevent unexpected errors with future providers.

Comment on lines 395 to 420
else:
    for text in texts:
        request_body: Dict[str, Any] = {"inputText": text}

        dimensions = model_kwargs.get("dimensions")
        if dimensions is not None:
            request_body["dimensions"] = int(dimensions)

        normalize = model_kwargs.get("normalize")
        if normalize is not None:
            request_body["normalize"] = bool(normalize)

        # Make the API call
        response = self.sync_client.invoke_model(
            modelId=model_id,
            body=json.dumps(request_body),
        )

        # Parse the response
        response_body = json.loads(response["body"].read())
        raw_responses.append(response_body)

        emb = response_body.get("embedding")
        if emb is None:
            raise ValueError(f"Embedding not found in response: {response_body}")
        embeddings.append(emb)
Contributor


medium

The else block acts as a catch-all for any embedding provider that is not 'cohere'. This logic assumes that any other provider (like 'amazon') uses the same API format for embeddings, which might not be true for future providers. This could lead to unexpected errors if a new, unsupported embedding provider is used.

To make the code more robust and maintainable, I suggest explicitly handling the 'amazon' provider and raising an error for any other unsupported providers. This ensures that new providers are consciously added with their correct API format.

            elif provider == "amazon":
                for text in texts:
                    request_body: Dict[str, Any] = {"inputText": text}

                    dimensions = model_kwargs.get("dimensions")
                    if dimensions is not None:
                        request_body["dimensions"] = int(dimensions)

                    normalize = model_kwargs.get("normalize")
                    if normalize is not None:
                        request_body["normalize"] = bool(normalize)

                    # Make the API call
                    response = self.sync_client.invoke_model(
                        modelId=model_id,
                        body=json.dumps(request_body),
                    )

                    # Parse the response
                    response_body = json.loads(response["body"].read())
                    raw_responses.append(response_body)

                    emb = response_body.get("embedding")
                    if emb is None:
                        raise ValueError(f"Embedding not found in response: {response_body}")
                    embeddings.append(emb)
            else:
                raise NotImplementedError(f"Embedding provider '{provider}' is not supported by the Bedrock client.")

Contributor Author


Fixed in this commit: ebf57ff

@Warashi (Contributor, Author) commented Dec 20, 2025

/gemini review


@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request introduces comprehensive support for AWS Bedrock as both an embedder and an LLM provider. Key changes include adding new environment variables (AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_SESSION_TOKEN, AWS_REGION, AWS_ROLE_ARN) and updating DEEPWIKI_EMBEDDER_TYPE to include 'bedrock'. The BedrockClient was enhanced to handle Bedrock-specific embedding requests, differentiating between Amazon Titan (which requires individual calls per text) and Cohere (which supports batching), and to manage LLM calls. Serialization methods were added to BedrockClient for non-picklable attributes. Configuration files (api/config.py, api/config/embedder.json) and utility functions (api/data_pipeline.py, api/tools/embedder.py) were updated to recognize and utilize the new Bedrock embedder type. The WebSocket handler (api/websocket_wiki.py) was modified to integrate Bedrock as a chat provider, including credential checks and error handling. Unit tests were also extended to cover the new Bedrock functionality.

A review comment highlighted that the current implementation for Amazon Titan embeddings in api/bedrock_client.py makes synchronous API calls in a loop for each text, suggesting that these calls be made concurrently using asyncio.gather or a thread pool to improve performance for large batches.
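
For context, the serialization support mentioned above usually takes the shape of __getstate__/__setstate__ methods that drop the boto3 client before pickling and rebuild it afterwards. A minimal sketch follows; the class name and attribute layout are hypothetical, not the PR's actual implementation:

import boto3

class PicklableBedrockClient:
    """Client wrapper that survives pickling by excluding the boto3
    client from its state and recreating it on demand."""

    def __init__(self, region_name: str = "us-east-1"):
        self.region_name = region_name
        self.sync_client = boto3.client("bedrock-runtime", region_name=region_name)

    def __getstate__(self):
        # boto3 clients hold sockets and cannot be pickled; drop them.
        state = self.__dict__.copy()
        state.pop("sync_client", None)
        return state

    def __setstate__(self, state):
        self.__dict__.update(state)
        # Recreate the client after unpickling.
        self.sync_client = boto3.client("bedrock-runtime", region_name=self.region_name)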

Comment on lines +372 to +398
# Amazon Titan Embed Text does not support batch; send one at a time.
for text in texts:
    request_body: Dict[str, Any] = {"inputText": text}

    dimensions = model_kwargs.get("dimensions")
    if dimensions is not None:
        request_body["dimensions"] = int(dimensions)

    normalize = model_kwargs.get("normalize")
    if normalize is not None:
        request_body["normalize"] = bool(normalize)

    # Make the API call
    response = self.sync_client.invoke_model(
        modelId=model_id,
        body=json.dumps(request_body),
    )

    # Parse the response
    response_body = json.loads(response["body"].read())
    raw_responses.append(response_body)

    emb = response_body.get("embedding")
    if emb is None:
        raise ValueError(f"Embedding not found in response: {response_body}")
    embeddings.append(emb)

Contributor


medium

The implementation for Amazon Titan embeddings iterates through the input texts and makes a synchronous API call for each one. This is inefficient for large batches of texts. While Titan embedding models may not support batching in a single API call, these calls can be made concurrently to improve performance.

Consider using an asynchronous loop with asyncio.gather or a thread pool to execute these API calls in parallel, especially within the acall method.
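
A minimal sketch of the thread-pool variant of this suggestion; boto3 clients are thread-safe for concurrent calls, and the names here are illustrative rather than the client's actual API:

import json
from concurrent.futures import ThreadPoolExecutor

def embed_titan_concurrently(client, model_id, texts, max_workers=8):
    """Run the per-text Titan invocations in parallel threads."""
    def embed_one(text):
        response = client.invoke_model(
            modelId=model_id,
            body=json.dumps({"inputText": text}),
        )
        return json.loads(response["body"].read())["embedding"]

    # pool.map preserves input order, so embeddings line up with texts.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(embed_one, texts))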

Contributor Author


BedrockClient doesn't support async calls for now, so I think this should be addressed in another PR.

@Warashi (Contributor, Author) commented Dec 20, 2025

@sashimikun
I think I've addressed the Gemini Code Assist issues that I can for now, so could you please review this PR?

@sng-asyncfunc sng-asyncfunc merged commit 0f7f19a into AsyncFuncAI:main Dec 20, 2025
2 checks passed
@Warashi Warashi deleted the bedrock branch December 20, 2025 01:17
craftslab added a commit to ai-codespark/codewiki that referenced this pull request Dec 22, 2025

* feat(bedrock): add session token and embedding support
* feat(embedder): add bedrock type
* test(embedder): cover bedrock type
* docs: document bedrock embedder env vars
* fix(bedrock): use titan v2 embedding request schema
* fix: make BedrockClient picklable by excluding non-picklable attributes
* feat: add bedrock provider support to websocket_wiki.py
* fix: support cohere embedding response format
* refactor: remove trailing whitespace
* refactor: remove _format_embedding_body method from BedrockClient
* fix: check provider and raise when unsupported

Merged from 0f7f19a