
Conversation

@suluyana
Collaborator

Change Summary

Related issue number

Checklist

  • The pull request title is a good summary of the changes - it will be used in the changelog
  • Unit tests for the changes exist
  • Run pre-commit install and pre-commit run --all-files before git commit, and make sure the lint check passes.
  • Documentation reflects the changes where applicable

@gemini-code-assist

Summary of Changes

Hello @suluyana, I'm Gemini Code Assist [1]! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request implements a comprehensive token usage tracking mechanism for the LLM agent. It introduces global counters for prompt and completion tokens, protected by an asynchronous lock, to provide an aggregated view of token consumption throughout the application's lifecycle. Additionally, it ensures that usage data is consistently captured from OpenAI API calls, including streaming responses, and logs this information at each step of the agent's operation.

Highlights

  • Global Token Tracking: Introduced global variables (TOTAL_PROMPT_TOKENS, TOTAL_COMPLETION_TOKENS) and an asynchronous lock (TOKEN_LOCK) to accumulate total prompt and completion token usage across the application, ensuring thread-safe updates (a simplified sketch of this pattern follows this list).
  • LLM Agent Usage Logging: Enhanced the LLMAgent's step method to extract prompt_tokens and completion_tokens from LLM responses, update the global accumulators, and log both per-step and cumulative token usage.
  • OpenAI Stream Usage Configuration: Modified the openai_llm to explicitly request usage information (include_usage=True) when performing streaming API calls to OpenAI, ensuring comprehensive token tracking even in streaming scenarios.
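As a rough illustration of the accumulation and logging pattern the first two highlights describe: the snippet below is a simplified, self-contained sketch, not the PR's actual implementation. The record_usage helper is hypothetical, and the print calls stand in for the agent's log_output.

    import asyncio

    # Process-wide accumulators, mirroring the highlights above (simplified sketch).
    TOTAL_PROMPT_TOKENS = 0
    TOTAL_COMPLETION_TOKENS = 0
    TOKEN_LOCK = asyncio.Lock()


    async def record_usage(prompt_tokens: int, completion_tokens: int) -> None:
        """Add one step's usage to the process-wide totals under the async lock."""
        global TOTAL_PROMPT_TOKENS, TOTAL_COMPLETION_TOKENS
        async with TOKEN_LOCK:
            TOTAL_PROMPT_TOKENS += prompt_tokens
            TOTAL_COMPLETION_TOKENS += completion_tokens
            # Per-step usage, then the cumulative totals so far.
            print(f'[usage] prompt_tokens: {prompt_tokens}, '
                  f'completion_tokens: {completion_tokens}')
            print(f'[usage_total] total_prompt_tokens: {TOTAL_PROMPT_TOKENS}, '
                  f'total_completion_tokens: {TOTAL_COMPLETION_TOKENS}')


    async def main() -> None:
        # Several concurrent "steps" can update the shared totals safely.
        await asyncio.gather(record_usage(120, 30), record_usage(80, 45))

    asyncio.run(main())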


Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page; here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

  • Code Review (/gemini review): Performs a code review for the current pull request in its current state.
  • Pull Request Summary (/gemini summary): Provides a summary of the current pull request in its current state.
  • Comment (@gemini-code-assist): Responds in comments when explicitly tagged, both in pull request comments and review comments.
  • Help (/gemini help): Displays a list of available commands.

Customization

To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by adding files to a .gemini/ folder at the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

You can also get AI-powered code generation, chat, and code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  [1] Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double-check it and use code with caution.

@gemini-code-assist (bot) left a comment


Code Review

This pull request introduces token usage tracking for LLM calls, accumulating prompt and completion tokens at a process level. The implementation in llm_agent.py uses global variables for tracking, which I've recommended refactoring into class attributes on LLMAgent to improve encapsulation and testability. I've provided specific suggestions on how to implement this change. Additionally, I've suggested a minor readability improvement in openai_llm.py for the logic that enables usage data in streaming responses.

Comment on lines +29 to +32
# Current process shared
TOTAL_PROMPT_TOKENS = 0
TOTAL_COMPLETION_TOKENS = 0
TOKEN_LOCK = asyncio.Lock()


Severity: medium

Using global variables for process-wide state can lead to code that is hard to test and maintain. It's better to encapsulate this state within the LLMAgent class itself as class attributes. This clearly associates the state with the agent and avoids polluting the global namespace.

For example, you could define them inside LLMAgent like this:

class LLMAgent(Agent):
    TOTAL_PROMPT_TOKENS = 0
    TOTAL_COMPLETION_TOKENS = 0
    TOKEN_LOCK = asyncio.Lock()
    ...
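To illustrate the testability point, here is a minimal sketch of a pytest fixture that resets the class-level counters between tests. The import path is hypothetical and would need to match where LLMAgent actually lives in this repository.

    import pytest

    from somepackage.llm_agent import LLMAgent  # hypothetical import path


    @pytest.fixture(autouse=True)
    def reset_token_totals():
        # Start every test from a clean slate; these class attributes are
        # shared by all LLMAgent instances in the process.
        LLMAgent.TOTAL_PROMPT_TOKENS = 0
        LLMAgent.TOTAL_COMPLETION_TOKENS = 0
        yield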

Comment on lines +478 to +494
        prompt_tokens = _response_message.prompt_tokens
        completion_tokens = _response_message.completion_tokens

        global TOTAL_PROMPT_TOKENS, TOTAL_COMPLETION_TOKENS, TOKEN_LOCK
        async with TOKEN_LOCK:
            TOTAL_PROMPT_TOKENS += prompt_tokens
            TOTAL_COMPLETION_TOKENS += completion_tokens

        # tokens in the current step
        self.log_output(
            f'[usage] prompt_tokens: {prompt_tokens}, completion_tokens: {completion_tokens}'
        )
        # total tokens for the process so far
        self.log_output(
            f'[usage_total] total_prompt_tokens: {TOTAL_PROMPT_TOKENS}, '
            f'total_completion_tokens: {TOTAL_COMPLETION_TOKENS}')


Severity: medium

Following my suggestion to move the tracking variables to be class attributes of LLMAgent, this block should be updated to access them via the class. This avoids the global statement and makes it clear that you're modifying the shared state of the LLMAgent class.

        prompt_tokens = _response_message.prompt_tokens
        completion_tokens = _response_message.completion_tokens

        # Accumulate into the shared totals
        async with LLMAgent.TOKEN_LOCK:
            LLMAgent.TOTAL_PROMPT_TOKENS += prompt_tokens
            LLMAgent.TOTAL_COMPLETION_TOKENS += completion_tokens

        # tokens in the current step
        self.log_output(
            f'[usage] prompt_tokens: {prompt_tokens}, completion_tokens: {completion_tokens}'
        )
        # total tokens for the process so far
        self.log_output(
            f'[usage_total] total_prompt_tokens: {LLMAgent.TOTAL_PROMPT_TOKENS}, '
            f'total_completion_tokens: {LLMAgent.TOTAL_COMPLETION_TOKENS}')

Comment on lines +135 to +137
if kwargs.get('stream', False) and self.args.get(
        'stream_options', {}).get('include_usage', True):
    kwargs.setdefault('stream_options', {})['include_usage'] = True


Severity: medium

This conditional logic is a bit dense and could be hard to parse. For better readability, you could break it down into a few lines with intermediate variables and a comment explaining the intent.

Suggested change
-if kwargs.get('stream', False) and self.args.get(
-        'stream_options', {}).get('include_usage', True):
-    kwargs.setdefault('stream_options', {})['include_usage'] = True
+is_streaming = kwargs.get('stream', False)
+stream_options_config = self.args.get('stream_options', {})
+# For streaming responses, we should request usage statistics by default,
+# unless it's explicitly disabled in the configuration.
+if is_streaming and stream_options_config.get('include_usage', True):
+    kwargs.setdefault('stream_options', {})['include_usage'] = True
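For background on why include_usage matters for streaming: with the OpenAI Python SDK, setting stream_options={'include_usage': True} makes the API emit one final chunk whose choices list is empty and whose usage field carries the aggregated token counts for the request. Below is a minimal sketch; the client setup, model name, and message are illustrative and not taken from this PR.

    from openai import OpenAI

    client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

    stream = client.chat.completions.create(
        model='gpt-4o-mini',  # illustrative model name
        messages=[{'role': 'user', 'content': 'Hello'}],
        stream=True,
        stream_options={'include_usage': True},
    )

    prompt_tokens = completion_tokens = 0
    for chunk in stream:
        if chunk.choices:
            # Normal content deltas arrive here.
            pass
        if chunk.usage is not None:
            # The final chunk carries the aggregated usage for the request.
            prompt_tokens = chunk.usage.prompt_tokens
            completion_tokens = chunk.usage.completion_tokens

    print(prompt_tokens, completion_tokens)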
