@inoray inoray commented Jan 2, 2026

  • When parsing hwpx, plain text is read in as paragraph items.
  • Paragraph items are now included among the comparison targets when matching against the TOC results.

Checklist:

  • Documentation has been updated, if necessary.
  • Examples have been added, if necessary.
  • Tests have been added, if necessary.

@inoray inoray linked an issue Jan 2, 2026 that may be closed by this pull request
@gemini-code-assist

Summary of Changes

Hello @inoray, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

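This pull request fixes a bug in which processing an HWPX document produced only a single chunk. The cause was that plain text is labeled as paragraph during parsing, while the table-of-contents (TOC) matching logic left paragraph items out of its comparison targets. With this change, items labeled paragraph are included in TOC matching, so multiple chunks in an HWPX document are identified and processed correctly, which improves the accuracy of document enrichment.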

Highlights

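  • HWPX parsing improvement: HWPX file parsing now treats plain text as paragraph.
  • TOC matching logic extended: the TOC matching logic now also includes document items of type paragraph among its comparison targets.
  • Log message improvement: log messages for the apply_toc_enrichment and apply_law_toc_enrichment functions are now clearer.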
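
The effect of the label change can be illustrated with a minimal sketch. The `DocItemLabel` enum and `TextItem` class below are simplified stand-ins defined only for this example, and `toc_candidates` is a hypothetical helper name; none of them is the project's actual code. The sketch assumes an HWPX document whose body text is all labeled paragraph.

```python
from dataclasses import dataclass
from enum import Enum


class DocItemLabel(Enum):
    # Simplified stand-in for the real label enum (assumption).
    TEXT = "text"
    LIST_ITEM = "list_item"
    PAGE_HEADER = "page_header"
    PARAGRAPH = "paragraph"


@dataclass
class TextItem:
    # Simplified stand-in for the real document item class (assumption).
    label: DocItemLabel
    text: str


# Labels considered for TOC matching before and after the change.
OLD_LABELS = {DocItemLabel.TEXT, DocItemLabel.LIST_ITEM, DocItemLabel.PAGE_HEADER}
NEW_LABELS = OLD_LABELS | {DocItemLabel.PARAGRAPH}


def toc_candidates(items, labels):
    """Hypothetical helper: keep only items whose label is eligible for TOC matching."""
    return [item for item in items if isinstance(item, TextItem) and item.label in labels]


# An HWPX-like document in which every body line is labeled PARAGRAPH.
hwpx_items = [
    TextItem(DocItemLabel.PARAGRAPH, "Article 1 (Purpose)"),
    TextItem(DocItemLabel.PARAGRAPH, "Article 2 (Definitions)"),
]

before = toc_candidates(hwpx_items, OLD_LABELS)
after = toc_candidates(hwpx_items, NEW_LABELS)
print(len(before), len(after))  # 0 2
```

With the old label set no item is eligible for matching, which is consistent with the whole document collapsing into one chunk; with PARAGRAPH included, both lines become candidates.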

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

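I reviewed the change that adds the PARAGRAPH label so that TOC matching works correctly for parsed hwpx files. Adding the function name to the log messages to make debugging easier is also a good improvement. Overall, I think this is a good change that improves the correctness and maintainability of the code. I left one suggestion for improving readability as a review comment.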

Comment on lines 945 to 946
  if (isinstance(item, TextItem) or isinstance(item, ListItem))
- and (item.label == DocItemLabel.TEXT or item.label == DocItemLabel.LIST_ITEM or item.label == DocItemLabel.PAGE_HEADER)
+ and (item.label in [DocItemLabel.TEXT, DocItemLabel.LIST_ITEM, DocItemLabel.PAGE_HEADER, DocItemLabel.PARAGRAPH])

medium

To improve readability, I suggest simplifying the isinstance condition. Combining the multiple isinstance calls into a single call that takes a tuple makes the code more concise.

Suggested change
- if (isinstance(item, TextItem) or isinstance(item, ListItem))
- and (item.label == DocItemLabel.TEXT or item.label == DocItemLabel.LIST_ITEM or item.label == DocItemLabel.PAGE_HEADER)
- and (item.label in [DocItemLabel.TEXT, DocItemLabel.LIST_ITEM, DocItemLabel.PAGE_HEADER, DocItemLabel.PARAGRAPH])
+ if isinstance(item, (TextItem, ListItem))
+ and item.label in [DocItemLabel.TEXT, DocItemLabel.LIST_ITEM, DocItemLabel.PAGE_HEADER, DocItemLabel.PARAGRAPH]
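
As a quick check that the suggested form behaves the same as the chained one, the sketch below compares both on a few sample objects. The `TextItem`, `ListItem`, and `TableItem` classes are empty placeholders defined only for this example, not the project's real classes; the only library behavior relied on is that Python's built-in `isinstance` accepts a tuple of types.

```python
class TextItem:
    """Placeholder class for illustration only (assumption)."""


class ListItem:
    """Placeholder class for illustration only (assumption)."""


class TableItem:
    """Placeholder for an item type that should not match (assumption)."""


def is_text_like_chained(item):
    # Original style: two isinstance calls joined with `or`.
    return isinstance(item, TextItem) or isinstance(item, ListItem)


def is_text_like_tuple(item):
    # Suggested style: one isinstance call with a tuple of types.
    return isinstance(item, (TextItem, ListItem))


samples = [TextItem(), ListItem(), TableItem(), "plain string"]
results = [(is_text_like_chained(s), is_text_like_tuple(s)) for s in samples]
print(results)
```

Both predicates agree on every sample, so the suggestion is a readability change only and does not alter which items pass the filter.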

@inoray inoray requested a review from JaeseungYang January 2, 2026 04:46
@inoray inoray merged commit 1d0c217 into develop Jan 2, 2026
2 checks passed
@inoray inoray deleted the 140-hwpx-chunk-1 branch January 2, 2026 04:50
Development

Successfully merging this pull request may close these issues.

hwpx: bug where only one chunk is generated

3 participants