@AmirLayegh
Contributor

@AmirLayegh AmirLayegh commented Dec 18, 2025

Description

Add support for extracting the required field of each property in SchemaFromTextExtractor.

This feature enables the LLM to identify which properties are mandatory (must exist on every node) versus optional (may be absent).

Changes:

  • Updated SchemaExtractionTemplate prompt with Rule 9 (REQUIRED PROPERTIES) to guide the LLM in identifying required vs optional properties
  • Added _filter_properties_required_field() method to sanitize required values from LLM responses (handles boolean, string conversions, and invalid types)
  • Added _enforce_required_for_constraint_properties() method to auto-set required: true for properties with UNIQUENESS constraints
  • Added unit tests for the filtering method
  • Added unit tests for enforcing required on properties that have constraints
  • Added integration tests for the full extraction pipeline
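
As a rough illustration of the sanitizing step described in the list above, here is a minimal sketch. It is not the code from this PR: the function name `filter_required_field`, the accepted string spellings, and the choice to drop invalid values are all assumptions made for the example. Only the general idea (keep booleans, convert strings, reject other types) comes from the description.

```python
from typing import Any, Dict, List

# Assumed spellings an LLM might emit; the PR's actual list may differ.
TRUE_STRINGS = ("true", "yes", "1")
FALSE_STRINGS = ("false", "no", "0")


def filter_required_field(node_types: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Normalize the 'required' key on every property dict (hypothetical helper).

    Mutates the property dicts in place and returns the same list.
    """
    for node_type in node_types:
        for prop in node_type.get("properties", []):
            if "required" not in prop:
                continue
            value = prop["required"]
            if isinstance(value, bool):
                continue  # already a valid boolean
            if isinstance(value, str):
                text = value.strip().lower()
                if text in TRUE_STRINGS:
                    prop["required"] = True
                    continue
                if text in FALSE_STRINGS:
                    prop["required"] = False
                    continue
            # Anything else (lists, unknown strings, ...) is dropped here,
            # so the property falls back to whatever default the schema uses.
            del prop["required"]
    return node_types
```

Dropping an unrecognized value rather than guessing keeps a malformed LLM response from silently marking a property as mandatory.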

Type of Change

  • New feature
  • Bug fix
  • Breaking change
  • Documentation update
  • Project configuration change

Complexity

Complexity: Low

How Has This Been Tested?

  • Unit tests
  • E2E tests
  • Manual tests

Checklist

The following requirements should have been met (depending on the changes in the branch):

  • Documentation has been updated
  • Unit tests have been updated
  • E2E tests have been updated
  • Examples have been updated
  • New files have copyright header
  • CLA (https://neo4j.com/developer/cla/) has been signed
  • CHANGELOG.md updated if appropriate

@AmirLayegh AmirLayegh marked this pull request as ready for review December 18, 2025 14:23
@AmirLayegh AmirLayegh requested a review from a team as a code owner December 18, 2025 14:23
Contributor

@stellasia stellasia left a comment

Some minor comments, but it looks very good.

prop_name = prop.get("name", "unknown")
node_label = node_type.get("label", "unknown")

if isinstance(required_value, str):
Contributor

Here we could stringify the required_value to also cover int values for instance?

Contributor Author

Good catch! I'll stringify it to handle int values.
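
One way the stringify suggestion could look is sketched below. The helper name `coerce_required` and the accepted spellings are assumptions for illustration, not the change that was actually pushed. Converting the value with `str()` before comparing lets integer `1` and `0` follow the same path as the strings "1" and "0".

```python
from typing import Any, Optional


def coerce_required(value: Any) -> Optional[bool]:
    """Return True/False for recognizable values, None otherwise (hypothetical helper)."""
    if isinstance(value, bool):
        return value
    # Stringify first so ints such as 1 and 0 are handled like "1" and "0".
    text = str(value).strip().lower()
    if text in ("true", "yes", "1"):
        return True
    if text in ("false", "no", "0"):
        return False
    return None  # caller decides how to treat an unrecognized value


print(coerce_required(1))       # True
print(coerce_required("False")) # False
print(coerce_required(2))       # None
```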

if label and prop:
    constraint_props.setdefault(label, set()).add(prop)

# Skop node_types without constraints
Contributor

Small typo

Suggested change
- # Skop node_types without constraints
+ # Skip node_types without constraints
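
For context, the enforcement step that this snippet belongs to can be sketched roughly as follows. This is an illustration under assumptions, not the PR's implementation: the function name `enforce_required_for_unique` is hypothetical, and the schema shape (a `constraints` list whose entries carry `type`, `node_type`, and `property_name`, alongside a `node_types` list with `label` and `properties`) is assumed for the example.

```python
from typing import Any, Dict, Set


def enforce_required_for_unique(schema: Dict[str, Any]) -> Dict[str, Any]:
    """Set required=True on properties covered by a UNIQUENESS constraint.

    Hypothetical helper; mutates the schema dict in place and returns it.
    """
    # Map each node label to the property names under a uniqueness constraint.
    constraint_props: Dict[str, Set[str]] = {}
    for constraint in schema.get("constraints", []):
        if constraint.get("type") != "UNIQUENESS":
            continue
        label = constraint.get("node_type")
        prop = constraint.get("property_name")
        if label and prop:
            constraint_props.setdefault(label, set()).add(prop)

    for node_type in schema.get("node_types", []):
        names = constraint_props.get(node_type.get("label"))
        if not names:
            continue  # skip node types without constraints
        for prop in node_type.get("properties", []):
            if prop.get("name") in names:
                prop["required"] = True
    return schema
```

A property that must be unique on every node cannot sensibly be absent, which is why the constraint overrides whatever the LLM reported for `required`.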

3 participants