Skip to content

Unified source handling for all lang/scope combinations with modular sources (Phase 2) #11612

@ascheman

Description

@ascheman

New feature, improvement proposal

Summary

A proposal on Build Sources Validation (originally started by mail on the dev list) was partly implemented as Phase 1 (Module-aware resource handling) by PR #11505.
This issue addresses Phase 2: extending the unified handling approach to all <lang> and <scope> combinations, not just resources.

Additional Motivation

As noted by @desruisseaux in PR #11505 comments:

I thought that what this PR does for resources, the same thing could be done for tests. I'm thinking to refactor the ResourceHandler class in something that does not handle resources in a special way, but instead does the same thing in a uniform way for all combinations of <lang> and <scope>.

Challenge: Multiple Source Directories

Allowing multiple source directories (removing R3, see below) raises questions about data structures and handling:

  1. Multiple directories per (lang, scope, module): Users may legitimately configure multiple <source> directories for the same combination, e.g.:

    <sources>
      <!-- Implicit directory: src/com.example.app/main/java -->
      <source><lang>java</lang><scope>main</scope><module>com.example.app</module></source>
      <!-- Additional generated sources for the same module -->
      <source><lang>java</lang><scope>main</scope><module>com.example.app</module>
        <directory>target/generated-sources/com.example.app/java</directory></source>
    </sources>
  2. Duplicate handling strategy: The <source> element has additional attributes (enabled, targetPath, stringFiltering, includes, excludes, targetVersion). When the same (lang, scope, module, directory) tuple appears multiple times, we need a defined handling strategy:

    <source><lang>java</lang><scope>main</scope><module>com.example</module>
      <directory>src/com.example/main/java</directory>
      <enabled>true</enabled></source>
    <source><lang>java</lang><scope>main</scope><module>com.example</module>
      <directory>src/com.example/main/java</directory>
      <enabled>false</enabled></source>  <!-- How to handle? -->

    Proposed strategy: enabled as key discriminator

    • Use (lang, scope, module, directory) as identity
    • First entry with enabled=true wins and defines the effective configuration
    • Subsequent entries with same identity AND enabled=true trigger a WARNING (alternative: ERROR, since user could simply remove the duplicate)
    • Entries with enabled=false are harmless – they explicitly disable that source
    • Duplicate detection applies only to enabled sources – disabled sources are effectively no-ops for tracking

    Note on current implementation: The current implementation already filters by enabled when retrieving sources via MavenProject.getEnabledSourceRoots(). However, during source processing in DefaultProjectBuilder, all sources are added regardless of enabled status, and the tracking flags (hasMain, hasTest, etc.) don't currently consider enabled. Phase 2 should align the processing logic with the retrieval logic by only tracking enabled sources.

  3. Target-oriented view: Regardless of input structure, all sources eventually compile to target directories:

    • target/classes/<module>/ (main scope)
    • target/test-classes/<module>/ (test scope)

    The compiler/processor handles multiple input directories naturally – they all feed into the same output location.

The appropriate data structure and algorithm to support this should be discussed further before implementation.

Proposed Changes

1. Generalize ResourceHandlingContext to SourceHandlingContext

Replace the current ResourceHandlingContext approach with a flexible, unified source handling mechanism.

Current (Phase 1) - extended boolean flags (Pseudo-code):

// Hardcoded tracking for specific combinations
boolean hasMain = false;
boolean hasTest = false;
boolean hasMainResources = false;
boolean hasTestResources = false;

for (var source : sources) {
    if (lang == JAVA && scope == MAIN) hasMain = true;
    if (lang == JAVA && scope == TEST) hasTest = true;
    if (lang == RESOURCES && scope == MAIN) hasMainResources = true;
    if (lang == RESOURCES && scope == TEST) hasTestResources = true;
}

// Resource-specific handling
context.handleResourceConfiguration(MAIN, hasMainResources);
context.handleResourceConfiguration(TEST, hasTestResources);

Proposed (Phase 2) – Flexible set-based tracking:

// Generic tracking for ANY lang/scope/module/directory combination
record SourceKey(Language lang, ProjectScope scope, String module, Path directory) {}
Set<SourceKey> declaredSources = new HashSet<>();

// We need to handle some special cases, e.g., implicit directory etc.
for (var source : sources) {
    SourceKey key = new SourceKey(lang, scope, module, directory);
    if (source.isEnabled()) {
        if (declaredSources.contains(key)) {
            // Duplicate enabled source - emit WARNING (or ERROR)
            warn("Duplicate enabled source for " + key);
        } else {
            declaredSources.add(key);
        }
    }
    // disabled sources are not tracked - they are explicitly disabled by the user
    // and won't be returned by getEnabledSourceRoots() anyway
}

// Unified handling for ALL source types (java, resources, kotlin, etc.)
for (SourceKey key : declaredSources) {
    // Same logic for java, resources, kotlin, groovy, etc.
    handleSourceConfiguration(key.lang(), key.scope(), key.module(), key.directory());
}

Benefits:

  • Replace hardcoded boolean flags with a flexible set structure
  • Generalize handling logic (currently resource-only) to work for ALL lang/scope combinations
  • Use ONE unified approach instead of special-casing resources
  • Easily extensible for new languages (kotlin, groovy, scala, etc.)

2. Remove Validation Rule R3 (Duplicate Sources)

Per @desruisseaux suggestion in his mailing list reply:

Remove rule R3, i.e., accept duplicated source for scope='X', lang='Y', module='Z'. The compiler plugin 4 already accepts that, the need for multi-source directories in Maven 3 was strong enough to justify an external plugin, and Maven 3 already accepts that for resources if I remember right.

Rationale:

  • Maven Compiler Plugin 4 already supports multiple source directories
  • The build-helper-maven-plugin exists specifically because users needed multi-source support
  • Maven 3 already allows duplicate resource directories

3. Unified Source Directory Handling

Apply the same modular source detection logic from Phase 1 to:

  • Java sources (src/<module>/main/java, src/<module>/test/java)
  • Test sources
  • Other language sources (kotlin, groovy, scala, etc.)

Implementation Notes

Files to Modify

  • impl/maven-core/src/main/java/org/apache/maven/project/ResourceHandlingContext.java

    • Rename/refactor to SourceHandlingContext
    • Generalize handleResourceConfiguration() to handle all source types
    • Add shouldAddSource() for duplicate detection
    • Add hasSources(Language, ProjectScope) to replace boolean flags
  • impl/maven-core/src/main/java/org/apache/maven/project/DefaultProjectBuilder.java

    • Replace hardcoded boolean flags (hasMain, hasTest, etc.) with SourceHandlingContext.hasSources()
    • Use SourceHandlingContext.shouldAddSource() for duplicate detection

No modification necessary (due to dropped R3)

  • impl/maven-impl/src/main/java/org/apache/maven/impl/model/DefaultModelValidator.java
    • R3 validation (duplicate sources detection) was originally proposed but never implemented in the validator
    • Duplicate detection is handled in SourceHandlingContext.shouldAddSource() instead
    • R1, R2, R4 validations remain unchanged

Validation Rules After Phase 2

Rule Description Addresses Status
R1 <sources> conflicts with explicit <sourceDirectory> - Keep
R2 Mixed modular/non-modular sources Problem 1 Keep
R3 Duplicate sources warning - Remove
R4 Filesystem warning for classic directory Problem 4 Keep

Note: Problem 2 (Test-Only Modular Projects) and Problem 3 (Resources Are Not Module-Aware) were addressed by source injection logic in Phase 1.

Acceptance Criteria

# Criterion Description
AC1 Boolean flags eliminated The hardcoded hasMain, hasTest, hasMainResources, hasTestResources, hasScript flags are replaced with flexible set-based tracking via SourceHandlingContext.hasSources()
AC2 Unified source tracking All lang/scope combinations use the same tracking mechanism (not just resources)
AC3 Duplicate detection Duplicate enabled sources with same (lang, scope, module, directory) identity trigger a WARNING
AC4 First enabled wins Only the first enabled source for an identity is added to the project; duplicates are skipped
AC5 Disabled sources unchanged Disabled sources are still added but filtered by getEnabledSourceRoots()

Metadata

Metadata

Assignees

No one assigned

    Labels

    enhancementNew feature or request

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions