Addition sentence info into DocumentAssembler output in LightPipeline #14714

mehmetbutgul · 2025-12-22T13:44:14Z

Description

Added sentence metadata (Map("sentence" -> "0")) to the DocumentAssembler output in LightPipeline.
This ensures that sentence information is consistently present in the annotations produced by LightPipeline.
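
As a rough illustration of the idea (not the code in this PR), the sketch below models a document annotation with a hypothetical `DocAnnotation` case class and a hypothetical helper `withSentenceInfo`; neither name exists in Spark NLP. It assumes the default is applied only when the `sentence` key is absent, which the description does not state explicitly.

```scala
// Hypothetical stand-in for a document annotation; this is NOT Spark NLP's Annotation class.
final case class DocAnnotation(result: String, metadata: Map[String, String])

object SentenceMetadataSketch {
  // Default sentence metadata named in the PR description.
  val DefaultSentenceMetadata: Map[String, String] = Map("sentence" -> "0")

  // Assumption: add the default only when "sentence" is missing.
  // With `++`, keys from the right-hand map win, so existing metadata is preserved.
  def withSentenceInfo(doc: DocAnnotation): DocAnnotation =
    doc.copy(metadata = DefaultSentenceMetadata ++ doc.metadata)
}
```

Under these assumptions, a document produced without sentence metadata would gain the `sentence` key with value `"0"`, while a document that already carries a sentence index keeps its own value.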

Motivation and Context

Previously, transform() / fullAnnotate() and LightPipeline produced different metadata for the same document.
By adding the default sentence metadata in LightPipeline, this change eliminates the inconsistency and guarantees identical metadata across Pipeline and LightPipeline executions.

How Has This Been Tested?

Added a dedicated Scala unit test covering this behavior
Verified that all existing LightPipeline tests pass successfully

Screenshots (if appropriate):

Types of changes

Bug fix (non-breaking change which fixes an issue)
Code improvements with no or little impact
New feature (non-breaking change which adds functionality)
Breaking change (fix or feature that would cause existing functionality to change)

Checklist:

My code follows the code style of this project.
My change requires a change to the documentation.
I have updated the documentation accordingly.
I have read the CONTRIBUTING page.
I have added tests to cover my changes.
All new and existing tests passed.

mehmetbutgul added 2 commits December 22, 2025 16:12

added sentence info to DocumentAssembler metadata in LightPipeline

33fc227

added unit test to check sentence info

087b8bb

mehmetbutgul changed the title ~~2025 12 22 sentence idx in light pipeline~~ Addition sentence info into DocumentAssembler output in LightPipeline Dec 22, 2025

mehmetbutgul assigned DevinTDHa and danilojsl Dec 22, 2025
