simplify workflow history and add page tracking #354

hanna-paasivirta · 2026-01-15T09:34:18Z

Short Description

Supports merging the workflow and job code chats (OpenFn/lightning#4109).

Adds page navigation tracking to both job_chat and workflow_chat services to improve context awareness and auto-refresh documentation when users navigate between different jobs or workflows. When users switch pages (different job, workflow, or adaptor), the conversation history shows page context via prefixes like [pg:job_code/Transform data/language-common], and job_chat automatically refreshes RAG documentation to ensure relevant adaptor docs are retrieved.

Also updates the model from claude-sonnet-4-20250514 to claude-sonnet-4-5-20250929.

Fixes #350

Implementation Details

Page Context Construction

Services now construct page metadata from the context payload:
- job_chat: Extracts current page_name and adaptor from context.page_name and context.adaptor, sets type to "job_code"
- workflow_chat: Extracts current page_name from context.page_name, sets type to "workflow"
- Both services handle missing page data gracefully; page information is optional.
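
A minimal sketch of the construction described above, assuming the context arrives as an optional dict. The name `build_current_page` and the `include_adaptor` flag are hypothetical and not taken from the PR; only the `page_name` and `adaptor` context keys and the `type` values come from the description.

```python
from typing import Optional


def build_current_page(
    context: Optional[dict], page_type: str, include_adaptor: bool = False
) -> dict:
    """Build page metadata from an optional context payload (hypothetical helper)."""
    context = context or {}
    # The type is always known: it depends on the service, not on the payload.
    page = {"type": page_type}
    if context.get("page_name"):
        page["name"] = context["page_name"]
    # Only job_chat tracks the adaptor, and only when the context provides one.
    if include_adaptor and context.get("adaptor"):
        page["adaptor"] = context["adaptor"]
    return page
```

With no context at all, the result is just `{"type": ...}`, so downstream code never has to special-case a missing page.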

History Prefixing

The add_page_prefix() helper prefixes user messages in history with the format [pg:type/name/adaptor] or [pg:type/name]:
- Example: [pg:job_code/Get FHIR data/language-fhir@4.6.10] (job_chat with adaptor)
- Example: [pg:workflow/fridge-statistics-processing] (workflow_chat)
- The helper handles partial data gracefully: it builds the prefix from the available fields only.
- The model receives the prefixed content in the message history for better context awareness.
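
As a rough illustration of the prefix format, the helper could look something like the sketch below. This is not the code in the PR: the signature, the single space between prefix and content, and the behaviour when no page is given are all assumptions.

```python
from typing import Optional


def add_page_prefix(content: str, page: Optional[dict]) -> str:
    """Prefix a user message with [pg:type/name/adaptor], using available fields only."""
    if not page:
        return content
    # Keep the documented order and drop any field that is missing or empty.
    parts = [page.get(key) for key in ("type", "name", "adaptor")]
    parts = [part for part in parts if part]
    if not parts:
        return content
    # Assumption: one space separates the prefix from the original message.
    return f"[pg:{'/'.join(parts)}] {content}"
```

Called with the job_chat example above, this yields `[pg:job_code/Get FHIR data/language-fhir@4.6.10]` followed by the user's message.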

Simplified History Format

- Both services now store only text responses in history (workflow_chat no longer stores nested JSON structures showing previously edited YAML).
- Example assistant message: "I'll add error handling..." instead of {"text": "I'll add error handling...", "yaml": "..."}
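
For illustration, reducing a response to a text-only history entry might look like this. `to_history_message` is a hypothetical name, not a function from the PR; the `text` and `yaml` keys come from the example above.

```python
def to_history_message(response: dict) -> dict:
    """Keep only the text of an assistant response for history (hypothetical helper)."""
    # Any generated YAML is deliberately left out of the stored history.
    return {"role": "assistant", "content": response.get("text", "")}
```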

Navigation Detection & RAG Auto-refresh

- Added a has_navigated() helper to job_chat to detect page changes.
- It compares current_page (constructed from context) to the prefix of the last user turn in the conversation.
- Navigation is detected when the type, name, OR adaptor changes.
- When navigation is detected, job_chat automatically sets refresh_rag=true to fetch fresh documentation for the new page/adaptor.
- This prevents stale documentation when switching between adaptors, adaptor versions or jobs.
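
The comparison above can be sketched as follows. The two-argument signature, the `resolve_refresh_rag` wrapper, and the choice to treat a missing previous page as "no navigation" are assumptions made for this sketch; the actual helper reads the previous page from the last user turn.

```python
from typing import Optional

PAGE_KEYS = ("type", "name", "adaptor")


def has_navigated(current_page: dict, last_page: Optional[dict]) -> bool:
    """Return True when type, name, or adaptor differs from the previous page."""
    if not last_page:
        # Assumption: with no previous page there is nothing to compare against.
        return False
    return any(current_page.get(key) != last_page.get(key) for key in PAGE_KEYS)


def resolve_refresh_rag(
    requested: bool, current_page: dict, last_page: Optional[dict]
) -> bool:
    """Refresh when the caller asked for it or when the page changed (hypothetical wrapper)."""
    return requested or has_navigated(current_page, last_page)
```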

Backward Compatibility

- job_chat always includes the meta key with rag data (maintains existing behaviour).
- workflow_chat includes meta only when page data is present (not a breaking change, as meta was never returned before).
- Added a context field to the workflow_chat Payload (previously only in job_chat).
- All page-related fields are optional; the services work normally without them.

Prompt Changes

The prompts now explain the prefixes, attachment redaction, and that the user may have navigated between jobs/workflows.

Tests Added

Basic tests to check conversation flow and that prefixes are added. I struggled to draft a good test to check that the model won't confuse different jobs/workflows and is aware of the context changing.

Anthropic's structured outputs NOT used

I tried Anthropic's structured outputs beta feature. It did not work as expected (answers kept getting stuck in loops), so I removed it. See "Use Anthropic structured outputs" (#310).

Requests for Lightning:

workflow_chat service calls: Add a context key in workflow_chat calls, with a current_page key. This should just be the name of the workflow/step, like 'fetch-fhir-data'.
job_chat service calls: Similarly, add a current_page key to the context key

AI Usage

Please disclose how you've used AI in this work (it's cool, we just want to know!):

You can read more details in our Responsible AI Policy

josephjclark · 2026-01-15T10:21:14Z

services/job_chat/job_chat.py

+        return content
+
+    prefix_parts = [page.get('type', ''), page.get('name', '')]
+    if page.get('adaptor'):


Put page.get('name') into an if as well.

We should know the page type because in job chat, the page type is always job chat

hanna-paasivirta · 2026-01-20T15:55:17Z

Just added a final change to make sure the prefix is always added for consistency.

josephjclark

Ok, a few observations. The review is a bit scattered so here are the important bits I think:

- Having looked closely and seen the code, I'm not sold on the prefix. I think we should just save { url, meta } keys on the chat history.
- I don't think Lightning should send us last_page; it should just send us the current_page. And I think it's easiest for everyone if it just sends the URL.
- When re-running rag, we're starting to conflate "run rag because we've changed agents" with "re-run rag because the adaptor changed within a session".

josephjclark · 2026-01-21T13:48:49Z

services/job_chat/job_chat.py

 @dataclass
 class ChatConfig:
-    model: str = "claude-sonnet-4-20250514"
+    model: str = "claude-sonnet-4-5-20250929"


I'd like any model changes to be surfaced in PRs and release notes. I know it's not likely to change much but I want really clear disclosure on it

josephjclark · 2026-01-21T13:50:41Z

services/job_chat/job_chat.py


+        # Construct current_page from context
+        # Always create page with type, extract name/adaptor from context if available
+        page_name = data.context.get("page_name") if data.context else None


Can we make sure that data.context gets set to an empty dict early on? Would remove a lot of complexity from this code
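
Something like the sketch below would do it, assuming the payload is a dataclass. The `Payload` here is a hypothetical stand-in with only two fields, not the real class from the service.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Payload:
    """Hypothetical stand-in for the service payload."""

    content: str
    context: Optional[dict] = None

    def __post_init__(self) -> None:
        # Normalise once so later code can call .get() without a None check.
        if self.context is None:
            self.context = {}


data = Payload(content="how do I get data")
page_name = data.context.get("page_name")  # no "if data.context else None" needed
```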

josephjclark · 2026-01-21T13:52:20Z

services/job_chat/job_chat.py

+        }
+        # Only add adaptor if we successfully extracted it
+        if adaptor_short_name:
+            current_page["adaptor"] = adaptor_short_name


The adaptor_short_name var isn't really needed. If we declare current_page a bit earlier, we can just assign the adaptor key at the end of the try block - instead of assigning the var, just assign the key. Leaves you with shorter, cleaner code
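
Roughly what I mean, as a sketch only. The surrounding code is not shown in this thread, so `build_job_page` and the parsing rule inside the try block (keep the part after the last "/") are assumptions for illustration.

```python
def build_job_page(context: dict) -> dict:
    """Declare current_page first, then assign the adaptor key inside the try block."""
    current_page = {"type": "job_code"}
    if context.get("page_name"):
        current_page["name"] = context["page_name"]
    try:
        # Assumed parsing rule: drop any scope before the last "/".
        # A missing or non-string adaptor raises here and is skipped.
        current_page["adaptor"] = context["adaptor"].split("/")[-1]
    except (KeyError, AttributeError):
        pass
    return current_page
```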

josephjclark · 2026-01-21T14:52:21Z

services/job_chat/tests/test_qualitative.py

+    }
+
+    # Meta shows navigation happened
+    meta = {


I don't think meta should show the last page - just the current one. The last page is implied in the history.

I also want to be careful about what assumptions we're making in the last page data. I think the app should just send us a URL really?

josephjclark · 2026-01-21T14:59:14Z

services/job_chat/tests/test_qualitative.py

+    assert not any(word in response_text for word in ["workflow", "yaml", "trigger", "edge"]), \
+        "Response should be about job code, not workflow structure"
+
+    print("\n✓ Navigation test passed: Model correctly inferred navigation from workflow to job editor")


What about this test tells us that the history was correctly interpreted? Would we have had a different/worse result without rag?

josephjclark · 2026-01-21T15:02:28Z

services/job_chat/tests/test_qualitative.py

+    assert response["suggested_code"] is not None, "Model should have generated code for the job"
+
+    # Verify logging was added to the code
+    assert "console.log" in response["suggested_code"], "Log statement not found in suggested code"


The interesting thing about this example is that I would prefer to see the common log function be used, rather than console.log.

If anything I'd take that as evidence that rag has NOT run - it's not picking up the dedicated log function.

Or maybe it's just AI being AI. Not damning feedback, but also not a resounding success for me

So we'll just try and trick the model:

- ask a question like "how do I get data"
- expect the model to refer to the original adaptor, e.g. salesforce, in the response
- change page and ask the same question again
- the response should now name-drop the new adaptor, e.g. dhis2
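
A sketch of how that check could be shaped, with the service call injected so it can run offline. `check_adaptor_switch`, the `ask(question, adaptor)` callable, and `fake_ask` are all hypothetical; a real test would call job_chat with history and a changed page, then inspect the model's text.

```python
from typing import Callable


def check_adaptor_switch(
    ask: Callable[[str, str], str], question: str, first: str, second: str
) -> bool:
    """Ask the same question on two pages; each answer should name that page's adaptor."""
    first_answer = ask(question, first).lower()
    second_answer = ask(question, second).lower()
    return first in first_answer and second in second_answer


def fake_ask(question: str, adaptor: str) -> str:
    """Stand-in for the real service call, used only so the sketch runs offline."""
    return f"Use the {adaptor} adaptor to fetch the data."


passed = check_adaptor_switch(fake_ask, "how do I get data", "salesforce", "dhis2")
```

A model that keeps answering about the first adaptor after the page change would fail the second condition, which is the signal we are after.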

josephjclark · 2026-01-22T10:49:28Z

services/job_chat/job_chat.py

+
            updated_history = history + [
-                {"role": "user", "content": content},
+                {"role": "user", "content": prefixed_content},


Ok, so we'll add more metadata to the history objects here. Don't worry about tidying up or bloating history for now (we should only be adding a very small amount of data)

For now add the page prefix (as page?). Maybe adaptor stuff but that can be handled later

hanna-paasivirta · 2026-01-22T23:24:18Z

The Anthropic messages object in history has a strict format with a set of standard keys that don't include custom metadata. I have removed the last_page key from meta, and instead implemented fetching the last page info from the last user turn in history with string matching.
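
For reference, the string matching is along the lines of the sketch below. It is a simplified illustration rather than the committed code: `last_page_from_history` and the regular expression are assumptions, and it only handles plain-string message content.

```python
import re
from typing import Optional

# Matches a leading "[pg:...]" prefix and captures what is inside it.
PREFIX_PATTERN = re.compile(r"^\[pg:([^\]]+)\]")


def last_page_from_history(history: list) -> Optional[dict]:
    """Recover page info from the prefix of the most recent user turn, if any."""
    for message in reversed(history):
        if message.get("role") != "user":
            continue
        content = message.get("content")
        if not isinstance(content, str):
            return None
        match = PREFIX_PATTERN.match(content)
        if not match:
            return None
        # At most three fields; anything after the second "/" stays in adaptor.
        parts = match.group(1).split("/", 2)
        return dict(zip(("type", "name", "adaptor"), parts))
    return None
```

One limitation of this approach: a page name that itself contains "/" cannot be told apart from a name followed by an adaptor.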

simplify workflow history and add page tracking

ce9788d

josephjclark reviewed Jan 15, 2026


hanna-paasivirta added 6 commits January 15, 2026 18:44

construct current page data

372209a

add structured outputs

bded49e

handle missing data

6166dad

remove structured outputs

3eba2ff

tweak prompt and add tests

a6f23a4

simplify test

4dd0a3f

hanna-paasivirta mentioned this pull request Jan 20, 2026

Use Anthropic structured outputs #310

Open

hanna-paasivirta marked this pull request as ready for review January 20, 2026 09:58

hanna-paasivirta added 4 commits January 20, 2026 13:38

Merge branch 'main' into combined-assistants

186f467

simplify test

5ca3eab

add changeset

dc73aa1

always add page prefix regardless of missing info

175104a

josephjclark reviewed Jan 21, 2026


josephjclark reviewed Jan 22, 2026


get last turn from history

ec8b562
