RFC-001: Canonical Schema Layer — HR-Standard-Aligned Lingua Franca

Summary

Adopt a canonical schema layer at backend/domains/canonical/, aligned with HR Open 4.5, Schema.org JobPosting, O*NET/SOC, ESCO, and the Merge / Finch / Apideck common models. Internal Flux models (backend/domains/hiring/schemas.py) project to and from canonical; legacy systems and external integrations are coded against canonical, never against internal Flux schemas. The canonical layer is the external contract of the platform.

Motivation

Flux integrates with many external systems: legacy ATS (JazzHR, Lever, Jobvite), HRIS / payroll (via Merge / Finch), job boards (Indeed, JobGet, VONQ), and customer-specific HRIS instances (Paychex). Coding integrations against internal Flux models couples external partners to our internal evolution and makes it impossible to support an industry-standard contract. The PRD-driving constraint is that AI agents must be able to discover and validate integration contracts autonomously — meaning the contract has to be machine-readable, standards-aligned, and stable. Internal Flux models churn (new fields, renamed enums) at a rate incompatible with that requirement.

Detailed Design

Architecture

External / Legacy Systems
      ↑↓
┌───────────────────────────────────────────────┐
│  Canonical Schema Layer                       │
│  backend/domains/canonical/                   │
│                                               │
│   schemas/         9 canonical entities       │
│   crosswalks/      enum mappings ↔ standards  │
│   projections/                                │
│     flux_*         internal Flux ↔ canonical  │
│     external/      Schema.org / Merge / Finch │
│     legacy/        per-vendor projections     │
│   api/             REST endpoints             │
│   tools/           LangChain MCP tools        │
│   external_specs/  cached vendor OpenAPI      │
└───────────────────────────────────────────────┘
      ↑↓
Internal Flux Domain Models
backend/domains/hiring/schemas.py

The 9 canonical entities are: Job, Candidate, Application, Interview, Offer, Employee, Organization, Skill, Compensation. Every canonical model carries provenance: source_system + source_id.
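A minimal sketch of the provenance pair, using a stdlib dataclass so that it runs without dependencies (the canonical models themselves are Pydantic v2). The class name Provenance, the key() helper, and the "jazzhr" identifier string are assumptions made for illustration; only the entity names and the source_system / source_id fields come from this RFC.

```python
from dataclasses import dataclass

# The nine canonical entity names listed in this RFC.
CANONICAL_ENTITIES = (
    "Job", "Candidate", "Application", "Interview", "Offer",
    "Employee", "Organization", "Skill", "Compensation",
)


@dataclass(frozen=True)
class Provenance:
    """Hypothetical holder for the two provenance fields."""

    source_system: str  # e.g. "jazzhr" (identifier format is an assumption)
    source_id: str      # the record's id inside that source system

    def __post_init__(self) -> None:
        # Both fields are required on every canonical record.
        if not self.source_system or not self.source_id:
            raise ValueError("source_system and source_id are both required")

    def key(self) -> str:
        # Hypothetical helper: a stable lookup key for de-duplication.
        return f"{self.source_system}:{self.source_id}"


origin = Provenance(source_system="jazzhr", source_id="123")
```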

Data Model

Canonical models are Pydantic v2. They include:

Standards alignment fields: fields are named to match HR Open 4.5 / Schema.org wherever possible (e.g., Job.employmentType aligned with Schema.org’s JobPosting.employmentType)
Crosswalk-backed enums: enum values map to standard codes via canonical/crosswalks/ tables (O*NET/SOC, ESCO, ISO country/currency)
Provenance: every record carries source_system and source_id, identifying where it came from
Optional Flux-extension namespace: a _flux dict for fields specific to Flux that don’t fit a standard
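The four properties above can be sketched together as follows. This is an illustration under stated assumptions, not the actual models: it uses stdlib dataclasses instead of Pydantic v2, the enum members and the vendor labels in the crosswalk table are hypothetical, and CanonicalJob carries only a few fields. One detail to keep in mind for the real models: Pydantic v2 treats attribute names with a leading underscore as private attributes rather than fields, so a `_flux` key would likely need a field alias. The sketch stores the data under `flux` and writes it out as `_flux`.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Any


class EmploymentType(str, Enum):
    # Illustrative subset of values; the real enum lives in the canonical layer.
    FULL_TIME = "FULL_TIME"
    PART_TIME = "PART_TIME"
    CONTRACTOR = "CONTRACTOR"


# Hypothetical crosswalk table: normalized vendor label -> canonical value.
LEGACY_EMPLOYMENT_TYPE = {
    "full-time": EmploymentType.FULL_TIME,
    "part-time": EmploymentType.PART_TIME,
    "contract": EmploymentType.CONTRACTOR,
}


def to_canonical_employment_type(raw: str) -> EmploymentType:
    """Map a vendor label to the canonical enum, failing loudly on gaps."""
    try:
        return LEGACY_EMPLOYMENT_TYPE[raw.strip().lower()]
    except KeyError:
        raise ValueError(f"no crosswalk entry for employment type {raw!r}") from None


@dataclass
class CanonicalJob:
    title: str
    employmentType: EmploymentType  # named to match Schema.org JobPosting
    source_system: str              # provenance
    source_id: str                  # provenance
    flux: dict[str, Any] = field(default_factory=dict)  # serialized as "_flux"

    def to_dict(self) -> dict[str, Any]:
        out: dict[str, Any] = {
            "title": self.title,
            "employmentType": self.employmentType.value,
            "source_system": self.source_system,
            "source_id": self.source_id,
        }
        if self.flux:
            # Only emit the extension namespace when it carries data.
            out["_flux"] = dict(self.flux)
        return out


job = CanonicalJob(
    title="Line Cook",
    employmentType=to_canonical_employment_type("Full-Time"),
    source_system="jazzhr",
    source_id="j-1",
    flux={"urgency": "high"},
)
```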

API Changes

New REST endpoints under /api/canonical/ (paths below are relative to /api):

GET /canonical/schemas/{entity}: JSON Schema for the canonical entity
GET /canonical/integrations: registry of integration contracts
POST /canonical/projections/validate: validate an external payload against a canonical entity
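To make the validate endpoint concrete, here is a rough sketch of the kind of check it could perform. Everything in it is assumed for illustration: the function name validate_projection, the REQUIRED_FIELDS registry, and the {"valid", "errors"} response shape are not defined by this RFC, and a real implementation would validate against the canonical models rather than a hand-written field list.

```python
from typing import Any

# Hypothetical required-field registry, standing in for the canonical models.
REQUIRED_FIELDS: dict[str, tuple[str, ...]] = {
    "Job": ("title", "source_system", "source_id"),
}


def validate_projection(entity: str, payload: dict[str, Any]) -> dict[str, Any]:
    """Return an assumed response body: a validity flag plus error strings."""
    if entity not in REQUIRED_FIELDS:
        return {"valid": False, "errors": [f"unknown entity: {entity}"]}
    errors = [
        f"missing field: {name}"
        for name in REQUIRED_FIELDS[entity]
        if name not in payload
    ]
    return {"valid": not errors, "errors": errors}


result = validate_projection("Job", {"title": "Line Cook", "source_system": "jazzhr"})
```

Returning every error at once, rather than stopping at the first, is a deliberate choice in this sketch: an agent validating a contract can then repair a payload in one pass.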

Security Considerations

Canonical projection adapters must enforce tenant isolation: projecting a record for tenant A must never surface tenant B’s fields. The LegacyProjection protocol therefore takes the tenant context as an explicit parameter.
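One way the explicit tenant parameter might look, sketched with typing.Protocol. Only the name LegacyProjection and the idea of an explicit tenant context come from this RFC; the method name to_canonical, the TenantContext and TenantMismatchError types, and the record shape (including its tenant_id key) are assumptions for illustration and do not describe any vendor's actual payload.

```python
from dataclasses import dataclass
from typing import Any, Protocol


@dataclass(frozen=True)
class TenantContext:
    """Hypothetical tenant context passed explicitly to every projection."""
    tenant_id: str


class TenantMismatchError(Exception):
    """Raised when a record does not belong to the requesting tenant."""


class LegacyProjection(Protocol):
    # Assumed signature: the tenant is a required argument, never ambient state.
    def to_canonical(
        self, record: dict[str, Any], tenant: TenantContext
    ) -> dict[str, Any]: ...


class ExampleJobProjection:
    """Hypothetical adapter for an imaginary legacy ATS."""

    source_system = "example_ats"

    def to_canonical(
        self, record: dict[str, Any], tenant: TenantContext
    ) -> dict[str, Any]:
        # Refuse to project anything owned by a different tenant.
        if record.get("tenant_id") != tenant.tenant_id:
            raise TenantMismatchError("record does not belong to this tenant")
        return {
            "title": record["title"],
            "source_system": self.source_system,
            "source_id": str(record["id"]),
        }


projection: LegacyProjection = ExampleJobProjection()
canonical = projection.to_canonical(
    {"id": 7, "title": "Line Cook", "tenant_id": "tenant-a"},
    TenantContext("tenant-a"),
)
```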

Performance Considerations

Projection overhead is bounded: each projection is O(n) in the number of fields of the entity. For high-volume paths (job posting fanout), projections are computed once per record per channel and cached for the duration of the workflow.
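The "once per record per channel" rule can be sketched as a small cache whose lifetime matches the workflow. The class and method names are hypothetical, and the sketch assumes one cache instance per workflow run, with each run belonging to a single tenant, so the key does not need a tenant component.

```python
from typing import Any, Callable


class WorkflowProjectionCache:
    """Hypothetical per-workflow cache keyed by (record_id, channel)."""

    def __init__(self) -> None:
        self._cache: dict[tuple[str, str], Any] = {}
        self.misses = 0  # number of times a projection was actually computed

    def get_or_compute(
        self, record_id: str, channel: str, compute: Callable[[], Any]
    ) -> Any:
        key = (record_id, channel)
        if key not in self._cache:
            self.misses += 1
            self._cache[key] = compute()
        return self._cache[key]


cache = WorkflowProjectionCache()  # created at workflow start, dropped at the end


def project_for_channel() -> dict[str, str]:
    # Stand-in for a real projection call.
    return {"title": "Line Cook"}


first = cache.get_or_compute("job-1", "indeed", project_for_channel)
again = cache.get_or_compute("job-1", "indeed", project_for_channel)
other = cache.get_or_compute("job-1", "vonq", project_for_channel)
```

Discarding the cache object when the workflow ends avoids needing any invalidation logic, at the cost of recomputing projections in the next run.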

Alternatives Considered

Alternative	Pros	Cons	Why Rejected
Code integrations directly against internal Flux models	Simple, no projection overhead	Couples external partners to internal evolution; cannot adopt industry standards; AI agents have no stable contract to ground against	Violates the AI-discoverability constraint
Adopt one external standard wholesale (e.g., HR Open as our internal model)	Single schema, no projection	HR Open has fields Flux doesn’t need, missing fields Flux does need; standards evolve slower than the product	Internal velocity dies; we’d be perpetually waiting for standards bodies
Per-vendor adapters with no canonical layer	Each adapter is small	N² adapter explosion as integrations grow; no shared semantics; no AI-discoverable contract	Doesn’t scale past ~10 integrations
Generate canonical schemas from external specs (auto)	Less manual schema authoring	External specs vary in quality; generated schemas would be unstable; no opportunity to enforce Flux conventions	Loses schema design control

Migration Strategy

The canonical layer is additive. No breaking changes to internal Flux models. Migration:

Land canonical schemas + projection protocols (this RFC)
New integrations always go through canonical
Existing JazzHR / Lever / Jobvite integrations gradually migrated to LegacyProjection
Internal Flux model evolution continues independently — projection adapters absorb the change

Validation Plan

All 9 canonical entities defined in canonical/schemas/
At least one external projection per direction (Schema.org JobPosting, Merge candidate)
At least one legacy projection (JazzHR job)
Internal Flux Job ↔ canonical Job projection round-trips (no data loss)
AI tool query_canonical_schema returns valid JSON Schema for each entity
Integration contract registry is queryable via MCP
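The round-trip criterion above could be checked along these lines. This is a sketch under assumptions: the internal Flux field names (job_title, emp_type, urgency), the FIELD_MAP table, and both function names are hypothetical, and ids are assumed to already be strings. Fields with no canonical counterpart travel in the _flux namespace, which is what keeps the round trip lossless.

```python
from typing import Any

# Hypothetical mapping: internal Flux field name -> canonical field name.
FIELD_MAP = {"job_title": "title", "emp_type": "employmentType"}


def flux_to_canonical(flux_job: dict[str, Any]) -> dict[str, Any]:
    canonical: dict[str, Any] = {
        "source_system": "flux",
        "source_id": flux_job["id"],
    }
    extras: dict[str, Any] = {}
    for name, value in flux_job.items():
        if name == "id":
            continue  # already carried as source_id
        if name in FIELD_MAP:
            canonical[FIELD_MAP[name]] = value
        else:
            extras[name] = value  # no canonical home, keep it under _flux
    if extras:
        canonical["_flux"] = extras
    return canonical


def canonical_to_flux(canonical: dict[str, Any]) -> dict[str, Any]:
    reverse = {target: source for source, target in FIELD_MAP.items()}
    flux_job: dict[str, Any] = {"id": canonical["source_id"]}
    for name, value in canonical.items():
        if name in reverse:
            flux_job[reverse[name]] = value
    flux_job.update(canonical.get("_flux", {}))
    return flux_job


original = {
    "id": "42",
    "job_title": "Line Cook",
    "emp_type": "FULL_TIME",
    "urgency": "high",
}
round_tripped = canonical_to_flux(flux_to_canonical(original))
assert round_tripped == original, "projection lost or altered data"
```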

Risks

Risk	Likelihood	Impact	Mitigation
Canonical schemas drift from internal Flux models	Medium	Medium	Round-trip test gate in CI; projection adapter is the sync point
External standards evolve and break compatibility	Low	Medium	Pin to a major version; bump cycles tracked as RFCs
Projection performance degrades on high-volume paths	Low	Medium	Benchmark per-entity; cache per-workflow; fall back to direct internal model in non-external paths

Open Questions

#	Question	Owner	Target Date	Resolution
1	Should canonical models be exposed via GraphQL in addition to REST?	@pj	2026-06-01	Defer until a customer asks

Decision

Accepted by @pj on 2026-04-18 as the foundation for all external integrations and legacy system projections going forward. New integration cycles cite this RFC in source_rfcs. No new integration code may bypass the canonical layer without a superseding RFC.

This RFC established the canonical schema layer pattern. All integration cycles inherit this contract.
