RFC-001: Canonical Schema Layer — HR-Standard-Aligned Lingua Franca
Summary
Adopt a canonical schema layer at backend/domains/canonical/ aligned with HR Open 4.5, Schema.org JobPosting, O*NET/SOC, ESCO, and the Merge / Finch / Apideck common models. Internal Flux models (backend/domains/hiring/schemas.py) project to and from canonical; legacy systems and external integrations are coded against canonical, never against internal Flux schemas. The canonical layer is the external contract of the platform.
Motivation
Flux integrates with many external systems: legacy ATS (JazzHR, Lever, Jobvite), HRIS / payroll (via Merge / Finch), job boards (Indeed, JobGet, VONQ), and customer-specific HRIS instances (Paychex). Coding integrations against internal Flux models couples external partners to our internal evolution and makes it impossible to support an industry-standard contract. The PRD-driving constraint is that AI agents must be able to discover and validate integration contracts autonomously, which means the contract has to be machine-readable, standards-aligned, and stable. Internal Flux models churn (new fields, renamed enums) at a rate incompatible with that requirement.
Detailed Design
Architecture
The canonical layer defines nine entities: Job, Candidate, Application, Interview, Offer, Employee, Organization, Skill, Compensation.
Every canonical model carries provenance: source_system + source_id.
Data Model
Canonical models are Pydantic v2. They include:
- Standards alignment fields: fields named to match HR Open 4.5 / Schema.org wherever possible (e.g., Job.employmentType aligned with Schema.org’s JobPosting.employmentType)
- Crosswalk-backed enums: values map to standard codes via canonical/crosswalks/ tables (O*NET/SOC, ESCO, ISO country/currency)
- Provenance: every record records where it came from
- Optional Flux-extension namespace: _flux: dict for fields specific to Flux that don’t fit a standard
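A minimal sketch of how these four properties might sit together on one entity. Standard-library dataclasses stand in for Pydantic v2 here so the example runs without dependencies; CanonicalJob, Provenance, EmploymentType, SOC_CROSSWALK, and soc_code_for are hypothetical names, and the enum values and crosswalk rows are illustrative rather than taken from the real canonical/crosswalks/ tables.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Any, Dict, Optional


class EmploymentType(str, Enum):
    """Illustrative subset of employment types (values are assumptions)."""
    FULL_TIME = "FULL_TIME"
    PART_TIME = "PART_TIME"
    CONTRACTOR = "CONTRACTOR"


@dataclass(frozen=True)
class Provenance:
    """Where a canonical record came from."""
    source_system: str
    source_id: str


# Hypothetical crosswalk table: internal role key -> SOC code.
# Rows are illustrative; check them against the published SOC tables.
SOC_CROSSWALK: Dict[str, str] = {
    "registered_nurse": "29-1141",
    "software_developer": "15-1252",
}


def soc_code_for(role_key: str) -> Optional[str]:
    """Return the SOC code for a role key, or None when no mapping exists."""
    return SOC_CROSSWALK.get(role_key)


@dataclass
class CanonicalJob:
    title: str
    employmentType: EmploymentType  # named to match Schema.org JobPosting
    provenance: Provenance          # source_system + source_id
    occupation_soc: Optional[str] = None
    _flux: Dict[str, Any] = field(default_factory=dict)  # Flux-only extensions


job = CanonicalJob(
    title="Registered Nurse",
    employmentType=EmploymentType.FULL_TIME,
    provenance=Provenance(source_system="jazzhr", source_id="job_123"),
    occupation_soc=soc_code_for("registered_nurse"),
    _flux={"pipeline_stage_count": 4},
)
```

One thing to watch when porting this to Pydantic v2: attributes whose names start with an underscore are treated as private attributes rather than fields, so the real model would probably need a public field name with an alias of _flux.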
API Changes
New REST endpoints under /api/canonical/:
- GET /canonical/schemas/{entity}: JSONSchema for the canonical entity
- GET /canonical/integrations: registry of integration contracts
- POST /canonical/projections/validate: validate an external payload against a canonical entity
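To illustrate the validate endpoint, here is a rough sketch of the check it might run once the request has been parsed. The RFC does not specify the request or response shape, so the REQUIRED_FIELDS registry, the validate_projection function, and the {"valid", "errors"} result are all assumptions; a real implementation would more likely validate against the full JSONSchema served by the schemas endpoint.

```python
from typing import Any, Dict, List

# Hypothetical, heavily reduced registry: entity -> required field -> expected type.
REQUIRED_FIELDS: Dict[str, Dict[str, type]] = {
    "job": {
        "title": str,
        "employmentType": str,
        "source_system": str,
        "source_id": str,
    },
}


def validate_projection(entity: str, payload: Dict[str, Any]) -> Dict[str, Any]:
    """Check an external payload against the required fields of a canonical entity."""
    spec = REQUIRED_FIELDS.get(entity)
    if spec is None:
        return {"valid": False, "errors": [f"unknown entity: {entity}"]}

    errors: List[str] = []
    for name, expected in spec.items():
        if name not in payload:
            errors.append(f"missing field: {name}")
        elif not isinstance(payload[name], expected):
            errors.append(f"wrong type for {name}: expected {expected.__name__}")
    return {"valid": not errors, "errors": errors}


result = validate_projection("job", {"title": "Line Cook", "employmentType": 3})
```

Returning every error rather than stopping at the first gives an agent or integrator enough to fix a payload in one pass.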
Security Considerations
Canonical projection adapters must enforce tenant isolation: projecting a record for tenant A must never accidentally surface tenant B’s fields. The LegacyProjection protocol takes a tenant context as an explicit parameter.
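One way the explicit tenant parameter could look, sketched with typing.Protocol. Only the name LegacyProjection comes from this RFC; TenantContext, TenantMismatchError, the to_canonical method, JazzHRJobProjection, and the record field names are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Any, Dict, Protocol


@dataclass(frozen=True)
class TenantContext:
    tenant_id: str


class TenantMismatchError(Exception):
    """Raised when a record does not belong to the tenant in context."""


class LegacyProjection(Protocol):
    # The tenant is a required argument, so a caller cannot project without one.
    def to_canonical(
        self, tenant: TenantContext, record: Dict[str, Any]
    ) -> Dict[str, Any]: ...


class JazzHRJobProjection:
    """Hypothetical adapter; the legacy field names are assumed."""

    def to_canonical(
        self, tenant: TenantContext, record: Dict[str, Any]
    ) -> Dict[str, Any]:
        if record.get("tenant_id") != tenant.tenant_id:
            raise TenantMismatchError(
                f"record {record.get('id')!r} is not owned by {tenant.tenant_id!r}"
            )
        # Copy an explicit allowlist of fields so nothing else leaks through.
        return {
            "title": record["title"],
            "source_system": "jazzhr",
            "source_id": record["id"],
        }


adapter: LegacyProjection = JazzHRJobProjection()
record = {"id": "j-1", "title": "Line Cook", "tenant_id": "tenant-a", "notes": "internal"}
projected = adapter.to_canonical(TenantContext("tenant-a"), record)
```

Building the output from an allowlist, rather than copying the record and deleting fields, means a newly added legacy field stays private until someone maps it deliberately.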
Performance Considerations
Projection overhead is bounded: O(n) over the field count of the entity. For high-volume paths (job posting fanout), projections are computed once per record per channel and cached for the duration of the workflow.
Alternatives Considered
| Alternative | Pros | Cons | Why Rejected |
|---|---|---|---|
| Code integrations directly against internal Flux models | Simple, no projection overhead | Couples external partners to internal evolution; cannot adopt industry standards; AI agents have no stable contract to ground against | Violates the AI-discoverability constraint |
| Adopt one external standard wholesale (e.g., HR Open as our internal model) | Single schema, no projection | HR Open has fields Flux doesn’t need, missing fields Flux does need; standards evolve slower than the product | Internal velocity dies; we’d be perpetually waiting for standards bodies |
| Per-vendor adapters with no canonical layer | Each adapter is small | N² adapter explosion as integrations grow; no shared semantics; no AI-discoverable contract | Doesn’t scale past ~10 integrations |
| Generate canonical schemas from external specs (auto) | Less manual schema authoring | External specs vary in quality; generated schemas would be unstable; no opportunity to enforce Flux conventions | Loses schema design control |
Migration Strategy
The canonical layer is additive. No breaking changes to internal Flux models. Migration:
- Land canonical schemas + projection protocols (this RFC)
- New integrations always go through canonical
- Existing JazzHR / Lever / Jobvite integrations gradually migrated to LegacyProjection
- Internal Flux model evolution continues independently; projection adapters absorb the change
Validation Plan
- All 9 canonical entities defined in canonical/schemas/
- At least one external projection per direction (Schema.org JobPosting, Merge candidate)
- At least one legacy projection (JazzHR job)
- Internal Flux Job ↔ canonical Job projection round-trips (no data loss)
- AI tool query_canonical_schema returns valid JSONSchema for each entity
- Integration contract registry is queryable via MCP
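The round-trip item in the list above lends itself to an automated check. Below is a minimal sketch of such a test; FluxJob, to_canonical, and from_canonical are hypothetical stand-ins for the internal model and its projection adapter, with a deliberately tiny field set.

```python
from dataclasses import dataclass
from typing import Any, Dict


@dataclass(frozen=True)
class FluxJob:
    """Hypothetical internal model, reduced to three fields."""
    id: str
    title: str
    employment_type: str  # internal spelling, e.g. "full_time"


def to_canonical(job: FluxJob) -> Dict[str, Any]:
    """Project the internal job to a canonical dict."""
    return {
        "title": job.title,
        "employmentType": job.employment_type.upper(),
        "source_system": "flux",
        "source_id": job.id,
    }


def from_canonical(data: Dict[str, Any]) -> FluxJob:
    """Project a canonical dict back to the internal job."""
    return FluxJob(
        id=data["source_id"],
        title=data["title"],
        employment_type=data["employmentType"].lower(),
    )


def test_job_round_trip() -> None:
    original = FluxJob(id="job-1", title="Line Cook", employment_type="full_time")
    # No data loss: projecting out and back must reproduce the original exactly.
    assert from_canonical(to_canonical(original)) == original


test_job_round_trip()
```

Comparing whole objects rather than individual fields means a field added to FluxJob but not to the adapter fails the test immediately.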
Risks
| Risk | Likelihood | Impact | Mitigation |
|---|---|---|---|
| Canonical schemas drift from internal Flux models | Medium | Medium | Round-trip test gate in CI; projection adapter is the sync point |
| External standards evolve and break compatibility | Low | Medium | Pin to a major version; bump cycles tracked as RFCs |
| Projection performance degrades on high-volume paths | Low | Medium | Benchmark per-entity; cache per-workflow; fall back to direct internal model in non-external paths |
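The per-workflow cache named in the last mitigation could be as small as a dictionary keyed by record and channel that is discarded when the workflow ends. The sketch below assumes that shape; WorkflowProjectionCache, its get method, and the project callable are hypothetical names.

```python
from typing import Any, Callable, Dict, Tuple


class WorkflowProjectionCache:
    """Caches one projection per (record, channel) for the life of one workflow."""

    def __init__(self, project: Callable[[Dict[str, Any], str], Dict[str, Any]]) -> None:
        self._project = project
        self._cache: Dict[Tuple[str, str], Dict[str, Any]] = {}
        self.misses = 0  # number of times the projection was actually computed

    def get(self, record_id: str, channel: str, record: Dict[str, Any]) -> Dict[str, Any]:
        key = (record_id, channel)
        if key not in self._cache:
            self.misses += 1
            self._cache[key] = self._project(record, channel)
        return self._cache[key]


def project_job(record: Dict[str, Any], channel: str) -> Dict[str, Any]:
    """Stand-in projection; a real adapter would map to the channel's format."""
    return {"title": record["title"], "channel": channel}


cache = WorkflowProjectionCache(project_job)
job = {"title": "Line Cook"}
first = cache.get("job-1", "indeed", job)
second = cache.get("job-1", "indeed", job)  # served from the cache
other = cache.get("job-1", "vonq", job)     # different channel, computed again
```

Creating one cache instance per workflow run gives the "duration of the workflow" lifetime without any eviction logic. If a single workflow can span tenants, adding the tenant id to the key would be a sensible precaution.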
Open Questions
| # | Question | Owner | Target Date | Resolution |
|---|---|---|---|---|
| 1 | Should canonical models be exposed via GraphQL in addition to REST? | @pj | 2026-06-01 | Defer until a customer asks |
Decision
Accepted by @pj on 2026-04-18 as the foundation for all external integrations and legacy system projections going forward. New integration cycles cite this RFC in source_rfcs. No new integration code may bypass the canonical layer without a superseding RFC.
This RFC established the canonical schema layer pattern. All integration cycles inherit this contract.