Canonical Schema & Integration-First Strategy
Owner: @pj (CTO, Employ Inc.) Created: 2026-03-29 Status: Active Implements: Cycle 210 (Canonical Schema Platform)
Strategic Context
Employ Inc. operates multiple ATS products (JazzHR, Lever, Jobvite, Recruit Marketing) serving different market segments. The company is building Flux as the next-generation AI-native hiring platform. Rather than forcing customers through painful data migrations, the platform uses a strangler pattern: a canonical data layer sits between legacy systems and modern experiences, allowing customers to upgrade without migrating. The Canonical Schema Platform is the data contract backbone that makes this architecture work.
Core Principle: Customers Upgrade, They Don’t Migrate
When a JazzHR customer “upgrades” to the Modern ATS UI, their data stays in JazzHR’s backend. The Canonical API Event Platform + Tenant Router serves the same API surface by projecting JazzHR data through the canonical schema layer. Over time, data can be incrementally moved to the Canonical ATS Data & BE, but the customer never experiences a migration: features just appear.
Core Principle: Integration as a Primary Capability
Every external system (HRIS, payroll, job boards, background checks, assessments) connects through canonical schemas. This means:
- An integration built once works for all ATS products (JazzHR, Lever, Jobvite, RM, Flux)
- AI agents can build, test, and deploy new integrations by reasoning about schema contracts via MCP
- The integration surface is standards-aligned (HR Open, Schema.org, O*NET) so it speaks the industry’s language
The Canonical Schema Layer
What It Is
A set of HR-standard-aligned Pydantic v2 models that represent the canonical shape of hiring data. These are NOT internal domain models; they are the external contract that all systems project to and from. Internal Flux models can evolve independently (rename fields, restructure, add domain-specific concepts) as long as the projection adapters keep the canonical shape stable.
Canonical Entities
| Entity | Aligned With | Purpose |
|---|---|---|
| CanonicalJob | HR Open 4.5 PositionOpening + Schema.org JobPosting | Job postings across all systems |
| CanonicalCandidate | HR Open Candidate + Merge ATS Candidate | Applicant profiles |
| CanonicalApplication | Merge ATS Application | Links candidate to job pipeline |
| CanonicalInterview | HR Open Interview/Assessment | Scheduled interviews + feedback |
| CanonicalOffer | HR Open Offer | Compensation offers |
| CanonicalEmployee | HR Open Employment + Finch Employee | Post-hire HRIS data |
| CanonicalOrganization | Schema.org Organization | Employer entity |
| CanonicalSkill | O*NET + ESCO | Skills/competences taxonomy |
| CanonicalCompensation | HR Open PositionCompensation + Finch Income | Pay structure |
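To make one row of the table concrete, here is a minimal sketch of what a canonical entity could look like as a Pydantic v2 model. Only the names CanonicalJob, CanonicalCompensation, onet_soc_code and esco_uri come from this document; every other field, default, and constraint is an assumption chosen for illustration, not the actual Flux schema.

```python
from datetime import date
from typing import Optional

from pydantic import BaseModel, ConfigDict, Field


class CanonicalCompensation(BaseModel):
    """Pay range attached to a job (illustrative field set)."""

    model_config = ConfigDict(extra="forbid")

    currency: str = Field(min_length=3, max_length=3)  # three-letter code such as "USD"
    min_amount: Optional[float] = None
    max_amount: Optional[float] = None
    interval: str = "year"


class CanonicalJob(BaseModel):
    """External contract for a job posting (hypothetical field set)."""

    # Rejecting unknown fields keeps the contract explicit.
    model_config = ConfigDict(extra="forbid")

    id: str
    title: str
    description: str = ""
    date_posted: Optional[date] = None
    onet_soc_code: Optional[str] = None  # field name taken from the standards table
    esco_uri: Optional[str] = None       # field name taken from the standards table
    compensation: Optional[CanonicalCompensation] = None


job = CanonicalJob.model_validate(
    {
        "id": "job-1",
        "title": "Software Developer",
        "date_posted": "2026-03-29",
        "onet_soc_code": "15-1252.00",
        "compensation": {"currency": "USD", "min_amount": 120000, "max_amount": 150000},
    }
)

# JSON-ready dict; unset optional fields are dropped from the payload.
payload = job.model_dump(mode="json", exclude_none=True)
```

Forbidding extra fields is a design choice in this sketch: a projection that tries to smuggle a source-specific field into the canonical shape fails validation instead of silently widening the contract.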
Open Standards Alignment
| Standard | Org | Format | Usage |
|---|---|---|---|
| HR Open 4.5 | HR Open Standards Consortium | JSON Schema | Reference model for entity structure and field naming. Flux canonical schemas align where practical. |
| Schema.org JobPosting | Schema.org (W3C Community Group) | JSON-LD | All Flux jobs emit JSON-LD for Google for Jobs via CanonicalJob.to_schema_org(). Near-universal adoption. |
| O*NET / SOC | US Dept of Labor | REST API + CSV | US occupation codes and skills taxonomy. CanonicalSkill and CanonicalJob.onet_soc_code. |
| ESCO | European Commission | RDF/JSON-LD | EU skills/competences taxonomy. Cross-walked with O*NET for international support. CanonicalJob.esco_uri. |
| Merge Common Models | Merge.dev | OpenAPI 3.0 | Practical industry consensus on ATS data model shape. Used as validation reference. |
| Apideck Unified API | Apideck | OpenAPI (MIT) | Open-source ats.yml, hris.yml specs. Used as reference for integration contract shapes. |
| Finch Unified API | Finch | OpenAPI | Unified payroll/HRIS model covering 220+ systems. Reference for employee/compensation projection. |
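The Schema.org row mentions emitting JSON-LD from a canonical job. The sketch below shows roughly what such a projection involves, written as a standalone function over a plain dict so it stays dependency-free. It is not the real CanonicalJob.to_schema_org() implementation; the input keys (title, description, date_posted, organization_name, onet_soc_code) are assumed for the example. The output property names (title, description, datePosted, hiringOrganization, occupationalCategory) are standard Schema.org JobPosting properties.

```python
import json
from typing import Any


def job_to_schema_org(job: dict[str, Any]) -> dict[str, Any]:
    """Project a flat canonical job dict onto a Schema.org JobPosting document."""
    doc: dict[str, Any] = {
        "@context": "https://schema.org",
        "@type": "JobPosting",
        "title": job["title"],
        "description": job["description"],
        "datePosted": job["date_posted"],
        "hiringOrganization": {
            "@type": "Organization",
            "name": job["organization_name"],
        },
    }
    # occupationalCategory is the Schema.org property intended for
    # occupation taxonomy codes such as O*NET-SOC.
    if job.get("onet_soc_code"):
        doc["occupationalCategory"] = job["onet_soc_code"]
    return doc


example = job_to_schema_org(
    {
        "title": "Software Developer",
        "description": "Build and maintain hiring workflows.",
        "date_posted": "2026-03-29",
        "organization_name": "Example Co",
        "onet_soc_code": "15-1252.00",
    }
)
json_ld = json.dumps(example, indent=2)  # ready to embed in a script tag of type application/ld+json
```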
Why Not Just Use HR Open Directly?
HR Open 4.5 is comprehensive but verbose: it was designed for enterprise EDI, not API-first platforms. Our canonical models take the field semantics and naming from HR Open but use a flat, JSON-friendly structure that maps cleanly to REST APIs, Pydantic validation, and frontend consumption. Think of it as “HR Open for the API era.”
Projection Architecture
Three Projection Directions
Projection Adapter Contract
Every projection adapter implements a bidirectional mapping. The round trip source → canonical → source must preserve all mappable data.
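A minimal sketch of this contract, assuming a simple one-to-one field rename is enough to describe the adapter. The class names (ProjectionAdapter, FieldMapAdapter), the round_trip_ok helper, and the legacy field names are hypothetical; they are not the actual Flux interfaces or any real ATS API.

```python
from abc import ABC, abstractmethod
from typing import Any

Record = dict[str, Any]


class ProjectionAdapter(ABC):
    """Bidirectional mapping between one source system and the canonical shape."""

    @abstractmethod
    def to_canonical(self, source: Record) -> Record: ...

    @abstractmethod
    def from_canonical(self, canonical: Record) -> Record: ...


class FieldMapAdapter(ProjectionAdapter):
    """Adapter driven by a one-to-one {source_field: canonical_field} map."""

    def __init__(self, field_map: dict[str, str]) -> None:
        self._forward = dict(field_map)
        self._reverse = {canon: src for src, canon in field_map.items()}
        # Two source fields pointing at one canonical field cannot be reversed.
        if len(self._reverse) != len(self._forward):
            raise ValueError("field map must be one-to-one to be reversible")

    @property
    def mapped_source_fields(self) -> frozenset[str]:
        return frozenset(self._forward)

    def to_canonical(self, source: Record) -> Record:
        return {self._forward[k]: v for k, v in source.items() if k in self._forward}

    def from_canonical(self, canonical: Record) -> Record:
        return {self._reverse[k]: v for k, v in canonical.items() if k in self._reverse}


def round_trip_ok(adapter: FieldMapAdapter, source: Record) -> bool:
    """True if source -> canonical -> source preserves every mappable field."""
    mappable = {k: v for k, v in source.items() if k in adapter.mapped_source_fields}
    return adapter.from_canonical(adapter.to_canonical(source)) == mappable


# Hypothetical legacy field names, for illustration only.
adapter = FieldMapAdapter({"job_id": "id", "job_title": "title"})
legacy = {"job_id": "42", "job_title": "Nurse", "internal_flag": True}
canonical = adapter.to_canonical(legacy)
```

Fields with no canonical counterpart (internal_flag here) are outside the contract, which is why the round-trip check compares only the mappable subset.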
Crosswalk Tables
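As a rough illustration of a crosswalk, the sketch below maps pipeline-stage labels from two source systems onto one canonical enum. The system keys (ats_a, ats_b), the stage labels, and the CanonicalStage members are all invented for this example and are not taken from any real ATS. Unmapped values raise an error rather than falling back to a guess, which is one way to keep the mapping deterministic.

```python
from enum import Enum


class CanonicalStage(str, Enum):
    APPLIED = "applied"
    INTERVIEW = "interview"
    OFFER = "offer"
    HIRED = "hired"
    REJECTED = "rejected"


# One lookup table per source system (hypothetical labels).
STAGE_CROSSWALK: dict[str, dict[str, CanonicalStage]] = {
    "ats_a": {
        "New": CanonicalStage.APPLIED,
        "Phone Screen": CanonicalStage.INTERVIEW,
        "Onsite": CanonicalStage.INTERVIEW,
        "Offer Sent": CanonicalStage.OFFER,
        "Hired": CanonicalStage.HIRED,
        "Not a Fit": CanonicalStage.REJECTED,
    },
    "ats_b": {
        "applicant": CanonicalStage.APPLIED,
        "interviewing": CanonicalStage.INTERVIEW,
        "offer": CanonicalStage.OFFER,
        "archived": CanonicalStage.REJECTED,
    },
}


def to_canonical_stage(system: str, value: str) -> CanonicalStage:
    """Look up a source value; unknown systems or values are an error, never a guess."""
    try:
        return STAGE_CROSSWALK[system][value]
    except KeyError:
        raise ValueError(f"no crosswalk entry for {system!r} value {value!r}") from None
```

Note that several source values can collapse onto one canonical value ("Phone Screen" and "Onsite" above), so the reverse direction needs its own explicit table rather than an inverted copy of this one.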
Enum values differ across systems. Crosswalk tables provide a deterministic mapping between them.
Integration Contract Registry
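One possible shape for a registry entry, sketched with stdlib dataclasses. The class names, the method field, and the use of mapped-field coverage as the quality measure are assumptions made for this example; the document does not specify how projection quality is scored.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class EntityProjection:
    entity: str          # canonical entity name, e.g. "CanonicalJob"
    mapped_fields: int   # canonical fields the adapter can populate
    total_fields: int    # canonical fields defined on the entity

    @property
    def coverage(self) -> float:
        """Share of canonical fields populated (assumed quality metric)."""
        return self.mapped_fields / self.total_fields if self.total_fields else 0.0


@dataclass
class IntegrationTarget:
    system: str
    method: str  # e.g. "direct" or "unified_api" (illustrative values)
    projections: list[EntityProjection] = field(default_factory=list)


class IntegrationRegistry:
    def __init__(self) -> None:
        self._targets: dict[str, IntegrationTarget] = {}

    def register(self, target: IntegrationTarget) -> None:
        self._targets[target.system] = target

    def systems_supporting(self, entity: str, min_coverage: float = 0.0) -> list[str]:
        """Systems that map the entity with at least the requested coverage."""
        return sorted(
            t.system
            for t in self._targets.values()
            if any(p.entity == entity and p.coverage >= min_coverage for p in t.projections)
        )


registry = IntegrationRegistry()
registry.register(
    IntegrationTarget("hris_x", "unified_api", [EntityProjection("CanonicalEmployee", 18, 20)])
)
registry.register(
    IntegrationTarget(
        "ats_y",
        "direct",
        [EntityProjection("CanonicalJob", 10, 20), EntityProjection("CanonicalCandidate", 12, 12)],
    )
)
```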
The registry tracks all known integration targets: what systems Flux can connect to, what entities are mapped, and the quality of each projection.
External API Spec Cache
Vendor OpenAPI specs are cached locally so AI agents can reason about them without live API calls:
| System | Source | License | Entities Covered |
|---|---|---|---|
| Kombo | api.kombo.dev/openapi.json | Proprietary (public) | ATS + HRIS |
| Apideck ATS | github.com/apideck-libraries/openapi-specs | MIT | Jobs, candidates, applications |
| Apideck HRIS | Same repo | MIT | Employees, companies, departments |
| Merge ATS | docs.merge.dev | Reference | Candidates, applications, jobs, interviews, offers |
| Finch | developer.tryfinch.com | Reference | Employees, companies, payments, benefits |
Refresh the cache with make flux-update-external-specs.
AI-Driven Integration Building via MCP
The BFAI Platform (Flux’s AI agent layer) uses MCP tools to discover, build, validate, and deploy integrations.
MCP Integration Tools
| Tool | Purpose |
|---|---|
| list_canonical_schemas | Returns JSON Schema for all canonical entities, the AI’s starting point for understanding what data is available |
| get_external_api_spec | Returns the cached OpenAPI spec for an external system (BambooHR, Gusto, etc.) |
| validate_projection | Given a source schema, target schema, and field mapping, validates type compatibility and required field coverage |
| generate_projection_adapter | Generates a Python projection adapter class + test suite from a validated field mapping |
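The kind of check described for validate_projection can be sketched as follows. This is not the MCP tool's implementation: the function name check_projection, the reduced JSON-Schema-like input (only properties, type, and required are read), and the single widening rule (integer to number) are assumptions made to keep the example small.

```python
from typing import Any

Schema = dict[str, Any]  # minimal JSON-Schema-like dict: {"properties": {...}, "required": [...]}


def check_projection(source: Schema, target: Schema, mapping: dict[str, str]) -> list[str]:
    """Return a list of problems; an empty list means the mapping passes both checks."""
    problems: list[str] = []
    src_props = source.get("properties", {})
    tgt_props = target.get("properties", {})

    # 1. Type compatibility for every mapped pair.
    for src_field, tgt_field in mapping.items():
        if src_field not in src_props:
            problems.append(f"unknown source field: {src_field}")
            continue
        if tgt_field not in tgt_props:
            problems.append(f"unknown target field: {tgt_field}")
            continue
        src_type = src_props[src_field].get("type")
        tgt_type = tgt_props[tgt_field].get("type")
        # Identical types pass; integer -> number is the only widening allowed here.
        if src_type != tgt_type and (src_type, tgt_type) != ("integer", "number"):
            problems.append(
                f"type mismatch: {src_field} ({src_type}) -> {tgt_field} ({tgt_type})"
            )

    # 2. Every required target field must be the destination of some mapping.
    covered = set(mapping.values())
    for required in target.get("required", []):
        if required not in covered:
            problems.append(f"required target field not mapped: {required}")

    return problems


source_schema = {
    "properties": {
        "job_title": {"type": "string"},
        "openings": {"type": "integer"},
    }
}
target_schema = {
    "properties": {
        "id": {"type": "string"},
        "title": {"type": "string"},
        "headcount": {"type": "number"},
    },
    "required": ["id", "title"],
}

issues = check_projection(
    source_schema, target_schema, {"job_title": "title", "openings": "headcount"}
)
```

Returning a list of problems instead of raising on the first one gives an agent the full set of gaps to fix in a single pass.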
Integration Building Flow
Internal Type Safety Pipeline
Before schemas can project outward, they must be enforced inward. The internal pipeline ensures compile-time AND runtime type safety from database to UI.
Key Properties
- Single source of truth: Pydantic schemas in backend/domains/*/schemas.py
- Generated, never hand-written: web/generated/api/ contains TypeScript types, Zod schemas, SDK client
- Runtime validation: Every API response is Zod-parsed (throws in dev, logs in prod)
- Form validation: Uses the same generated Zod schemas via zodResolver()
- CI enforcement: oasdiff detects breaking changes; codegen freshness check blocks stale types
- Tool output types: LangChain tool outputs are Pydantic-modeled, flow through the same pipeline to frontend generative UI
Developer Workflow
- Change a Pydantic schema in the backend
- Run make flux-generate-api-types
- Frontend types, Zod schemas, and SDK update automatically
- If you forget, CI fails with “Generated API types are stale”
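One way a staleness check like the one in the last step could work is to fingerprint the OpenAPI document at generation time and compare it in CI. The sketch below assumes that approach; the function names and the idea of a recorded digest are hypothetical, and the actual Flux check may instead regenerate the output and diff it against the committed files.

```python
import hashlib
import json
from typing import Any


def spec_digest(openapi: dict[str, Any]) -> str:
    """SHA-256 of an OpenAPI document, stable under key reordering."""
    canonical = json.dumps(openapi, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def generated_types_are_fresh(current_spec: dict[str, Any], recorded_digest: str) -> bool:
    """Compare the current spec against the digest stored when types were generated."""
    return spec_digest(current_spec) == recorded_digest


# Digest recorded at generation time (toy spec for illustration).
spec_at_generation = {"openapi": "3.1.0", "paths": {"/jobs": {"get": {}}}}
recorded = spec_digest(spec_at_generation)

# A backend schema change adds an endpoint without regenerating the frontend types.
spec_now = {"openapi": "3.1.0", "paths": {"/jobs": {"get": {}}, "/offers": {"get": {}}}}
stale = not generated_types_are_fresh(spec_now, recorded)
```

Serializing with sorted keys means a cosmetic reordering of the spec does not trip the check; only a change in content does.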
Execution Strategy
Sequencing
Incremental Expansion
The canonical layer starts with the core hiring lifecycle entities (Job, Candidate, Application, Interview, Offer) and expands:
| Phase | Entities | Driven By |
|---|---|---|
| Cycle 210 | Job, Candidate, Application, Interview, Offer | Core hiring flow |
| Cycle 206 | Candidate portal types (candidate-facing projections) | Candidate portal |
| HRIS cycle | Employee, Organization, Compensation | Paychex/HRIS integrations |
| Distribution cycle | Job distribution, posting analytics | Job board integrations |
| Compliance cycle | Audit trail, consent records | EU AI Act, EEOC |
Decision Record
Why Pydantic as Source of Truth (Not Zod, Not JSON Schema)
- Backend is Python. Pydantic is the natural validation layer.
- FastAPI auto-generates OpenAPI from Pydantic. No manual spec authoring.
- @hey-api/openapi-ts generates Zod from OpenAPI. Pipeline is fully automated.
- Alternative (Zod-first) would require maintaining schemas in two languages or running Node.js in the backend build.
Why @hey-api/openapi-ts Over openapi-typescript
- hey-api generates types + Zod + SDK in one pass. openapi-typescript generates types only.
- hey-api’s Zod plugin produces runtime validators. openapi-typescript is compile-time only.
- hey-api’s SDK plugin replaces hand-maintained query hooks.
- ~977k npm weekly downloads, used by Vercel/PayPal. Production-proven.
Why Zod v4 Over Valibot/ArkType
- Ecosystem dominance: react-hook-form, shadcn/ui, hey-api, TanStack all have first-class Zod support.
- v4 closed the performance gap (6-14x faster than v3, 57% smaller).
- @zod/mini at 1.9 KB for bundle-sensitive paths.
- Standard Schema compliant, which leaves an exit path to Valibot/ArkType if ever needed.
Why Canonical Models Separate From Domain Models
- Internal models serve business logic (validation rules, ORM mapping, domain events).
- Canonical models serve integration contracts (field naming, standards alignment, cross-system compatibility).
- They evolve at different rates. A Flux refactor shouldn’t break every integration.
- Projection adapters absorb the difference. Round-trip tests verify integrity.
Why Not a Unified API Platform (Merge/Finch) for All Integrations
- Unified API platforms (Merge, Finch, Kombo) are excellent for rapid integration coverage.
- But they own the data pipeline — adding latency, cost, and a dependency on their uptime.
- Flux’s canonical layer enables BOTH: use Merge/Finch as an integration method (their data projects through canonical) AND build direct integrations for high-value systems.
- The canonical layer is the abstraction. Unified APIs are one implementation strategy behind it.
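The last two points can be sketched as a single interface with two interchangeable implementations. Everything here is illustrative: CandidateSource, UnifiedApiSource, DirectSource, and the payload shapes are invented for the example and do not reflect the real Merge, Finch, or Flux code.

```python
from typing import Any, Iterable, Iterator, Protocol

CanonicalRecord = dict[str, Any]


class CandidateSource(Protocol):
    """Anything that can yield candidates already projected to the canonical shape."""

    def fetch_candidates(self) -> Iterator[CanonicalRecord]: ...


class UnifiedApiSource:
    """Stands in for a unified-API provider; the payload shape is assumed."""

    def __init__(self, payloads: list[dict[str, Any]]) -> None:
        self._payloads = payloads

    def fetch_candidates(self) -> Iterator[CanonicalRecord]:
        for p in self._payloads:
            yield {"id": p["remote_id"], "name": f'{p["first_name"]} {p["last_name"]}'}


class DirectSource:
    """Stands in for a direct integration with one high-value system."""

    def __init__(self, rows: list[dict[str, Any]]) -> None:
        self._rows = rows

    def fetch_candidates(self) -> Iterator[CanonicalRecord]:
        for r in self._rows:
            yield {"id": str(r["candidate_id"]), "name": r["full_name"]}


def collect(sources: Iterable[CandidateSource]) -> list[CanonicalRecord]:
    """Consumers see only canonical records, whichever path produced them."""
    return [record for source in sources for record in source.fetch_candidates()]


records = collect(
    [
        UnifiedApiSource([{"remote_id": "u-1", "first_name": "Ada", "last_name": "Lovelace"}]),
        DirectSource([{"candidate_id": 7, "full_name": "Grace Hopper"}]),
    ]
)
```

Swapping a unified-API integration for a direct one then changes only which class is constructed; downstream consumers of the canonical records are unaffected.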
References
- Cycle 210: docs/roadmap/cycles/cycle210-e2e-schema-enforcement.md (implementation plan)
- HR Open Standards 4.5: https://www.hropenstandards.org/standards (JSON Schema downloads)
- Schema.org JobPosting: https://schema.org/JobPosting (Google for Jobs structured data)
- O*NET: https://www.onetcenter.org/ (US occupation/skills taxonomy)
- ESCO: https://esco.ec.europa.eu/ (EU skills/competences taxonomy)
- Merge ATS Docs: https://docs.merge.dev/ats/ (ATS common model reference)
- Finch API Docs: https://developer.tryfinch.com/ (Unified payroll/HRIS API)
- Apideck OpenAPI Specs: https://github.com/apideck-libraries/openapi-specs (MIT-licensed ATS/HRIS specs)
- Kombo OpenAPI: https://api.kombo.dev/openapi.json (Downloadable ATS/HRIS spec)
- hey-api/openapi-ts: https://heyapi.dev/ (Codegen tool)
- Zod v4: https://zod.dev/v4 (Runtime validation)
- oasdiff: https://www.oasdiff.com/ (OpenAPI breaking change detection)