Design Systems: Tokens First

Most design systems fail at the same place — they start with components. The fix is one CSS file written before any component exists, and a discipline most teams give up on by week three. Here is the playbook.

Published: 18 March 2026 · Updated: 16 May 2026 · Read time: 8 min · Words: 1,406
Tags: design-systems · css · craft

Most design systems fail at the same place: they start with components.

Someone opens Figma, designs a button. Someone else opens the editor, codes the button. Six weeks later there are forty-seven buttons across twelve components and nobody can explain which spacing token a card padding uses, or whether the navigation hover colour is the same as the link hover colour, or why the error state on one form looks different from the error state on the one next to it.

The fix is not "write a style guide." The fix is structural. Start with tokens. Ship tokens before any component exists. The reason this works is the same reason it is so often skipped — it does not feel like progress in week one.

What tokens-first actually means

Tokens are the atoms. Colours, spacing values, typography scales, motion durations, border widths. They have no visual identity of their own — they are just named values.

Tokens-first means your first commit on a new project is a tokens.css file. Before the layout shell. Before the navigation. Before page one. You define the entire visual vocabulary, and every subsequent line of CSS or JSX has to use a token or justify why not.

The constraint that feels slow in week one is what keeps the design coherent in month six.

[Figure: TYPES. Interface hierarchy diagram: INTERFACE branching to STRING and NUMBER, then to EMAIL, ID, COUNT, PRICE. Caption: A design token system is a type system for the visual vocabulary. Same logic, same payoff.]

Here is what my tokens.css opens with on a new project:

```css
:root {
  /* Surface */
  --bg:    #0A0A0B;
  --bg-2:  #0F0F10;
  --bg-3:  #121215;

  /* Foreground */
  --fg:    #F5F5F3;
  --fg-2:  #D4D4D2;
  --fg-3:  #9A9A98;

  /* Rules */
  --rule:        rgba(245, 245, 243, 0.08);
  --rule-strong: rgba(245, 245, 243, 0.18);

  /* Type scale */
  --t-h1:    clamp(2.5rem, 6vw, 6.25rem);
  --t-body:  1rem;
  --t-micro: 0.6875rem;

  /* Spacing */
  --sec:   clamp(3.5rem, 8vw, 7.5rem);
  --pad:   clamp(1.25rem, 4vw, 3rem);
  --stack: clamp(1rem, 2vw, 2rem);
}
```

There is nothing visual yet. But every decision about what a card looks like, what a button does on hover, what padding a section gets — is constrained to values that already exist.
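
As a rough sketch of what that constraint looks like in practice, a first component written against the file above might read like this. The .card selectors are hypothetical names chosen for illustration; only the custom properties come from the tokens block.

```css
/* Hypothetical first component: every value is a token reference. */
.card {
  padding: var(--pad);
  background: var(--bg-2);
  color: var(--fg);
  font-size: var(--t-body);
}

/* Vertical rhythm between the card's children comes from --stack. */
.card > * + * {
  margin-top: var(--stack);
}

/* Hover moves one step along the surface scale, not to a new colour. */
.card:hover {
  background: var(--bg-3);
}

/* Secondary text reuses the smallest type and quietest foreground tokens. */
.card__meta {
  font-size: var(--t-micro);
  color: var(--fg-3);
}
```

Nothing in the component invents a value, so a later change to --pad or --bg-2 reaches it without the component file being touched.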

Why this compounds

Three compounding benefits, in order of how soon you feel them.

1 · Design-dev sync stops being a problem. When the Figma file and the CSS file both consume the same named tokens, there is no "which padding did the designer use here?" The designer used --pad. The engineer uses var(--pad). The handoff is structural, not interpretive.

2 · Rebrands become fixable in an afternoon. I rebuilt this site (nkovalcin.com v3) four times during P0 before settling on the current direction. Each rebuild was a tokens-file edit and a component reflow. No scraping through 40 components to find hardcoded colours. The site is the tokens file, amplified.

3 · Dark mode, light mode, and accessibility bumps become trivial. If every value is a token, flipping to light mode means rewriting a single block. Boosting contrast for WCAG AA compliance means bumping three values. When colours are hardcoded, each of those becomes a weekend.
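
To make the third point concrete, here is a minimal sketch of a light theme as one override block. The data-theme attribute and every colour value in it are placeholders invented for illustration, not values from this site, and they would need a contrast check before shipping.

```css
/* Hypothetical light theme: same token names, different values.
   The attribute selector is more specific than plain :root, so these
   declarations win wherever data-theme="light" is set on <html>. */
:root[data-theme="light"] {
  /* Surface (placeholder values) */
  --bg:    #FAFAF8;
  --bg-2:  #F2F2EF;
  --bg-3:  #E9E9E5;

  /* Foreground (placeholder values) */
  --fg:    #0A0A0B;
  --fg-2:  #3A3A3C;
  --fg-3:  #5C5C5E;

  /* Rules: same alphas as the dark theme, inverted base colour */
  --rule:        rgba(10, 10, 11, 0.08);
  --rule-strong: rgba(10, 10, 11, 0.18);
}
```

Components that only reference var(--bg), var(--fg) and the rest need no edits. Type and spacing tokens are inherited from the base :root block unchanged.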

[Figure: TIDY. Three horizontal bands, with scattered marks in the dark band and aligned marks in the lavender band, illustrating mess vs order. Caption: Discipline visualized. The top band is what happens without enforcement; the bottom band is what a Stylelint rule produces in week three.]

The discipline problem

Tokens-first fails for the same reason most engineering practices fail: nobody enforces it after week three.

The temptation, when you are staring at a button on a Wednesday afternoon trying to ship a feature by Friday, is to write padding: 12px. It is faster. It works. Nobody is going to review it.

Three months later you have 800 instances of padding: 12px across the codebase, and the design team wants to bump the base spacing rhythm to 14px. This is how design debt accumulates — one pragmatic shortcut at a time, none of which seemed important on their own.

Two tactics that work:

Write a lint rule. Stylelint's declaration-property-value-allowed-list rule can be configured to flag any raw pixel or hex value on the properties you list. The noise is painful for the first week. After that, it catches everything. The lint rule is the discipline.

Write tokens TOO aggressively. If you find yourself about to use padding: 12px, first ask: should there be a --pad-sm token for this? Usually the answer is yes. Add it, use it. Over time your token file grows to cover the actual vocabulary your design uses — and the lint rule gets less noisy.

The second tactic is the one most teams skip. They write a lint rule, fight the lint rule for two weeks, then disable it. The fix is not disabling the rule — it is admitting that your tokens file is incomplete and growing it.
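
The second tactic, sketched in CSS. The --pad-sm name comes from the example above; its value and the .btn selector are assumptions made for illustration.

```css
/* Before: the Wednesday-afternoon shortcut a lint rule should flag. */
.btn {
  padding: 12px;
}

/* After, step 1: grow the tokens file to cover the missing vocabulary. */
:root {
  --pad-sm: 0.75rem; /* 12px at the default 16px root font size */
}

/* After, step 2: the component references the new token instead. */
.btn {
  padding: var(--pad-sm);
}
```

If the base rhythm later moves to 14px, the change is one line in :root (0.875rem at a 16px root) rather than a search across every component.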

What not to tokenize

Tokens are for the vocabulary. They are not for the grammar.

Do not tokenize layout specifics. Grid templates, flexbox patterns, aspect ratios — these are per-component decisions, not shared vocabulary. A card component's internal grid is not a token; the gap between grid children might be (--gut).

Do not tokenize component-specific magic numbers. The 6px live-dot radius does not need a --livedot-size token. Just put it in the component CSS. Tokens are for things that recur across three or more components.

Do not pre-build generic scales you will not use. Do not make --fg-1 through --fg-9. You will use three or four of them. Make --fg, --fg-2, --fg-3, and --fg-4, and stop. Every token you define is a decision that has to be maintained, and most aesthetic scales have 4–5 meaningful stops, not nine.
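
A small sketch of where that line falls inside component CSS. The selectors and the grid values are hypothetical, --gut is assumed to be defined in tokens.css (it is not part of the excerpt shown earlier), and the 12px box is simply the diameter that matches the 6px radius.

```css
/* Grammar stays local: the grid template and aspect ratio belong to
   this component. Vocabulary is shared: the gap comes from a token. */
.card {
  display: grid;
  grid-template-columns: 1fr 2fr; /* per-component layout, not a token */
  aspect-ratio: 4 / 3;            /* per-component, not a token */
  gap: var(--gut);                /* assumed to exist in tokens.css */
}

/* Component-local magic number: used once, so it stays a literal. */
.live-dot {
  width: 12px;
  height: 12px;
  border-radius: 6px;
  background: var(--fg); /* colour is still shared vocabulary */
}
```

If a lint rule flags raw pixel values, literals like these need a deliberate, scoped exemption (for example a Stylelint disable comment with a reason) so that the exception is visible in review.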

If you are starting a project and want help structuring your tokens before you ship a single component, that is part of how I run discovery sprints.

[Figure: PROCESS. Engineering schematic with Ø 05 INTAKE, ± 0.1 REVIEW, ⌀ 12 OUTPUT boxes connected by arrows. Caption: The three-day kick-off in schematic form. Intake the colour/type/spacing decisions, review them in browser, output a token file. Tight tolerances, not vibes.]

The kick-off ritual

When I start a new engagement, the first three-day block is always:

Day 1 — Colour tokens (surface + foreground + rules + signals). Ship a single page that renders big blocks of each colour with its name. Look at the actual values in the actual UI chrome. Adjust until it reads right. The browser is the source of truth, not the Figma file.

Day 2 — Type tokens + scale. Ship a type-scale demo page. Size classes from H1 through mono-small, each labeled. Check in browser, on mobile, on the actual hardware the target users will use. Adjust.

Day 3 — Spacing + motion. Ship a section-rhythm page with var(--sec) variation between sections. Check vertical rhythm on a long scroll. Adjust.

Only after day three do components start.
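
The demo pages from those three days need very little CSS. Here is one possible sketch, assuming the tokens listed earlier; every class name is hypothetical, and the markup is assumed to be plain divs and sections carrying these classes.

```css
/* Day 1: one large block per colour token, labelled with its name. */
.swatch {
  min-height: 8rem; /* demo-only literal, not part of the vocabulary */
  padding: var(--pad);
  color: var(--fg);
  font-size: var(--t-micro);
}
.swatch--bg   { background: var(--bg); }
.swatch--bg-2 { background: var(--bg-2); }
.swatch--bg-3 { background: var(--bg-3); }

/* Day 2: one labelled sample per type token. */
.type-demo--h1    { font-size: var(--t-h1); }
.type-demo--body  { font-size: var(--t-body); }
.type-demo--micro { font-size: var(--t-micro); }

/* Day 3: section rhythm on a long scroll. */
.demo-section + .demo-section {
  margin-top: var(--sec);
}
.demo-section > * + * {
  margin-top: var(--stack);
}
```

Because the pages read the same custom properties the product will, adjusting a value in tokens.css and reloading is the whole review loop.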

The one-week rule

If your tokens file cannot survive a week of feature work without somebody wanting to hardcode a value, the tokens file is wrong.

Go back. Look at what got hardcoded. It is either something that should be a token (and is not), or a component-local value (and should never have tempted tokenization).

A good tokens file is small. Maybe 30–50 named values, total. If you are at 200 tokens and growing, you have conflated vocabulary with grammar and the lint rules will fight you forever.

Keep it small. Keep it opinionated. Keep it first.

Takeaways — what to ship this week

  • Open a new tokens.css before opening anything else. Even on a project that is half-built. Refactor toward it.
  • Cap the token count. 30–50 named values. If you are above that, audit what should not be a token at all.
  • Enable the Stylelint rule on day one. Fight it for a week. After that it is invisible.
  • Build a tokens demo page. Three pages, actually — one each for colour, typography, and spacing. Use these as the source of truth for design reviews.
  • Refuse hardcoded values in code review. Every pixel value or hex code that lands in your repo without a token reference is technical debt. Catch it at PR time, not in month six.
  • Re-audit tokens every quarter. Delete the ones nobody uses. Promote the recurring hardcoded values to new tokens. The file is alive, not frozen.

The discipline is small but unforgiving. A new project with 30 carefully chosen tokens, enforced by a lint rule, ships faster and prettier than the same project with components written against vibes. The tokens file is your design system, amplified.


Related: Building Modern Web Applications in 2026 · Owning Your Stack in 2026 · How I Run Discovery Sprints

Written by Norbert Kovalčín. Independent architect · Europe · CET. I help companies own their stack instead of renting it. One client at a time.