Introduction
Most of what gets called “language interop” is really just glue: a Rust crate calls a C function via FFI, a Lua filter rewrites a Pandoc AST node, a Python script shells out to a compiled binary. The glue works. It is also, structurally, the same thing an EDSL does, except that the EDSL formalizes the boundary, gives it a type, and makes the compiler enforce it.
This is a brainstorm on what it takes to go from writing FFI glue fluently to designing an embedded DSL in Haskell that is genuinely useful rather than a type-level curiosity.
What a Shallow Embedding Actually Is
A shallow embedding is the case where your DSL expressions are plain Haskell values: there is no intermediate representation, no AST, no interpreter. The DSL is a library of composable functions, and the semantics are the denotation: what Haskell value does this expression produce?
-- A trivial shallow embedding for a configuration DSL
newtype Port = Port Int
newtype Namespace = Namespace String
data ServiceConfig = ServiceConfig
  { scNamespace :: Namespace
  , scPort      :: Port
  , scImage     :: String
  }
-- DSL combinators
inNamespace :: Namespace -> ServiceConfig -> ServiceConfig
inNamespace ns cfg = cfg { scNamespace = ns }
onPort :: Port -> ServiceConfig -> ServiceConfig
onPort p cfg = cfg { scPort = p }

The denotation of inNamespace (Namespace "web") defaultService is just a
ServiceConfig record. No interpreter needed. This is Conal Elliott’s
denotational design in its simplest form: start with the mathematical object
your DSL expression means, then write the library around that meaning.
Shallow embeddings are immediately executable because there is nothing to interpret. The cost is that you cannot inspect or optimize the structure of a program: you cannot write a pass that eliminates redundant namespace allocations, because there is no tree to walk.
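To make the denotation concrete, here is the config DSL above in use. The snippet restates the types and combinators from the block above and fills in a hypothetical defaultService (the prose assumes one exists but never defines it):

```haskell
newtype Port = Port Int deriving (Show, Eq)
newtype Namespace = Namespace String deriving (Show, Eq)

data ServiceConfig = ServiceConfig
  { scNamespace :: Namespace
  , scPort      :: Port
  , scImage     :: String
  } deriving (Show, Eq)

-- Hypothetical starting point, not part of the original snippet.
defaultService :: ServiceConfig
defaultService = ServiceConfig (Namespace "default") (Port 80) "scratch"

inNamespace :: Namespace -> ServiceConfig -> ServiceConfig
inNamespace ns cfg = cfg { scNamespace = ns }

onPort :: Port -> ServiceConfig -> ServiceConfig
onPort p cfg = cfg { scPort = p }

-- Combinators compose as plain functions; the resulting record
-- IS the denotation -- there is nothing left to interpret.
webService :: ServiceConfig
webService = inNamespace (Namespace "web") (onPort (Port 8080) defaultService)
```
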
When You Need a Deep Embedding
A deep embedding builds an AST first and interprets it later. This is necessary when you need to:
- Optimize before executing (eliminate redundant operations)
- Compile to a different target (Plutus Core, WASM, systemd unit files)
- Reject programs statically that the host type system cannot express
-- Deep embedding: the AST (requires the GADTs extension)
data ContractExpr a where
  Literal    :: a -> ContractExpr a
  Add        :: ContractExpr Int -> ContractExpr Int -> ContractExpr Int
  IfThenElse :: ContractExpr Bool
             -> ContractExpr a
             -> ContractExpr a
             -> ContractExpr a
  -- Hardware cost annotation: the phantom type b carries the budget
  -- and surfaces in the result type, so contracts validated against
  -- different budgets have different types
  WithBudget :: KnownBudget b => Budget b -> ContractExpr a -> ContractExpr (Validated b a)
-- The GADT parameter `a` is the return type of the expression.
-- `WithBudget` carries a phantom `b` that encodes the execution
-- cost at the type level: the compiler rejects programs whose
-- budget exceeds the hardware envelope.

The phantom type b in WithBudget is the key move. The type
ContractExpr (Validated HardwareA) is a different type from
ContractExpr (Validated HardwareB). Code that tries to run a
HardwareB-validated contract on HardwareA is a type error,
caught at compile time.
This is what “making illegal states unrepresentable” means in practice. Not a runtime check. Not a validator function. A type that does not exist.
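A minimal, self-contained sketch of that idea using DataKinds for the hardware tags. Hardware, Contract, validateForA, and deployOnA are all illustrative names invented here, not Plutus APIs:

```haskell
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE KindSignatures #-}

-- Promoted to the type level by DataKinds.
data Hardware = HardwareA | HardwareB

-- The phantom parameter h records which hardware envelope
-- this contract was validated against. No value of kind
-- Hardware is ever stored at runtime.
newtype Contract (h :: Hardware) = Contract String

validateForA :: String -> Contract 'HardwareA
validateForA = Contract

-- Only accepts contracts validated for HardwareA. Passing a
-- Contract 'HardwareB here is rejected at compile time --
-- there is no runtime check to forget.
deployOnA :: Contract 'HardwareA -> String
deployOnA (Contract src) = "deploying: " ++ src
```
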
The Operational Layer: Rewrite Rules
Once you have an AST, you can write optimization passes as tree rewrites before the final interpreter runs. For the Plutus use case, the interesting rewrites are ones that reduce on-chain execution cost:
-- Optimization pass: constant folding
optimize :: ContractExpr a -> ContractExpr a
optimize (Add (Literal x) (Literal y)) = Literal (x + y)
optimize (IfThenElse (Literal True) t _) = optimize t
optimize (IfThenElse (Literal False) _ f) = optimize f
optimize (Add x y) = Add (optimize x) (optimize y)
optimize (IfThenElse c t f) =
  IfThenElse (optimize c) (optimize t) (optimize f)
optimize expr = expr

The interpreter that runs after optimization maps the AST to Plutus Core terms. The rewrite rules run before that and never see Plutus Core. This separation is the reason to use a deep embedding: you get an optimization surface that a shallow embedding cannot provide.
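The Plutus Core backend itself is out of scope here, but the shape of the final interpreter can be shown with a plain in-Haskell evaluator over the same GADT (WithBudget omitted for brevity; this stand-in maps to Haskell values rather than Plutus Core terms):

```haskell
{-# LANGUAGE GADTs #-}

data ContractExpr a where
  Literal    :: a -> ContractExpr a
  Add        :: ContractExpr Int -> ContractExpr Int -> ContractExpr Int
  IfThenElse :: ContractExpr Bool
             -> ContractExpr a
             -> ContractExpr a
             -> ContractExpr a

-- The GADT index makes the interpreter total and type-safe:
-- evaluating a ContractExpr a always yields an a.
eval :: ContractExpr a -> a
eval (Literal x)        = x
eval (Add x y)          = eval x + eval y
eval (IfThenElse c t f) = if eval c then eval t else eval f
```
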
Combining Both: The Two-Layer Approach
Most production EDSLs use both layers. The user-facing API is shallow: composable Haskell functions, no visible AST. Internally, the combinators build a deep representation that is optimized and then interpreted.
\[ \text{User DSL} \xrightarrow{\text{build}} \text{AST} \xrightarrow{\text{optimize}} \text{AST'} \xrightarrow{\text{interpret}} \text{Plutus Core} \]
The user never sees the AST. The compiler never sees unoptimized code. The separation of concerns is clean.
-- The user-facing combinator builds the AST internally
addFunds :: ContractExpr Int -> ContractExpr Int -> ContractExpr Int
addFunds x y = optimize (Add x y)
-- optimize fires immediately on construction,
-- so the AST is always in normal form

Calling optimize on every construction is eager normalization: correct but
potentially slow for large programs. For a production EDSL you would defer
optimization to a single pass over the completed AST. For a personal project
or prototype, eager normalization is fine and easier to reason about.
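A sketch of the deferred alternative, restricted to the Literal/Add fragment of the GADT for brevity: the smart constructor builds the raw AST, and one bottom-up pass at the end optimizes children before their parents, so nested constants fold in a single traversal.

```haskell
{-# LANGUAGE GADTs #-}

data ContractExpr a where
  Literal :: a -> ContractExpr a
  Add     :: ContractExpr Int -> ContractExpr Int -> ContractExpr Int

-- Raw smart constructor: builds the AST with no normalization.
addFunds :: ContractExpr Int -> ContractExpr Int -> ContractExpr Int
addFunds = Add

-- One deferred, bottom-up pass over the completed AST. Children
-- are optimized first, so constants exposed by inner folds are
-- folded by the enclosing Add in the same traversal.
optimize :: ContractExpr a -> ContractExpr a
optimize (Add x y) =
  case (optimize x, optimize y) of
    (Literal a, Literal b) -> Literal (a + b)
    (x', y')               -> Add x' y'
optimize e = e
```
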
The Actual Skill Gap
The gap from “I write FFI glue” to “I design EDSLs” is not large technically. You already:
- Compose semantics across language boundaries (Haskell/Rust/Lua)
- Use types to enforce boundaries (Rust’s ownership, Haskell’s IO separation)
- Write interpreters (Pandoc AST transformers are interpreters)
What is missing is the habit of naming the abstraction first. An FFI binding names the foreign function. An EDSL names the domain concept. The discipline is: before writing any code, write down what your DSL expression means as a mathematical object. If you cannot state the denotation clearly, the embedding will be leaky.
For the Cardano use case: a PlutusContract means a function from blockchain
state to a set of valid next states, constrained by the hardware budget of the
target validator node. Write that down as a type. The rest follows.
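One way to write that denotation down as a type, ignoring the budget constraint for the moment. Every name here (ChainState, PlutusContract, counter) is invented for illustration, not a real Plutus type:

```haskell
-- Stand-in for whatever the real chain state is.
newtype ChainState = ChainState Integer deriving (Eq, Show)

-- The denotation: a contract means a function from the current
-- chain state to the set of valid next states (a list here, for
-- simplicity). Everything else in the EDSL must preserve this meaning.
newtype PlutusContract = PlutusContract
  { validNextStates :: ChainState -> [ChainState] }

-- Example inhabitant: a counter that may stay put or increment by 1.
counter :: PlutusContract
counter = PlutusContract $ \(ChainState n) ->
  [ChainState n, ChainState (n + 1)]
```
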
What Would Make It Avant-Garde
A Plutus EDSL that is actually useful rather than a type-level exercise would do three things that existing tooling does not:
First, it would carry hardware cost models as first-class types: not as documentation or runtime checks, but as type parameters that make over-budget contracts uncompilable on constrained open-source hardware targets.
Second, it would provide a sidechain-aware optimizer that recognizes batched state transition patterns and rewrites them to compact validator scripts automatically, reducing the on-chain execution cost of common DeFi operations.
Third, it would separate the specification layer (what the contract does, in readable Haskell) from the compilation layer (how it maps to Plutus Core), making contracts auditable by domain experts who cannot read Plutus Core.
None of these require new type theory. They require disciplined application of GADTs, phantom types, and rewrite rules to a domain that currently lacks them.