C FFI

Milang can call C functions directly by importing a .h header file. The compiler parses the header, extracts function signatures, and maps C types to milang types. At code generation time the header is #included and calls are emitted inline — no wrapper overhead.

Type Mapping

| C type | Milang type | C codegen type |
|---|---|---|
| int | Int' 32 | int |
| long, int64_t | Int' 64 | int64_t |
| short, int16_t | Int' 16 | int16_t |
| int8_t, char | Int' 8 | int8_t |
| ssize_t, ptrdiff_t | Int' 64 | int64_t |
| unsigned int, uint32_t | UInt' 32 | unsigned int |
| unsigned long, uint64_t, size_t | UInt' 64 | uint64_t |
| unsigned short, uint16_t | UInt' 16 | uint16_t |
| uint8_t, unsigned char | Byte | uint8_t |
| double | Float | double |
| float | Float' 32 | float |
| char* | Str | char* |
| void*, opaque pointers | Opaque handle | void* |
| Nothing (milang value) | NULL pointer | NULL |
| void return | Int (0) | |
| typedef struct { ... } Name | Record | struct |
| typedef enum { ... } Name | Int constants | int64_t |
| typedef ret (*Name)(params) | Callback | function pointer |
| #define NAME value | Int constant | |

The compiler generates :: type annotations for all imported C functions using sized types. For example, a C function int add(int a, int b) gets annotated as add :: Int' 32 : Int' 32 : Int' 32, while double sin(double x) gets sin :: Float : Float. These annotations are visible with milang dump.
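The table pairs plain C types such as long with fixed widths. A small C check like the one below can confirm that a given platform matches those widths. It is only a sketch: BITS_OF and width_mismatches are hypothetical names, and the expectation of zero mismatches assumes an LP64 target such as 64-bit Linux. C itself only guarantees that long is at least 32 bits, and the text above does not say how milang handles platforms where the widths differ.

```c
#include <limits.h>
#include <stddef.h>

/* Bit width of a C type on the platform this file is compiled for. */
#define BITS_OF(T) ((int)(sizeof(T) * CHAR_BIT))

/* Returns how many of the plain C types below differ from the width
   the mapping table pairs them with (expected to be 0 on LP64). */
int width_mismatches(void) {
    int n = 0;
    n += BITS_OF(short)  != 16;  /* short  -> Int' 16   */
    n += BITS_OF(int)    != 32;  /* int    -> Int' 32   */
    n += BITS_OF(long)   != 64;  /* long   -> Int' 64   */
    n += BITS_OF(size_t) != 64;  /* size_t -> UInt' 64  */
    n += BITS_OF(float)  != 32;  /* float  -> Float' 32 */
    return n;
}
```

A non-zero result would suggest checking the generated annotations with milang dump before relying on the sized types.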

Importing C Headers

Import a system header the same way you import a .mi file:

m = import "math.h"

result = m.sin 1.0
root = m.sqrt 144.0

The result is a record whose fields are the C functions declared in the header. Use dot notation to call them.

Selective Import with import'

If you only need a few functions, or need to attach compilation options, use the import' form:

m = import' "math.h" ({})
result = m.cos 0.0

Associating C Source Files

For your own C libraries, tell the compiler which source files to compile alongside the generated code:

lib = import' "mylib.h" ({src = "mylib.c"})
answer = lib.add_ints 3 4

The src field takes a single source file path (relative to the importing .mi file).
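For reference, the C side of that import could look like the sketch below. The contents of mylib.h and mylib.c are not shown in this chapter, so both are assumptions; only the name add_ints comes from the example.

```c
/* mylib.h (hypothetical): the declaration milang would parse */
#ifndef MYLIB_H
#define MYLIB_H
int add_ints(int a, int b);
#endif

/* mylib.c (hypothetical): the file named by the src option */
int add_ints(int a, int b) {
    return a + b;
}
```

With this definition, lib.add_ints 3 4 from the example would evaluate to 7.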

Advanced Options

The options record passed to import' supports several fields:

| Field | Type | Description |
|---|---|---|
| src | Str | Single C source file to compile |
| sources | List | Multiple source files: ["a.c", "b.c"] |
| flags | Str | Additional compiler flags (e.g. "-O2 -Wall") |
| cc_flags | Str | Flags passed only to gcc, not to the preprocessor (e.g. "-DIMPL") |
| include | Str | Additional include directory |
| filter | List | Selective import: only import the named functions (e.g. ["SDL_Init", "SDL_Quit"]) |
| pkg | Str | pkg-config package name; auto-discovers flags and includes |
| annotate | Function | Annotation function for struct/out/opaque declarations (see FFI Annotations) |

Example with multiple options:

lib = import' "mylib.h" ({
  sources = ["mylib.c", "helpers.c"]
  flags = "-O2"
  include = "vendor/include"
})

Using pkg-config for a system library:

json = import' "json-c/json.h" ({pkg = "json-c"})

Selective Function Imports

When importing large system headers (e.g., <SDL2/SDL.h>), the compiler generates bindings for every visible function — often thousands from transitive includes. Use filter to import only the functions you need:

sdl = import' "<SDL2/SDL.h>" ({
  filter = ["SDL_Init", "SDL_CreateWindow", "SDL_DestroyWindow", "SDL_Quit"]
  flags = "-lSDL2"
  standard_import = 1
})

sdl.SDL_Init sdl.SDL_INIT_VIDEO

Only the named functions, enums, and constants are imported; everything else is filtered out.

Compiler-Only Flags (cc_flags)

The cc_flags option passes flags to gcc during compilation but not to the preprocessor used for header parsing. This enables STB-style single-header libraries:

// mylib.h
#ifndef MYLIB_H
#define MYLIB_H
int my_func(int x);     /* milang sees this */

#ifdef MYLIB_IMPL
int my_func(int x) { return x + 1; }  /* hidden from milang's parser */
#endif
#endif

lib = import' "mylib.h" ({
  cc_flags = "-DMYLIB_IMPL"
})

Without cc_flags, the implementation section stays hidden from milang’s parser (correct), but gcc also doesn’t see it, requiring a separate .c trigger file. With cc_flags, gcc gets -DMYLIB_IMPL so it compiles the implementation, while milang only sees the declarations.
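To make the two views concrete, the sketch below shows roughly what gcc compiles once MYLIB_IMPL is defined. The header from the example is inlined so the block compiles on its own, and the #define at the top stands in for -DMYLIB_IMPL on the command line. This is an illustration of the single-header pattern, not the code milang generates.

```c
/* Stands in for passing -DMYLIB_IMPL on the gcc command line. */
#define MYLIB_IMPL

/* Contents of mylib.h from the example, inlined so this compiles alone. */
#ifndef MYLIB_H
#define MYLIB_H
int my_func(int x);     /* declaration: always visible */

#ifdef MYLIB_IMPL
int my_func(int x) { return x + 1; }  /* definition: only with MYLIB_IMPL */
#endif
#endif
```

Removing the first #define leaves only the declaration, which is the view the header parser gets.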

How It Works

  1. The import resolver reads the .h file and extracts function declarations, struct definitions, enum constants, #define integer constants, and function pointer typedefs.
  2. Each C function becomes an internal CFunction AST node with its milang type signature. Integer types preserve their bit width (e.g., int → 32-bit, int64_t → 64-bit).
  3. Struct and enum type names are resolved so they can be used as parameter and return types.
  4. Enum constants and #define integer constants become Int bindings on the module record.
  5. If an annotate function is provided, it is called with a compiler-provided ffi object and the namespace. The function returns descriptors that generate struct constructors, out-parameter wrappers, or opaque type accessors.
  6. During C code generation the header is #included and calls are emitted as direct C function calls. Duplicate #include directives are automatically deduplicated.
  7. Any associated source files are compiled and linked automatically.

Structs by Value

C structs defined with typedef struct or struct Name are automatically mapped to milang records. Fields are accessible by name:

// vec.h
typedef struct { double x; double y; } Vec2;
Vec2 vec2_add(Vec2 a, Vec2 b);
double vec2_dot(Vec2 a, Vec2 b);

v = import' "vec.h" ({src = "vec.c"})

a = {x = 1.0; y = 2.0}
b = {x = 3.0; y = 4.0}
result = v.vec2_add a b    -- {x = 4.0, y = 6.0}
dot = v.vec2_dot a b       -- 11.0

Records passed to struct-taking functions are converted to C structs using C99 compound literals. Struct return values are converted to milang records with the same field names.
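A possible vec.c matching the header above is sketched here. The implementation is an assumption, since the chapter only shows the declarations, and example_sum is a hypothetical helper that illustrates the compound-literal style of call, not actual compiler output.

```c
/* vec.h declarations, repeated so the sketch is self-contained */
typedef struct { double x; double y; } Vec2;
Vec2 vec2_add(Vec2 a, Vec2 b);
double vec2_dot(Vec2 a, Vec2 b);

/* vec.c (hypothetical implementation) */
Vec2 vec2_add(Vec2 a, Vec2 b) {
    /* A C99 compound literal builds the result struct in place */
    return (Vec2){ .x = a.x + b.x, .y = a.y + b.y };
}

double vec2_dot(Vec2 a, Vec2 b) {
    return a.x * b.x + a.y * b.y;
}

/* Hypothetical call site: both arguments are compound literals,
   mirroring the records a and b from the milang example */
Vec2 example_sum(void) {
    return vec2_add((Vec2){ .x = 1.0, .y = 2.0 },
                    (Vec2){ .x = 3.0, .y = 4.0 });
}
```

Under these definitions the results agree with the comments in the milang example: the sum has x = 4.0 and y = 6.0, and the dot product is 11.0.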

Enum Constants

C enum definitions in headers are exposed as Int constants on the module record:

// color.h
typedef enum { RED = 0, GREEN = 1, BLUE = 2 } Color;
int color_value(Color c);

c = import' "color.h" ({src = "color.c"})

c.RED                    -- 0
c.GREEN                  -- 1
c.color_value c.BLUE     -- uses enum constant as argument

Both typedef enum { ... } Name; and enum Name { ... }; are supported. Auto-incrementing values work as in C.
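The auto-increment rule mentioned above is the standard C one: an enumerator without an explicit value is the previous enumerator plus one, starting from 0. The sketch below shows it with a hypothetical enum Level, alongside an assumed body for color_value (the chapter does not show color.c).

```c
/* color.h declarations, repeated so the sketch is self-contained */
typedef enum { RED = 0, GREEN = 1, BLUE = 2 } Color;
int color_value(Color c);

/* Hypothetical second enum using auto-increment:
   LEVEL_LOW = 0, LEVEL_MID = 5, LEVEL_HIGH = 6 */
enum Level { LEVEL_LOW, LEVEL_MID = 5, LEVEL_HIGH };

/* color.c (assumed behavior): return the enumerator's integer value */
int color_value(Color c) {
    return (int)c;
}
```

If such a header were imported, LEVEL_HIGH would be expected to appear on the module record as the Int constant 6.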

#define Constants

Integer #define constants in headers are extracted and exposed on the module record:

// limits.h
#define MAX_SIZE 1024
#define FLAG_A 0x01
#define FLAG_B 0x02

lib = import "limits.h"
lib.MAX_SIZE     -- 1024
lib.FLAG_A       -- 1

Both decimal and hexadecimal integer constants are supported. Non-integer #defines (macros, strings, expressions) are silently skipped.

FFI Annotations

For C libraries where the compiler needs more information than the header alone provides — struct constructors, out-parameters, or opaque type accessors — use the annotate option. The annotation function receives a compiler-provided ffi object and the imported namespace:

lib = import' "point.h" ({
  src = "point.c"
  annotate = ann
})

ann ffi ns = values =>
  ffi.struct "Point" |> ffi.field "x" "int32" |> ffi.field "y" "int32"

Struct Annotations

ffi.struct declares a C struct type and its fields. This generates:

  1. A constructor function (make_Name) that creates a milang record from arguments
  2. Automatic type patching: C functions using opaque pointers (void*) to this struct type are rewritten to accept/return milang records with proper struct layout

ann ffi ns = values =>
  ffi.struct "Point" |> ffi.field "x" "int32" |> ffi.field "y" "int32"

After annotation, lib.make_Point 10 20 creates a Point struct value, and functions like lib.point_sum that take Point* parameters accept records directly.

Available field types: "int8", "int16", "int32", "int64", "uint8", "uint16", "uint32", "uint64", "float32", "float64", "string", "ptr".

Out-Parameter Annotations

C functions that return values through pointer parameters can be annotated so they return a record of results instead:

ann ffi ns = values =>
  ffi.out "point_components" |> ffi.param 1 "int32" |> ffi.param 2 "int32"

This transforms point_components(point, &out1, &out2) into a function that returns {out1 = ..., out2 = ...}. Parameter indices are 0-based positions in the C function signature.
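On the C side, a function suited to this annotation might look like the following sketch. The signature is an assumption based on the call shape point_components(point, &out1, &out2); the Point layout follows the x and y int32 fields declared in the struct annotation.

```c
#include <stdint.h>

/* Assumed layout, matching the ffi.struct "Point" annotation */
typedef struct { int32_t x; int32_t y; } Point;

/* Parameter 0 is the input; parameters 1 and 2 are out-parameters,
   which is why the annotation uses ffi.param 1 and ffi.param 2 */
void point_components(const Point *p, int32_t *out_x, int32_t *out_y) {
    *out_x = p->x;
    *out_y = p->y;
}
```

From C the caller supplies the two destinations by address; after annotation the milang caller passes only the point and reads the results from the returned record.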

For out-parameters that are opaque pointers (e.g., union types or structs you don’t want to map field-by-field), use "ptr:TypeName" as the ctype:

ann ffi ns = values =>
  ffi.out "poll_event" |> ffi.param 0 "ptr:SDL_Event"

This allocates sizeof(SDL_Event) on the heap, passes the pointer directly (not &), and returns it as an opaque handle. You can then use ffi.opaque with ffi.accessor to read fields from the returned pointer.

Nullable Pointers (Nothing → NULL)

Any C function parameter typed as a pointer (void*, struct*, etc.) accepts Nothing as a milang value, which is passed as NULL to C. This is useful for optional parameters:

-- SDL_RenderCopy(renderer, texture, srcrect, dstrect)
-- Pass Nothing for srcrect and dstrect to use the full texture
lib.SDL_RenderCopy renderer texture Nothing Nothing

Non-Nothing values are passed as normal pointers. This works for all pointer parameters automatically — no annotation is needed.

Opaque Type Annotations

For opaque struct types (where you can’t or don’t want to map the full struct layout), use ffi.opaque with ffi.accessor to generate field accessor functions:

ann ffi ns = values =>
  ffi.opaque "Event"
    |> ffi.accessor "type" "int32"
    |> ffi.accessor "detail.code" "int32"

This generates accessor functions on the module: lib.Event_type event and lib.Event_detail_code event. Dot-separated paths (like detail.code) access nested struct fields. The accessor functions are compiled to inline C that casts the opaque pointer and reads the field directly.
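The cast-and-read technique can be sketched in plain C as below. The Event layout is hypothetical, and the real generated accessors presumably also convert to and from milang values, which is omitted here.

```c
#include <stdint.h>

/* Hypothetical layout of the library's Event type */
typedef struct {
    int32_t type;
    struct { int32_t code; } detail;
} Event;

/* Accessor for the path "type": cast the opaque handle, read the field */
static inline int32_t Event_type(void *handle) {
    return ((Event *)handle)->type;
}

/* Accessor for the nested path "detail.code" */
static inline int32_t Event_detail_code(void *handle) {
    return ((Event *)handle)->detail.code;
}
```

The dot in "detail.code" maps to an ordinary nested member access, while the generated function name uses an underscore instead.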

When the opaque type is defined in a separate header from the API header being imported, use ffi.include to specify the type definition header:

ann ffi ns = values =>
  ffi.opaque "SDL_Event"
    |> ffi.include "<SDL2/SDL_events.h>"
    |> ffi.accessor "type" "uint32"

System includes (angle brackets) and quoted includes are both supported. The include is added to the generated C code so the type is visible for the accessor cast.

Combining Annotations

Multiple annotations can be declared in a single values => block:

ann ffi ns = values =>
  ffi.struct "Point" |> ffi.field "x" "int32" |> ffi.field "y" "int32"
  ffi.out "decompose" |> ffi.param 1 "int32" |> ffi.param 2 "int32"
  ffi.opaque "Handle" |> ffi.accessor "id" "int64"

Callbacks (Function Pointers)

Milang functions can be passed to C functions that expect function pointers. Define the callback type with typedef:

// callback.h
typedef long (*IntFn)(long);
long apply_fn(IntFn f, long x);
long apply_twice(IntFn f, long x);

cb = import' "callback.h" ({src = "callback.c"})

cb.apply_fn (\x -> x * 2) 21        -- 42
cb.apply_twice (\x -> x + 1) 0      -- 2

-- Named functions work too
square x = x * x
cb.apply_fn square 7                 -- 49

The compiler generates a trampoline that converts between C calling conventions and milang’s closure-based evaluation. Multi-parameter callbacks are supported:

typedef long (*BinFn)(long, long);
long fold_range(BinFn f, long init, long n);

add_fn acc i = acc + i
cb.fold_range add_fn 0 10    -- sum of 0..9 = 45

Callbacks are pinned as GC roots, so they remain valid even if the C library stores and calls them later (e.g., event handlers in GUI frameworks).
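The chapter does not show callback.c, so the sketch below is one assumed implementation that is consistent with the results in the comments above. The three static functions are C stand-ins for the milang callbacks and are hypothetical.

```c
typedef long (*IntFn)(long);
typedef long (*BinFn)(long, long);

/* callback.c (assumed implementation) */
long apply_fn(IntFn f, long x)    { return f(x); }
long apply_twice(IntFn f, long x) { return f(f(x)); }

/* Folds f over i = 0 .. n-1, starting from init */
long fold_range(BinFn f, long init, long n) {
    long acc = init;
    for (long i = 0; i < n; i++) {
        acc = f(acc, i);
    }
    return acc;
}

/* C stand-ins for the milang lambdas and add_fn */
static long times_two(long x)       { return x * 2; }
static long plus_one(long x)        { return x + 1; }
static long add_fn(long acc, long i) { return acc + i; }
```

From the library's point of view a milang callback is just another function pointer of the typedef'd type; the trampoline is what makes that true.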

Security Considerations

C code bypasses milang’s capability model — a C function can perform arbitrary IO, allocate memory, or call system APIs regardless of what capabilities were passed to the milang caller. Use the following flags to restrict FFI access:

  • --no-ffi — disallow all C header imports. Any import "*.h" will fail.
  • --no-remote-ffi — allow local .mi files to use C FFI, but prevent URL-imported modules from importing C headers. This stops remote code from escaping the capability sandbox through native calls.

These flags are especially important when running untrusted or third-party milang code.

Memory Management for FFI Pointers

By default, pointers returned from C functions are unmanaged — they become MI_POINTER values that are never freed. For short-lived programs this is fine, but long-running programs will leak memory.

Automatic cleanup with gc_manage

Use the gc_manage builtin to associate a pointer with a finalizer function. The garbage collector will automatically call the finalizer when the value becomes unreachable:

ffi = import' "mylib.h" ({src = "mylib.c"})

-- Wrap the pointer with its free function
obj = gc_manage (ffi.myobj_create 42) ffi.myobj_free

-- Use normally — FFI functions accept managed pointers transparently
val = ffi.myobj_read obj

-- No manual free needed! The GC handles cleanup.

gc_manage takes two arguments:

  1. A pointer value (from an FFI allocation function)
  2. A native function (the FFI free/destructor function)

It returns an MI_MANAGED value that behaves identically to a regular pointer in FFI calls — all existing FFI functions work without modification.
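For context, the three C functions used in the example could be implemented as in this sketch. Only myobj_create's signature appears later in this chapter; the MyObj layout and the signatures of myobj_read and myobj_free are assumptions. myobj_free takes void* here so that it matches the void (*)(void*) finalizer shape used by the runtime.

```c
#include <stdlib.h>

/* Assumed layout; the chapter only shows that MyObj has a value field */
typedef struct { long value; } MyObj;

/* Allocation function: its result is what gets passed to gc_manage */
MyObj *myobj_create(long val) {
    MyObj *obj = malloc(sizeof *obj);
    if (obj != NULL) {
        obj->value = val;
    }
    return obj;
}

/* Ordinary accessor; works the same on managed and unmanaged pointers */
long myobj_read(const MyObj *obj) {
    return obj->value;
}

/* Destructor used as the finalizer */
void myobj_free(void *obj) {
    free(obj);
}
```

Without gc_manage, the caller would be responsible for calling myobj_free exactly once.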

When to use gc_manage

  • Use it for objects that your code allocates and should own: arrays, buffers, file handles, database connections.
  • Don’t use it for pointers returned by C functions that manage their own lifetime (e.g., stdin, shared library handles).

C-level registration

FFI implementors who prefer to register finalizers in C can use mi_managed() directly:

// In your FFI .c file: declare the runtime function
#include <stdlib.h>  // malloc

extern MiVal mi_managed(void *ptr, void (*finalizer)(void*));

MyObj* myobj_create(long val) {
    MyObj *obj = malloc(sizeof(MyObj));
    obj->value = val;
    // This version still returns a raw pointer, so nothing is registered
    // with the GC here. Wrap the result with gc_manage from milang, or
    // return the result of mi_managed() from a wrapper that returns MiVal.
    return obj;
}

Note: When using C-level mi_managed(), the FFI wrapper function should return MiVal directly rather than a raw pointer. In most cases, using gc_manage from milang code is simpler.

How the GC works

Milang uses a mark-sweep garbage collector for runtime-allocated environments (MiEnv) and managed pointers:

  • Init-time allocations (prelude setup, AST construction) use a bump-allocated arena and are never freed.
  • Eval-time allocations (created during program execution) use a malloc-based pool with a free list.
  • The GC runs automatically after every 100,000 environment allocations.
  • During the mark phase, the GC traces all reachable values from the current environment root, including closures, managed pointers, and pinned callback closures.
  • During the sweep phase, unreachable environments are returned to the pool, and unreachable managed pointers have their finalizers called.

For tail-recursive programs, memory stays bounded — the GC reclaims environments from completed iterations.
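The mark and sweep phases described above can be illustrated with a toy collector. This is a generic sketch of the technique, not milang's runtime: the Node type, the fixed-size pool, and all function names are hypothetical, and each node holds at most one outgoing reference to keep the mark phase short.

```c
#include <stddef.h>

#define POOL_SIZE 8

typedef struct Node {
    int in_use;                 /* slot is currently allocated */
    int marked;                 /* set during the mark phase */
    struct Node *child;         /* one outgoing reference, or NULL */
    void *payload;              /* foreign pointer, or NULL */
    void (*finalizer)(void *);  /* called on payload when swept */
} Node;

static Node pool[POOL_SIZE];

/* Take the first free slot; returns NULL when the pool is full. */
Node *node_alloc(Node *child, void *payload, void (*finalizer)(void *)) {
    for (int i = 0; i < POOL_SIZE; i++) {
        if (!pool[i].in_use) {
            pool[i] = (Node){ 1, 0, child, payload, finalizer };
            return &pool[i];
        }
    }
    return NULL;
}

/* Mark phase: follow child links from the root. */
static void mark(Node *n) {
    while (n != NULL && !n->marked) {
        n->marked = 1;
        n = n->child;
    }
}

/* Sweep phase: release unmarked slots and run their finalizers.
   Returns the number of slots released. */
int gc_collect(Node *root) {
    int freed = 0;
    mark(root);
    for (int i = 0; i < POOL_SIZE; i++) {
        Node *n = &pool[i];
        if (n->in_use && !n->marked) {
            if (n->finalizer != NULL) {
                n->finalizer(n->payload);
            }
            n->in_use = 0;
            freed++;
        }
        n->marked = 0;  /* reset for the next cycle */
    }
    return freed;
}

/* Example finalizer: counts how many times it has run. */
static int finalized = 0;
static void count_finalizer(void *p) { (void)p; finalized++; }
```

A node that is unreachable from the root is released on the next collection and its finalizer runs once, which mirrors how an unreachable managed pointer is described as being cleaned up.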