This is an implementation for a small, strongly-typed functional language with parametric polymorphism, user-defined datatypes, pattern matching, and general recursion. The implementation uses no dependencies beyond the OCaml standard library.
Done:
- Definition of the syntax. See
syntax.ml. - Evaluation of terms, including pattern-matching and recursion, and top-level
declarations. See
eval.ml. - Pretty-printing of terms. See
pretty.ml. - Unification. See
unify.ml. - Typechecking. See
typecheck.ml. - Some example programs. See
examples/ski.ev. - External syntax. This is the syntax recognized by the parser.
- Scopechecking to translate external syntax to internal syntax. This pass ensures that all references are valid (constructors, other functions, types, variables, etc.)
- Parsing. The parser is a handwritten recursive descent parser using combinators. The error messages aren't great!
- Bytecode compilation. The bytecode is described in
bytecode.ml, for a two-stack virtual machine (one for locals/parameters and one for the call stack)
Todo:
- tail-call optimization
- optimized pattern-matching compilation (current strategy is linear / naive)
- efficient interpreter implemented in C++ or Rust
- recursive local bindings (only top-level recursive functions work at the moment)
Syntax: definitions of different ASTssynext.mlexternal syntax: this is close to what the user writessynint.mlinternal syntax: what we typechecksynclo.mlclosed syntax: an intermediate representation generated by closure conversion.
Loc: source locationsPretty: pretty-printing of internal syntax, closed syntax, and bytecode instructions.Parser: hand-written parser combinators that generate external syntax from a string.Scopecheck: converts external syntax to internal syntax, disambiguates names (locals vs references), scope-checks all identifiers.TMVar: unification variables, aka "type metavariables" (TMVars); type substitutionsUnify: unification of typesTypecheck: Hindley-Milner type inference using unificationEval: evaluator for internal syntax, turns internal syntax into a value.Close: closure conversion hoists function bodies occurring within functions to the top-level and replaces them with anMkClonode.Bytecode: description of the bytecode instructions emitted by compilation.Compile: compiles closed syntax into bytecode instructions.Interpret: bytecode interpreter.
The main flow of data is: text -> Parser -> Scopecheck -> Typecheck -> Close
Currently, applications of constructors, statically known functions, and primitive operations are automatically eta-expanded, but that eta-expansion introduces new abstractions without appropriately adjusting the indices of bound variables, so it produces bogus code. Manually eta-expand everything as a workaround for now.