Commit 4621699e by Mark Cohen

paper: done!

parent 3d4b5276
Showing with 498 additions and 163 deletions
@@ -17,3 +17,25 @@
title = "Traits: The Formal Model",
year = "2003"
}
@article{cardelli,
author = "Luca Cardelli",
title = "Basic Polymorphic Typechecking",
journal = "Science of Computer Programming",
volume = "8",
number = "2",
year = "1988"
}
@article{damas-milner,
author = "Luis Damas and Robin Milner",
title = "Principal type-schemes for functional programs",
journal = "POPL",
year = "1982"
}
@article{shunting,
author = "Edsger W. Dijkstra",
title = "Making a Translator for ALGOL-60",
year = "1961"
}
\documentclass[11pt, twocolumn]{article}
\usepackage{amssymb, amsmath, amsthm, amsrefs}
\usepackage{setspace, fullpage}
\usepackage[margin = 0.75in]{geometry}
\onehalfspacing
\setlength\parindent{12pt}
\usepackage[english = american]{csquotes}
\MakeOuterQuote{"}
@@ -37,13 +37,12 @@
\renewcommand{\tt}{\texttt}
\setlength\columnsep{33pt}
\usepackage{hyperref}
\title{ForML: Iterative Programming with Type Classes and Associated Types}
\author{
Mark Cohen\\
Department of Computer Science\\
@@ -54,37 +53,59 @@
\begin{document}
\maketitle
\begin{abstract}
Iterative programming is a powerful, highly intuitive way to work with collections that imposes a
low cognitive load. Thinking in terms of \tt{map}, \tt{filter}, and \tt{fold} can certainly be just
as intuitive, but only after substantially more training; especially for introductory-level
programmers, iteration is more developer-friendly. We present a design for pure polymorphic
iteration in the ML discipline. We base this design in a language modeled after ML, extended with
type classes and associated types. This paper presents the motivation, design, intuition, and
formalism behind this language. Advised by Adam Shaw.
\end{abstract}
\section{Introduction}
Collections are the basis of software development, but despite this fact, they are often rather
painful to work with. Consider the following Python program:
\begin{lstlisting}
for (k, v) in db:
  if query(k):
    res.append((k, v))
return dict(res)
\end{lstlisting}
This is highly intuitive: the gap between the programmer's mental model and the actual program is
minimal. This particular snippet iterates over all key-value pairs in a dictionary, performs some
query on each key, and appends every pair whose key passes the query to some response list. Then, it
converts that response list back into a dictionary.
Python requires minimal underlying infrastructure to enable overloading: all that is required of the
programmer is that \tt{db} have the \verb|__iter__| method implemented.
Unfortunately, this program exposes several of Python's pain points. First, what if \tt{db} is not
something that can be iterated over? Since Python does not perform static type checking, that
mistake causes a run-time error. Similarly, run-time errors arise if \tt{res} is not something that
can be appended to or converted into a dictionary. Especially pernicious are run-time errors in the
last case, when converting between collections.
Rust is a relatively young language that brings strong, static type checking and modern type theory
to systems programming. Based on traits \cite{traits}, Rust can achieve a level of elegance similar
to Python's, but with the backing of a strong type system. Consider the following Rust translation
of the above Python example:
\begin{lstlisting}
for (k, v) in db.iter() {
  if query(k) {
    res.push((k, v));
  }
}
res.into_iter().collect::<HashMap<K, V>>()
\end{lstlisting}
This program reads roughly the same as the equivalent program in Python. However, behind the scenes
is an \tt{Iterator} trait that provides the infrastructure necessary to verify the type of this
program. Additionally, rather than calling overloaded collection constructors, Rust opts for an
explicitly annotated \tt{collect} method, itself leveraging traits for the sake of type safety.
Contrast the above two examples with the following equivalent Standard ML (SML) program:
\begin{lstlisting}
let
  fun arrToDict' d [] = d
@@ -95,31 +116,27 @@ in
  arrToDict (filter (fn (k, _) => query k) d)
end
\end{lstlisting}
In the land of ML, we still have the benefit of static type checking. However, there is no
infrastructure for reusing code; the bulk of the example above is actually concerned with how to
convert between lists and dictionaries. Additionally, as previously discussed, thinking in terms of
\tt{map}, \tt{filter}, and \tt{fold} often comes with a higher cognitive load than thinking in terms
of iteration, especially for new programmers.
\section{Design}
To enable iteration, we propose a structure built around a complementary pair of type classes and
four keyword constructs. All of this lives at the surface level: the programmer is free to implement
instances of these type classes, and to use the keyword constructs as specified. Before diving deep
into the definitions, we will lay out the infrastructure and prior research on top of which these
constructs are built.
\begin{figure*}[ht]
\begin{center}
\begin{tabular}{c}
\begin{lstlisting}[mathescape=true]
class Iterable $\alpha$ where
  type item
  val next: $\alpha$ $\arrow$ (item * $\alpha$) option
@@ -135,11 +152,53 @@
for <var> in <collection> $\arrow$ <ty>
  collect <expr>
  pass
  bail
\end{lstlisting}
\end{tabular}
\end{center}
\caption{Design for iteration constructs}
\label{design}
\end{figure*}
\subsection{Infrastructure}
Present in this design are two well-developed concepts not present in ML: type classes and
associated types.
Type classes, originally presented by Wadler \& Blott \cite{typeclasses}, allow clean encapsulation
of behaviors of individual collections. Chiefly, the presence of type classes in our language
obviates the combinatorial explosion of "convert X to Y" functions that would otherwise be necessary
in ML.
Traits were the logical first research direction, as Rust is the closest existing language to our
proposed design. However, upon further investigation, the original presentation of traits by
Sch\"arli et al. \cite{traits} was unsuitable as a foundation for this research; Sch\"arli et al.
developed traits as subordinate to object-oriented classes, for the purpose of sharing signatures
and methods between closely related but non-hierarchical class structures. This means that, for
example, traits cannot imply one another like type classes can (think of the type signature for pair
equality in Haskell, \tt{(Eq a, Eq b) => Eq (a, b)}). Another limitation this imposes is that the
programmer must own a type to implement a trait for it.
Associated types, as originally presented by Chakravarty et al. \cite{associated-types}, allow for
linking types together. For the purposes of this design, we only require associated types of kind
\tt{*}: the infrastructure presented by Chakravarty et al. is substantially more powerful, but only
the simplest case of it is necessary here.
For the sake of consistency, we will embed the results of Chakravarty et al. \cite{associated-types}
in the notation of Wadler \& Blott \cite{typeclasses}. An associated type, intuitively speaking,
will be treated as an "overloaded" type: in the notation of Wadler \& Blott \cite{typeclasses}, we
would write $\tt{item} ::_i \alpha \tt{ list} \bs{\alpha}$ to signify that the associated type
\tt{item} can be instantiated for $\alpha \tt{ list}\text{s}$, translating to $\alpha$.
We would like to operate as much as possible at the surface level; accordingly, after discussing the
above design, we will show its translation to the intermediate language of Wadler \& Blott
\cite{typeclasses}, at which point we will allow their system to take over the remaining translation
down to ML.
With these definitions in mind, we can now dissect the design proposed in figure \ref{design}.
\subsection{The \tt{Iterable} type class}
The \tt{Iterable} type class, unsurprisingly, marks a type as iterable. The class contains an
associated type called \tt{item}, which corresponds to the element type of the collection. For an
$\alpha$ \tt{list}, \tt{item} is $\alpha$. For a \tt{string}, \tt{item} is \tt{char}.
@@ -149,8 +208,8 @@ method encodes both loop variable bindings and loop bounds: at each iteration, t
bound to the first half of the pair returned by \tt{next}. When \tt{next} returns \tt{None}, we
conclude execution.
\subsection{The \tt{Collectible} type class}
The \tt{Collectible} type class is something a little different: it marks a type as able to be built
up piece-by-piece. Like \tt{Iterable}, the class contains an associated type \tt{item}, which
functions exactly the same as in \tt{Iterable}.
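For readers who prefer a concrete implementation, the two classes can be sketched in Rust (which the paper already uses for comparison), with traits and associated types playing the role of type classes. This is only an illustrative sketch: the trait and method names below are ours, chosen to avoid colliding with the standard library, and it deliberately ignores Rust's real \tt{Iterator} and \tt{FromIterator} machinery.

```rust
// Hypothetical Rust rendering of the paper's two type classes as traits with
// associated types. `next_item` and `default_acc` are our names, chosen to
// avoid colliding with the standard library's `Iterator::next` and `Default`.
trait Iterable: Sized {
    type Item;
    // The paper's `next`: either an element plus the rest, or nothing.
    fn next_item(self) -> Option<(Self::Item, Self)>;
}

trait Collectible {
    type Item;
    // The paper's `default`: the empty accumulator.
    fn default_acc() -> Self;
    // The paper's `insert`: add one element to the accumulator.
    fn insert(self, item: Self::Item) -> Self;
}

impl<T> Iterable for Vec<T> {
    type Item = T;
    fn next_item(mut self) -> Option<(T, Vec<T>)> {
        if self.is_empty() {
            None
        } else {
            Some((self.remove(0), self))
        }
    }
}

impl<T> Collectible for Vec<T> {
    type Item = T;
    fn default_acc() -> Self {
        Vec::new()
    }
    fn insert(mut self, item: T) -> Self {
        self.push(item);
        self
    }
}
```

Note that a single type (here \tt{Vec}) can implement both traits, which is exactly the situation the design expects: most collections are both iterable and collectible.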
@@ -166,159 +225,413 @@ the elements of the collection argument and the item argument.
\subsection{The \tt{for} keyword construct}
The \tt{for} keyword takes an identifier \tt{<var>}, which represents the loop variable. \tt{<var>}
will be bound to each element of \tt{<collection>} in succession before each evaluation of
\tt{<expr>}. After performing inference on the body of \tt{<expr>}, \tt{<var>} must have a type that
unifies with \tt{<collection>}'s \tt{item} type.
The notation $\arrow \tt{<ty>}$ signifies that the programmer wishes to \emph{build} a collection of
type \tt{<ty>}: all loops in this discipline must explicitly build a collection.

This may seem restrictive; however, given that the programmer can implement the \tt{Collectible}
type class for types that are not traditionally considered "collections", it does not impose undue
restriction.
There are important type restrictions on the various forms in \tt{for}: \tt{<collection>} must be an
\tt{Iterable}, and \tt{<ty>} must be a \tt{Collectible}. Lastly, intuitively, \tt{<expr>} must unify
with \tt{<ty>}'s \tt{item} type (modulo a special "glue" type that we will soon discuss): that is,
\tt{<expr>} must be something that can be \tt{insert}ed into an internal accumulator of type
\tt{<ty>}.
\subsection{The \tt{collect}, \tt{pass}, and \tt{bail} keyword constructs}
The \tt{collect} keyword takes only an expression \tt{<expr>}, which is implicitly assumed to
contain references to the loop variable \tt{<var>}, though of course there are situations where it
may not. \tt{collect}, intuitively, signals that the value of \tt{<expr>} should be added to the
collection being built.
The \tt{pass} keyword's sole purpose is to make \tt{filter} expressions well-founded. Consider the
following program to filter out all zero elements in a list, which would be infeasible without
\tt{pass}:
\begin{lstlisting}[mathescape=true]
for x in [1, 2, 0, 3] $\arrow$ int list
  if zero? x
  then pass
  else collect x
\end{lstlisting}
The \tt{bail} keyword, like \tt{pass}, is nothing more than a placeholder. Its responsibility is to
make expressions that "bail" early (i.e. terminate iteration when a certain condition is reached)
well-founded. Consider the following program to gobble tokens until a closing parenthesis is
reached, which would not be feasible to write without \tt{bail}:
\begin{lstlisting}[mathescape=true]
val toks = [Num 4, Plus, Num 3, RParen, Times, Num 2]
for tok in toks $\arrow$ token list
  if tok = RParen
  then bail
  else collect tok
\end{lstlisting}
\tt{collect}, \tt{pass}, and \tt{bail} will be typed using a special "glue" type, to avoid certain
nonsense situations; we will discuss the "glue" type later. For now, understand \tt{pass} and
\tt{bail} to impose no type restrictions, and to be defined to unify with any \tt{collect}
expression.
\begin{figure*}[ht]
\begin{center}
\begin{tabular}{c}
\begin{lstlisting}[mathescape=true]
let
  for :: $\forall \gamma . \forall \iota .$
    (item ::$_i$ $\gamma$ $\bs$ item$_\gamma$).
    (item ::$_i$ $\iota$ $\bs$ item$_\iota$).
    (next ::$_i$ $\gamma$ $\arrow$ (item$_\gamma$ * $\gamma$) option $\bs$ next$_\gamma$).
    (insert ::$_i$ $\iota$ $\arrow$ item$_\iota$ $\arrow$ $\iota$ $\bs$ insert$_\iota$).
    $\gamma$ $\arrow$ $\iota$ $\arrow$ (item$_\gamma$ $\arrow$ item$_\iota$ collect) $\arrow$ $\iota$
  = fn collection iota expr =>
      case (next collection)
        of None => iota
         | Some (item, rest) =>
             case (expr item)
               of Bail => iota
                | Pass => for rest iota expr
                | Collect item' => for rest (insert iota item') expr
  collect :: $\forall \iota .$ (item ::$_i$ $\iota$ $\bs$ item$_\iota$). item$_\iota$ $\arrow$ item$_\iota$ collect
  = fn item => Collect item
  pass :: $\forall \iota .$ $\iota$ collect = Pass
  bail :: $\forall \iota .$ $\iota$ collect = Bail
  acc :: $\forall \iota .$ (default ::$_i$ $\iota$ $\bs$ default$_\iota$). $\iota$ = default
in
  for <collection> acc (fn <var> => <expr>)
end
\end{lstlisting}
\end{tabular}
\caption{Translation of loop design to intermediate language}
\label{translation}
\end{center}
\end{figure*}

\section{Translation}
Figure \ref{translation} shows the general translation of a loop construct to Wadler \& Blott's
\cite{typeclasses} intermediate language.

Here in the type signatures of \tt{collect}, \tt{pass}, and \tt{bail} we see the "glue" type alluded
to earlier, $\alpha \tt{ collect}$. Under the hood, an $\alpha \tt{ collect}$ is exactly isomorphic
to an $\alpha \tt{ option}$; the reason for its existence is to keep these syntactic forms distinct
from, for example, the forms for \tt{option}s.

The type predicates in \tt{for} encode the requirement that $\gamma$ be an \tt{Iterable} and that
$\iota$ be a \tt{Collectible}: note the use of \tt{next} and \tt{insert} from each of those type
classes. Under those requirements, the body of \tt{for} is a simple recursive construct, relying on
the other three keywords to control its execution.

\tt{collect}, \tt{pass}, and \tt{bail} have rather uninteresting definitions: as discussed, all that
is necessary to ensure here is that they remain distinct from other syntactic forms (so that the
execution of \tt{for} is uniquely controlled by these three constructs).

Next, we define a value \tt{acc} that simply serves to hold the value of \tt{default}, from the
instance of \tt{Collectible} on \tt{<ty>}.

The last important piece of this translation is that we wrap \tt{<expr>} in a function, binding the
loop variable. As discussed previously, we assume \tt{<expr>} to contain references to the loop
variable; here is where that open binding is closed.

Given these five definitions, we can translate any loop written in the surface language to an
expression in the intermediate language. From the intermediate language, we leverage the existing
infrastructure of Wadler \& Blott \cite{typeclasses} to translate the rest of the way down, at which
point we are left with an ordinary ML expression.
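To make the recursion concrete, the translated \tt{for} can be sketched as a self-contained Rust program. This is our illustration, not the paper's formalism: the accumulator is fixed to \tt{Vec}, the "glue" type appears as an ordinary \tt{enum} (named \tt{Step} here), and all identifiers are ours.

```rust
// Sketch of the recursive `for` translation. The loop body returns a Step
// value, so `collect`, `pass`, and `bail` become its three constructors.
enum Step<T> {
    Collect(T), // add an element to the accumulator, keep going
    Pass,       // skip this element, keep going
    Bail,       // stop iterating, return the accumulator as-is
}

fn for_loop<A, B, F>(items: &[A], acc: Vec<B>, body: &F) -> Vec<B>
where
    F: Fn(&A) -> Step<B>,
{
    match items.split_first() {
        // `next` returned "None": conclude execution.
        None => acc,
        Some((x, rest)) => match body(x) {
            Step::Bail => acc,
            Step::Pass => for_loop(rest, acc, body),
            Step::Collect(y) => {
                let mut acc = acc;
                acc.push(y); // plays the role of the paper's `insert`
                for_loop(rest, acc, body)
            }
        },
    }
}
```

Called with the zero-filtering body from the earlier example, this returns the input list with zeros removed; a body that answers \tt{Bail} returns whatever accumulator has been built so far.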
\section{Sample programs}
Given the design-based nature of this research, we would be remiss not to discuss some of the more
elegant programs this design can produce. The simplest form of iteration to encode is a \tt{map},
which our design expresses readily:
\begin{lstlisting}[mathescape=true]
for x in [1, 2, 3, 4] $\arrow$ string list
  collect (toString x)
\end{lstlisting}
In fact, \tt{map} can even be implemented in this discipline:
\begin{lstlisting}[mathescape=true]
val map = fn f xs =>
for x in xs $\arrow$ $\alpha$ list
collect (f x)
\end{lstlisting}
\tt{filter} expressions can also be expressed, as we saw in the motivation of the \tt{pass} and
\tt{bail} keywords. Below is the implementation of \tt{filter} itself:
\begin{lstlisting}[mathescape=true]
val filter = fn pred xs =>
  for x in xs $\arrow$ $\alpha$ list
    if pred x
    then collect x
    else pass
\end{lstlisting}
\tt{fold} itself cannot be implemented in this discipline, due to the use of a type to provide
the initial accumulator value (through the \tt{default} type class member) rather than requiring
that the programmer pass the value themselves. However, we can readily express \tt{fold}
expressions:
\begin{lstlisting}[mathescape=true]
instance Collectible string where
  type item = string
  val default = ""
  val insert = concat

val strs = ["hi ", "world", "!"]
for x in strs $\arrow$ string
  collect x
\end{lstlisting}
In this discipline, \tt{fold} expressions will usually have a very simple body: the interesting
parts come in the type annotations for the result collection. This imposes a much lower cognitive
load than ordinary \tt{fold} expressions, which are notoriously hard to grasp (especially for new
programmers).
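The string example above can be mirrored in plain Rust to make the point concrete: the accumulator type fixes both the initial value (the paper's \tt{default}) and the combining step (the paper's \tt{insert}), so the loop body only ever collects. A hypothetical sketch, with names ours:

```rust
// "Fold as collection-building": String supplies the empty accumulator and
// the combining step, so the loop body is trivial.
fn concat_all(strs: &[&str]) -> String {
    let mut acc = String::new(); // plays the role of `default`
    for s in strs {
        acc.push_str(s); // plays the role of `insert`
    }
    acc
}
```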
Now with a command of the power of this discipline, let us build a small lexer for a prefix
calculator language. The grammar is as follows:
\begin{align*}
\nu &= 1, 2, 3, \ldots \tag{numbers}\\
\chi &= + \mid - \mid * \mid \div \tag{operators}\\
e &= (\chi \; \nu \; \nu) \mid \nu \tag{expressions}
\end{align*}
Assume an intuitive ML datatype representing expressions in this language. Now, we can implement our
scanner:
\begin{lstlisting}[mathescape=true]
val scan = fn s =>
  for c in s $\arrow$ token list
    case c
      of #"(" => collect LParen
       | #")" => collect RParen
       | #"+" => collect Add
       | #"-" => collect Sub
       | #"*" => collect Mul
       | #"$\div$" => collect Div
       | (numeric n) => collect (Num n)
       | #"$\text{\textvisiblespace}$" => pass
       | _ => bail
\end{lstlisting}
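For comparison, here is a hypothetical Rust analogue of this scanner, with \tt{continue} and \tt{break} standing in for \tt{pass} and \tt{bail}, and \tt{/} in place of $\div$. The token names follow the grammar; everything else is our invention.

```rust
// Hypothetical Rust analogue of the paper's scanner. `continue` plays the
// role of `pass`, and `break` plays the role of `bail`.
#[derive(Debug, PartialEq)]
enum Token {
    LParen,
    RParen,
    Add,
    Sub,
    Mul,
    Div,
    Num(u32),
}

fn scan(s: &str) -> Vec<Token> {
    let mut toks = Vec::new();
    for c in s.chars() {
        match c {
            '(' => toks.push(Token::LParen),
            ')' => toks.push(Token::RParen),
            '+' => toks.push(Token::Add),
            '-' => toks.push(Token::Sub),
            '*' => toks.push(Token::Mul),
            '/' => toks.push(Token::Div),
            // Single-digit numbers only, matching the grammar sketch.
            d if d.is_ascii_digit() => toks.push(Token::Num(d.to_digit(10).unwrap())),
            ' ' => continue, // pass
            _ => break,      // bail
        }
    }
    toks
}
```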
\section{Limitations}
The use of type classes imposes one major limitation on this design: it is not possible to have
multiple instances of a class on the same type in scope simultaneously. For example, a programmer
desiring to iterate over a search tree in both pre-order and in-order fashion must work around our
design; each kind of traversal would require its own instance of \tt{Iterable}. The same limitation
is present with \tt{Collectible}: for example, one cannot have a definition of \tt{insert} on lists
that represents both \tt{cons} and \tt{append} (i.e. operating on an $\alpha$ and an $\alpha \tt{
list}$ respectively) in scope simultaneously. Likely, this problem can be solved by simply limiting
the scope of instances to allow overriding.
Another limitation is that in this current design, nested loops are not entirely well-founded.
Consider the following program to take the Cartesian product of two lists:
\begin{lstlisting}[mathescape=true]
for x in xs $\arrow$ ($\alpha$ * $\beta$) list
  for y in ys $\arrow$ ?
    collect (x, y)
\end{lstlisting}
What is the collection type of the inner loop? The trouble is that in nested loops, the programmer
usually does not want to build up an inner collection, but rather use \emph{all} levels of nesting
in tandem to build up one outer collection. Nested iteration will need a careful treatment in future
work.
\bibliographystyle{unsrt}
\bibliography{main}
\appendix

\section{\tt{PRED-INST}: Making translation algorithmic}
\begin{figure*}[ht]
\begin{center}
\begin{tabular}{c}
\begin{lstlisting}[mathescape=true]
over eq $::$ $\forall \alpha .$ $\alpha$ $\arrow$ $\alpha$ $\arrow$ bool in
inst eq $::$ int $\arrow$ int $\arrow$ bool = intEq in
inst eq $::$ char $\arrow$ char $\arrow$ bool = charEq in
inst eq $::$ $\forall \alpha . \forall \beta .$
(eq $::$ $\alpha$ $\arrow$ $\alpha$ $\arrow$ bool).
(eq $::$ $\beta$ $\arrow$ $\beta$ $\arrow$ bool).
($\alpha$ * $\beta$ $\arrow$ $\alpha$ * $\beta$ $\arrow$ bool)
= $\lambda$x.$\lambda$y.
(eq (fst x) (fst y)) andalso
(eq (snd x) (snd y))
in
eq (1, #"a") (1, #"a")
\end{lstlisting}
\end{tabular}
\caption{Pair equality, formalized (from Wadler \& Blott \cite{typeclasses})}
\label{pair-intermediate}
\end{center}
\end{figure*}
While investigating type classes as a research foundation, we discovered that one of the inference
rules in Wadler \& Blott's paper \cite{typeclasses}, \tt{PRED}, is non-algorithmic: it matches any
term with a predicated type, rather than a specific syntactic form. As a consequence, the rule must
be attempted at every point in the inference process, lest the translated output be incorrect.
Take the formalized definition of pair equality in figure \ref{pair-intermediate}: intuitively, it
should translate to the representation in figure \ref{pair-pred-inst-translation}. However,
following Wadler \& Blott's rules strictly \cite{typeclasses}, the \tt{SPEC}, \tt{ABS}, and
\tt{COMB} rules are triggered first, and \tt{PRED} is not invoked until deep inside the expression,
at which point the dictionary-passing bindings are put in the wrong place (figure
\ref{pair-pred-translation}).
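The difference between the two translations is easy to see in ordinary code: in the intended output, each instance of \tt{eq} becomes a plain function, and the pair instance takes its component dictionaries as explicit leading parameters. A Rust sketch of this dictionary-passing style, with all names ours:

```rust
// Dictionary-passing sketch: each eq "instance" is a plain function, and the
// pair instance receives the component dictionaries as leading arguments.
// Binding them deep inside the body instead is what the non-algorithmic
// PRED rule produces.
fn eq_int(a: &i32, b: &i32) -> bool {
    a == b
}

fn eq_char(a: &char, b: &char) -> bool {
    a == b
}

fn eq_pair<A, B>(
    eq_a: impl Fn(&A, &A) -> bool, // dictionary for A
    eq_b: impl Fn(&B, &B) -> bool, // dictionary for B
    x: &(A, B),
    y: &(A, B),
) -> bool {
    eq_a(&x.0, &y.0) && eq_b(&x.1, &y.1)
}
```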
In order to arrive at the correct translation, one must either make precarious decisions about the
order of rule application, or create an algorithmic rule to augment the system. We propose the
latter.
First, to formulate this rule, we propose a predicate-stripping relation $\pi$, defined below:
\begin{align*}
\pi(\varnothing) &= \; \varnothing \\
\pi(\rho_1 . \rho_2 \ldots \rho_n) &= (\rho_n, \; \rho_1 . \rho_2 \ldots \rho_{n - 1})
\end{align*}
Intuitively, this relation splits off the last predicate in a list, returning it paired with the
remaining predicates.
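Over an explicit list representation of predicates, $\pi$ can be sketched as a one-line Rust function; the \tt{Option} models the empty case, and the representation is ours:

```rust
// Sketch of the predicate-stripping relation: split off the last predicate,
// returning it together with the remaining prefix; None models the empty case.
fn strip_last<P>(mut preds: Vec<P>) -> Option<(P, Vec<P>)> {
    preds.pop().map(|last| (last, preds))
}
```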
Next, and again to ease the formulation of our new rule, we propose a slight modification to the
type grammar presented by Wadler \& Blott \cite{typeclasses}:
\begin{align*}
\tau &= \tau \arrow \tau' \mid \alpha \mid \chi (\tau_1 \ldots \tau_n) \tag{bare types}\\
\rho &= \varnothing \mid (x :: \tau) . \rho \tag{type predicates}\\
\sigma &= \varnothing \mid \forall \alpha . \sigma \tag{type schemes}\\
& \sigma \rho \tau \tag{types}
\end{align*}
All that is important to note here is the lack of fall-through cases for $\rho$ and $\sigma$, and
the necessity of including all three $\sigma$, $\rho$, and $\tau$ when expressing a type.
\begin{figure*}[ht]
\begin{center}
\begin{tabular}{c}
\begin{lstlisting}[mathescape=true]
let eq$_\tt{int}$ = intEq in
let eq$_\tt{char}$ = charEq in
let eq$_{\alpha * \beta}$ = $\lambda$x.$\lambda$y.
($\lambda$eq$_\alpha$.eq$_\alpha$ (fst x) (fst y)) andalso
($\lambda$eq$_\beta$.eq$_\beta$ (snd x) (snd y))
in
eq$_{\alpha * \beta}$ eq$_\tt{int}$ eq$_\tt{char}$ (1, #"a") (1, #"a")
\end{lstlisting}
\end{tabular}
\caption{Pair equality, translated using \tt{PRED} (from Wadler \& Blott \cite{typeclasses})}
\label{pair-pred-translation}
\end{center}
\end{figure*}
\begin{figure*}[ht]
\begin{center}
\begin{tabular}{c}
\begin{lstlisting}[mathescape=true]
let eq$_\tt{int}$ = intEq in
let eq$_\tt{char}$ = charEq in
let eq$_{\alpha * \beta}$ = $\lambda$eq$_\alpha$.$\lambda$eq$_\beta$.$\lambda$x.$\lambda$y.
(eq$_\alpha$ (fst x) (fst y)) andalso
(eq$_\beta$ (snd x) (snd y))
in
eq$_{\alpha * \beta}$ eq$_\tt{int}$ eq$_\tt{char}$ (1, #"a") (1, #"a")
\end{lstlisting}
\end{tabular}
\caption{Pair equality, translated using \tt{PRED-INST}}
\label{pair-pred-inst-translation}
\end{center}
\end{figure*}
Now, to arrive at the desired translation, we propose a \tt{PRED-INST} rule that specifically
matches \tt{inst} declarations with predicated types, defined in figure \ref{pred-inst}.
\begin{figure*}[ht!]
\begin{center}
\[
\begin{prooftree}
\hypo{(x ::_o \_ \_ \_) \in A}
\infer[no rule] 1 {(x' ::_o \_ \_ \_) \in A}
\infer[no rule] 1 {\pi(\rho) = ((x' :: \tau'), \bar{\rho})}
\infer[no rule] 1{A, (x' :: \tau' \bs x_{\tau'}) \proves e :: \sigma \bar{\rho} \; \tau \bs \bar{e}}
\infer[no rule] 1 {A, (x' :: \tau' \bs x_{\tau'}) \proves e' :: \sigma'' \rho'' \tau'' \bs \bar{e}'}
\infer 1 {
A \proves (\tt{inst } x :: \sigma \rho \tau = e \tt{ in } e') ::
\sigma'' \rho'' \tau'' \bs (\tt{inst } x ::
\sigma \bar{\rho} \tau = \lambda x'_{\tau'} . \bar{e} \tt{ in } \bar{e}')}
\end{prooftree}
\]
\caption{\tt{PRED-INST}: an algorithmic update of \tt{PRED}}
\label{pred-inst}
\end{center}
\end{figure*}
\section{LUCA-lang}
Most of the working hours spent on this research went into building an experimental
compiler for the language we propose. As the academic quarter drew to a close, work on this front
had to be stopped in favor of finalizing the research, but the language served as an indispensable
tool for shaping understanding, especially of topics in type inference and compiler design.
The language is named \textbf{LUCA}, after Luca Cardelli, whose paper \cite{cardelli} explaining the
seminal work in polymorphic type inference by Damas \& Milner \cite{damas-milner} was instrumental
in the genesis of this research. Though the compiler is not currently in a functioning state, the
work that went into it was of paramount importance. Below is an overview of the most edifying
experiences of this process.
\subsection{Polymorphic type inference}
The process of type unification (that is, the process by which $\forall \alpha . \alpha \arrow
\alpha$ is found to be compatible with $\tt{int} \arrow \tt{int}$) constitutes a substantial leap of
understanding beyond the standard undergraduate curriculum in programming language theory. The
implementation of Algorithm W \cite{damas-milner} eventually grew into a full-fledged set of
compiler phases.
Annotation walks an abstract syntax tree and produces an annotated AST; the process itself is "dumb"
in that it does not perform any inference, but rather serves as an aid for the next phase,
collection. Collection walks an annotated AST and produces a constraint set, containing pairs of
types that must unify for the program to have a valid type. Unification is the final phase, which
folds over a constraint set, generating a substitution which, when applied, will generate the
\emph{most general} type for the program \cite{damas-milner}. Each of these three phases is
implemented in its own SML structure and can be found in \tt{annotate.sml}, \tt{collect.sml}, and
\tt{unify.sml}, respectively.
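In outline, the unification phase might look like the sketch below. This is a simplification of
what \tt{unify.sml} implements: the \tt{ty} datatype here collapses arrows into ordinary type
constructors, and composition of the resulting substitution is left naive.
\begin{lstlisting}
datatype ty = TyVar of string
            | TyCon of string * ty list

(* a substitution maps type variables to types *)
type subst = (string * ty) list

fun apply s (TyVar a) =
      (case List.find (fn (b, _) => a = b) s of
         SOME (_, t) => t
       | NONE => TyVar a)
  | apply s (TyCon (c, ts)) =
      TyCon (c, map (apply s) ts)

fun occurs a (TyVar b) = a = b
  | occurs a (TyCon (_, ts)) =
      List.exists (occurs a) ts

(* fold over the constraint set, threading each new
 * binding through the remaining constraints *)
fun unify [] = []
  | unify ((TyVar a, t) :: rest) =
      if t = TyVar a then unify rest
      else if occurs a t then raise Fail "occurs check"
      else (a, t) :: unify (map (fn (u, v) =>
             (apply [(a, t)] u, apply [(a, t)] v)) rest)
  | unify ((t, TyVar a) :: rest) =
      unify ((TyVar a, t) :: rest)
  | unify ((TyCon (c1, ts1), TyCon (c2, ts2)) :: rest) =
      if c1 = c2 andalso length ts1 = length ts2
      then unify (ListPair.zip (ts1, ts2) @ rest)
      else raise Fail "constructor mismatch"
\end{lstlisting}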
\subsection{Monadic programming}
At a certain point, the repetitious nature of recursive-descent parsing became too much to bear.
Monadic programming offered an elegant solution. Below is a snippet of positively beautiful code
from the \textbf{LUCA} parser:
\begin{lstlisting}
and absExpr ts = (
litKey "fn" >>
(many1 identifier) >>= (fn xs =>
lit Scan.FatArrow >>
expr >>= (fn e =>
pure (foldr AST.Abs e xs)))) ts
\end{lstlisting}
The monadic sequencing operators \tt{>>=} and \tt{>>}, together with \tt{pure}, allow computations
to be built "by description". This parser reads as a very minimal translation of the term grammar
for abstractions ($\tt{fn } x \ldots \Rightarrow e$). Listing out the steps in plain English, one
might write:
\begin{itemize}
\item Gobble the "fn" keyword. If that works,
\item Parse at least one identifier (the function arguments). If that works,
\item Gobble an arrow. If that works,
\item Parse the function body. If that works,
\item Assemble a curried stack of function ASTs, using the list of identifiers and function body
parsed above.
\end{itemize}
The beauty of monadic programming is that each step of plain English translates directly - almost
word for word - to a line of monadic code. In addition to this novel method of describing
algorithms, reimplementing Haskell's \tt{Control.Monad} library provided a glimpse into some of the
more powerful constructs available to ML programmers, chiefly \tt{Functor}. Monadic programming is
used throughout the entire \textbf{LUCA} compiler, with several different monadic types. The most
interesting uses of monadic programming can be found in \tt{parse.sml} and \tt{collect.sml}.
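For reference, the core of such a parser monad can be sketched in a few lines; this is a simplified
model of the combinators used above, not the exact definitions in \tt{parse.sml}:
\begin{lstlisting}
(* a parser consumes a token list and either
 * succeeds with a value and the remaining
 * tokens, or fails with NONE *)
type ('a, 't) parser =
       't list -> ('a * 't list) option

fun pure x = fn ts => SOME (x, ts)

infix >>= >>
fun p >>= f = fn ts =>
      case p ts of
        SOME (x, ts') => f x ts'
      | NONE => NONE
fun p >> q = p >>= (fn _ => q)
\end{lstlisting}
Failure of any step short-circuits the whole chain through \tt{NONE}, which is exactly the "if that
works" behavior listed above.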
\subsection{Shunting yard algorithm}
When the time came to parse type annotations, the primary limitation of monadic parsing became a
major roadblock: the inability to cleanly parse infix and postfix grammars. The \textbf{LUCA}
compiler has a separate parser for type expressions, which uses Dijkstra's shunting yard algorithm
\cite{shunting}, so named for its visual resemblance to a railroad shunting yard. The algorithm
encodes operator precedence - not an easy task in monadic parsing - by maintaining a separate output
queue and operator stack. Dijkstra's original presentation \cite{shunting} described a conversion
from mixed-fixity notation to reverse Polish notation (RPN). However, since we required conversion
to an AST, we modified the algorithm, changing the output queue to an output stack. "Pushing" to the output queue, then,
instead means applying the current operator to the top $n$ elements of the output stack (where $n$
is the operator's arity), then placing the result of that application on the top of the output
stack. By the time execution terminates, the output stack should be a singleton of one AST,
representing the parsed expression. The \textbf{LUCA} compiler's type expression parser can be found
in \tt{tyexpr.sml}.
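The key modification can be sketched as follows, shown for a binary type operator such as the
arrow; the datatype and names here are illustrative rather than the definitions in
\tt{tyexpr.sml}:
\begin{lstlisting}
datatype tyexpr = TyVar of string
                | TyApp of string * tyexpr list

(* "push to output" for a binary operator: pop the
 * top two operands off the output stack and push
 * the applied node; in general an n-ary operator
 * pops n operands *)
fun reduceBinary (oper, output) =
      case output of
        rhs :: lhs :: rest =>
          TyApp (oper, [lhs, rhs]) :: rest
      | _ => raise Fail "malformed type expression"
\end{lstlisting}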
\end{document}