Proof assistants are systems for developing and verifying formal proofs using a computer. Such a formal proof is called a *formalization.* Proof assistants are in between proof checkers, where the user needs to give all details of the proofs, and automated theorem provers, where the computer creates the proofs all by itself. Proof assistants are used both for proofs in computer science - the theory of computer science, as well as the correctness of hardware, software and protocols - as well as for mathematical proofs. The best known proof assistants come from two families: dependent type theory (Coq, Lean, Agda, systems for homotopy type theory), in which the mathematics generally is intuitionistic, and higher order logic (Isabelle, HOL, PVS), where the mathematics is classical.

Proof assistants is an active research subject. One direction of research is to use machine learning to make the systems able to be smarter when helping the user (the so-called "hammers"). Another direction is the development of large mathematical libraries, for example in Coq (mathcomp), Lean (mathlib) and HOL. And then there are many people applying proof assistants in the verification of compilers, operating systems, and processor design.

A prerequisite for working with proof assistants is a good knowledge of logic, and familiarity with lambda calculus. However, more important is a willingness to build intricate formal proofs in a programming-language-like language.

Here the project is simple: select some nice mathematical result (or part of a textbook) and a proof assistant, and then formalize the result using the proof assistant. The research question then would be to study in what way the mathematics would need to be structured to fit the proof assistant, in what way the mathematics deviates from the informal English version, and what technologies in the proof assistant could be developed that would make the formalization easier. Possibly developing such a "proof tool" could be part of the project.

Contact: Freek Wiedijk

The Why3 system allows its user to verify the correctness of software by annotating the source with invariants, and then generating *verification conditions* that then are proved. These proofs can be manual, but generally are delegated to automated theorem provers like SAT solvers. However, the correctness of the verification conditions with respect to the semantics of the programming language is not verified in a proof assistant. The project would be to build a simplified version of Why3 that works inside a proof assistant, in the sense that when the system accepts an annotated program, it actually proves the correctness with respect to the semantics of the language.

Contact: Freek Wiedijk

The Kenzo computer algebra system is a Lisp program that computes homology and homotopy groups in algebraic topology. There is work on formalizing the correctness of the underlying theory in Isabelle and Coq. However, ideally, the system would be implemented *inside* a system like Coq, so one could prove the correctness of the actual code. As a first step, a version of the system could be developed in a typed language like ML or Haskell. Ideally, the types then would correspond to notions from algebraic topology. Possibly a next step could be to port that code to Coq, and to index the type in a way that correspond to the algebraic topology.

Contact: Freek Wiedijk

The Risch algorithm decides whether a symbolic integral has a closed form of a certain kind. This algorithm is very complicated and has been implemented, but there are various cases, and possibly all cases together have never been implemented in a single framework. The project would be to develop a version of this algorithm, or as a start just the underlying theory, in a proof assistant (or as a first step in a typed function programming language).

Contact: Freek Wiedijk

The holy grail of the use of machine learning in formalization, is a system that takes an ordinary English proof in latex, and automatically converts it to a formalization. A first step in this is to delineate where the boundary is between the language of the *proofs* (phrases like: "now assume that") and the language of *terms*, *types* and *statements*. The project would be to take a corpus of English language proofs, annotate part of that manually with these boundaries and train a neural network on this. Then the rest of the corpus would be annotated by taking the output of that network and correcting it to obtain a "ground truth". A possible outcome from this work might be a grammar of the various ways that proof steps are phrased in English, together with statistics about this.

Contact: Freek Wiedijk

There are various efforts to formalize the semantics of the C programming language. However, thus far the C preprocessor has not been taken into account in this work. The project would be to take a much simplified version of the C preprocessor, formalize its behavior, and study how a specification of the .h files, like in the C standard, might be formally defined.

Contact: Freek Wiedijk

Automated theorem provers produce a proof of a *Skolemized* version of the statement that was the input. The theorem that is used here is that a statement has a proof iff the Skolemized version of the statement has a proof. The project would be to make this theorem effective, and implement a program that takes as input the proof of the Skolemized statement, and outputs a proof of the original version. This might give a way to convert the output of the automated theorem prover to natural language, or to a version of the proof that can be checked in a proof assistant, without needing the axiom of choice.

Contact: Freek Wiedijk

We propose the following method for defining the partial recursive functions (over natural numbers) as functions from streams to streams. We consider streams over {0,S,c}, where c denotes a computation step (a tau step in process algebra terminology), so the number n is represented by c...c S c...c S ... S c... c 0 ..., so n times a S followed by a 0, separated by c and followed by arbitrary what. A function takes streams and produces a stream; given a (code for a) partial recursive function f of arity m we define its interpretation as a function (in Coq) taking n streams and producing a stream. All this should be defined in Coq Now we want to prove formally (in Coq), that the set of total functions is not enumerable: there is no function g that takes a (code of a) function f and returns true if f is total and false if f is not.

Contact: Herman Geuvers.