math philosophy - How do they verify a verifier of formalized proofs?

Saturday, 30 August 2008

math philosophy - How do they verify a verifier of formalized proofs?

I think the question you are asking - about how can we trust a formal proof - is a very important question. I have spent considerable effort developing software to specifically address this question. You touch on various things I have concentrated on.

It is true that various systems prominent in the formalisation of mathematics - including the HOL systems (HOL4, ProofPower HOL, HOL Light), Isabelle and Coq - are built according to the "LCF style", which means that all deduction must go via a relatively small kernel of trusted source code (implementing the primitive inference rules), and that this greatly reduces the risks of producing unsound proofs on these systems. It would also not be an exaggeration to say that almost everyone working in formal proof is happy with this situation. Indeed, probably most (but not those working on the above systems) feel that resorting to the LCF style is overkill and an unnecessary drain on user execution time and on development effort.

However, there are 3 major problems with this status quo:

A) Most "LCF-style" systems do not implement the LCF-style kernel idea as purely as may be expected. Some systems have back doors to creating "proved theorems", such as importing the statements (but not the proofs) of previously proved results from disk, and trust that the user will not abuse this. Also, to reduce execution time, most systems implement various derivable inference rules as primitives, multiplying up the size of the trusted source code. Also, the kernels of most systems typically incorporate large amounts of supporting code (e.g. for organising theories) and are not particularly clearly implemented, and so are difficult to review. It should be noted that HOL Light does not suffer from any of these problems.

B) The trusted aspects of an LCF-style system is NOT limited to the design/implementation of its LCF-style kernel. Like in all systems, it at least also includes the design of the concrete syntax and the implementation of the pretty printer, since, in practice, the user will only view results displayed in concrete syntax via the pretty printer. However, each system has problems with its concrete syntax and/or pretty printer that allows misleading information to be displayed to the user (e.g. by using irregular variables names, or names that are overloaded). I have found many ways of appearing to prove "false" in these systems! Also, depending on how the system is used, the parser is arguably also a trusted component.

C) The importance of the human process of checking that the intended result has in fact been proved (I call this "proof auditing") is generally greatly underestimated, and in practice often not even carried out at all. As you rightly point out, the axioms and definitions used in a formal proof need to be carefully checked, as well as the statement of the theorem itself. I have known subtle mistakes in definitions to render real formal proofs on real projects completely invalid.

I have developed an open source theorem prover called HOL Zero, that addresses issues A-C above and is designed for use in proof auditing and generally to be as trustworthy as possible. It has a watertight inference kernel, a well-designed concrete syntax and pretty printer, and the source code aims to be as clear and well-commented as possible. However, it should be noted that it does not have the advanced automatic and/or interactive proof facilities of the existing systems I mention above, and is not suited to developing large formal proofs. HOL Zero can be downloaded from here (it needs OCaml and a Unix-like operating system):

http://www.proof-technologies.com/holzero

The concept of checking one system using another is not only philosophically reassuring, but also of pressing need (due to the above issues A-C). As you say, what is needed is the ability to port proof objects between systems (the so called "de Bruijn criterion"). Strictly speaking, the de Bruijn criterion is as you state your requirement - the ability to capture a proof as an object (e.g. a text file) - rather than the LCF style (but let's not get too philosophical about the equivalence of these approaches). Anyway, there are some practical issues here:

The "dumb program" that you refer to needs to be surprisingly sophisticated, otherwise it loses much of its purpose. If it is just an LCF-style kernel, the data it outputs will be too slow to review for large projects. As you say, it needs to be human-friendly - a decent pretty printer is a practical necessity. Also, to make proof exporting (see next item) work in practice, it needs to work at least at a slightly higher level than the kernel, and so some supporting theory is required to be built. So yes, a dumb program is required, which should be as easy to understand as possible, but it is more challenging to build than a few lines of code.

Capturing proof objects in a suitably efficient way is a non-trivial exercise. The work that others mention above successfully port things like the base system of HOL Light, but are completely incapable of handling something the scale of Hale's HOL Light proof of the Jordan Curve Theorem, let alone Hale's Flyspeck Project (using HOL Light to check his non-formal proof of the Kepler Conjecture).

Some sort of neat and trivial correspondence of equivalent theory between systems is useful. This is better than importing language statements as huge expressions, in terms of some highly-complex embedding of one notation inside another, which would either greatly increase the complexity of the checking system or make its human usage difficult.

HOL Zero is primarily aimed at the "dumb program" proof checker role. The idea is it will import and replay proofs that have been exported from other HOL systems. I have implemented a proof exporting mechanism which, unlike others' mechanisms, handles with ease large proofs such as Hale's Jordan Curve proof and the (almost-complete) Flyspeck Project. I am currently working on a proof importing mechanism for HOL Zero. (Note that a former proof importing mechanism I developed worked on an old version of HOL Zero and successfully ported Hale's Jordan Curve proof and Harrison's proof of the consistency of the HOL Light kernel.)

Answer Desk

Saturday, 30 August 2008

math philosophy - How do they verify a verifier of formalized proofs?

No comments:

Post a Comment