Haskell proof checker for FOL with equality

Question

Haskell proof checker for FOL with equality

306 Views Asked by Bumbble Comm At 25 Mar 2026 - 2:20

Ignoring issues of efficiency, is this a correct implementation in Haskell of a proof checker for first order logic with equality? I am especially concerned about the subIn, admitsVar and admitsTerm functions. Also, do any of the predicate rules need to check that a variable is not free in the context?

{-
Hilbert style first order mathematical logic with equality using named variables.

Modified from "Handbook of Practical Logic and Automated Reasoning" by John Harrison.
https://www.cl.cam.ac.uk/~jrh13/atp/index.html
-}


{-
Func "c" [] : A constant named "c"
Func "f" [v] : A function named "f" of one variable v
-}
data Term = Var String
          | Func String [Term]
            deriving (Show, Eq)

{-
Pred "P" [] : A propositional variable named "P"
Pred "Eq" [s, t] : s = t
-}
data Formula = Pred String [Term]
             | Not Formula
             | Imp Formula Formula
             | Forall String Formula
               deriving (Show, Eq)

type Context = [Formula]

data Theorem = Theorem Context Formula
               deriving (Show, Eq)

{-
From "First Order Mathematical Logic" by Angelo Margaris:

An occurrence of a variable $v$ in a formula $P$ is bound if and only if
it occurs in a subformula of $P$ of the form $\forall v Q$. An occurrence
of $v$ in $P$ is free if and only if it is not a bound occurrence. The
variable $v$ is free or bound in $P$ according as it has a free or bound
occurrence in $P$.

If $P$ is a formula, $v$ is a variable, and $t$ is a term, then $P(t/v)$ is
the result of replacing each free occurrence of $v$ in $P$ by an occurrence
of $t$.

If $v$ and $u$ are variables and $P$ is a formula, then $P$ admits $u$ for $v$
if and only if there is no free occurrence of $v$ in $P$ that becomes a
bound occurrence of $u$ in $P(u/v)$. If $t$ is a term, then $P$ admits $t$ for
$v$ if and only if $P$ admits for $v$ every variable in $t$.
-}

-- occursIn v t = there exists an occurrence of v in t.
occursIn :: String -> Term -> Bool
occursIn v (Var v') = v == v'
occursIn v (Func _ terms) = any (occursIn v) terms

-- freeIn v p = there exists an occurrence of v in p that is free.
freeIn :: String -> Formula -> Bool
freeIn v (Pred _ terms) = any (occursIn v) terms
freeIn v (Not p) = freeIn v p
freeIn v (Imp p q) = freeIn v p || freeIn v q
freeIn v (Forall v' p) = v /= v' && freeIn v p

{-
subIn p t v = p(t/v) = the result of replacing each free occurrence of
v in p by an occurrence of t.
-}
subIn :: Formula -> Term -> String -> Formula
subIn (Pred name terms) t v = Pred name (map (\t' -> subInTerm t' t v) terms) where
  {-
  subInTerm t' t v = t'(t/v) = the result of replacing each occurrence of
  v in t' by an occurrence of t.
  -}
  subInTerm :: Term -> Term -> String -> Term
  subInTerm (Var v') t v = if v == v' then t else Var v'
  subInTerm (Func name terms) t v = Func name (map (\t' -> subInTerm t' t v) terms)
subIn (Not p) t v = Not (subIn p t v)
subIn (Imp p q) t v = Imp (subIn p t v) (subIn q t v)
subIn (Forall v' p) t v = if v == v' then Forall v' p else Forall v' (subIn p t v)

{-
admitsVar p u v = p admits u for v = there is no free occurrence of 
v in p that becomes a bound occurrence of u in p(u/v).
-}
admitsVar :: Formula -> String -> String -> Bool
admitsVar p u v = go p u v [] where
  go :: Formula -> String -> String -> [String] -> Bool
  go (Pred _ terms) u v binders = not (any (occursIn v) terms)
                               || elem v binders
                               || notElem u binders
  go (Not p) u v binders = go p u v binders
  go (Imp p q) u v binders = go p u v binders && go q u v binders
  go (Forall v' p) u v binders = go p u v (v' : binders)

{-
admitsTerm p t v = p admits for v every variable in t.
-}
admitsTerm :: Formula -> Term -> String -> Bool
admitsTerm p (Var u) v = admitsVar p u v
admitsTerm p (Func _ terms) v = all (\t -> admitsTerm p t v) terms


-- Assumption

assume :: Context -> Formula -> Theorem
assume gamma p = if elem p gamma then Theorem gamma p else error "assume"


-- Propositional calculus

-- |- (p -> (q -> p))
prop_1 :: Context -> Formula -> Formula -> Theorem
prop_1 gamma p q = Theorem gamma (p `Imp` (q `Imp` p))

-- |- ((p -> (q -> r)) -> ((p -> q) -> (p -> r)))
prop_2 :: Context -> Formula -> Formula -> Formula -> Theorem
prop_2 gamma p q r = Theorem gamma ((p `Imp` (q `Imp` r)) `Imp` ((p `Imp` q) `Imp` (p `Imp` r)))

-- |- ((~p -> ~q) -> (q -> p))
prop_3 :: Context -> Formula -> Formula -> Theorem
prop_3 gamma p q = Theorem gamma (((Not p) `Imp` (Not q)) `Imp` (q `Imp` p))

-- |- p & |- (p -> q) => |- q
mp :: Theorem -> Theorem -> Theorem
mp (Theorem gamma p) (Theorem gamma' (p' `Imp` q))
       = if gamma == gamma' && p == p' then Theorem gamma q else error "mp"
mp _ _ = error "mp"


-- Predicate calculus

-- |- p => |- forall v. p
gen :: Theorem -> String -> Theorem
gen (Theorem gamma p) v = Theorem gamma (Forall v p)

-- |- ((forall v. (p -> q)) -> (forall v. p) -> (forall v. q))
pred_1 :: Context -> String -> Formula -> Formula -> Theorem
pred_1 gamma v p q = Theorem gamma ((Forall v (p `Imp` q)) `Imp` ((Forall v p) `Imp` (Forall v q)))

-- |- (forall v. p -> p [t/v]) provided p admits t for v
pred_2 :: Context -> String -> Formula -> Term -> Theorem
pred_2 gamma v p t = if admitsTerm p t v then Theorem gamma ((Forall v p) `Imp` (subIn p t v)) else error "pred_2"

-- |- (p -> forall v. p) provided v is not free in p
pred_3 :: Context -> String -> Formula -> Theorem
pred_3 gamma v p = if not (freeIn v p) then Theorem gamma (p `Imp` (Forall v p)) else error "pred_3"


-- Equality

-- |- t = t
eq_1 :: Context -> Term -> Theorem
eq_1 gamma t = Theorem gamma (Pred "Eq" [t, t])

-- |- s1 = t1 ==> ... ==> sn = tn ==> f(s1,..,sn) = f(t1,..,tn)
eq_2 :: Context -> String -> [Term] -> [Term] -> Theorem
eq_2 gamma f ss ts = if length ss == length ts then
                     -- eqs = [s1 = t1, ..., sn = tn]
                     let eqs = zipWith (\s t -> (Pred "Eq" [s, t])) ss ts in
                     -- z = f(s1,..,sn) = f(t1,..,tn)
                     let z = Pred "Eq" [(Func f ss), (Func f ts)] in
                     Theorem gamma (foldr Imp z eqs)
                     else error "eq_2"

-- |- s1 = t1 ==> ... ==> sn = tn ==> P(s1,..,sn) ==> P(t1,..,tn)
eq_3 :: Context -> String -> [Term] -> [Term] -> Theorem
eq_3 gamma p ss ts = if length ss == length ts then
                     -- eqs = [s1 = t1, ..., sn = tn]
                     let eqs = zipWith (\s t -> (Pred "Eq" [s, t])) ss ts in
                     -- z = P(s1,..,sn) ==> P(t1,..,tn)
                     let z = (Pred p ss) `Imp` (Pred p ts) in
                     Theorem gamma (foldr Imp z eqs)
                     else error "eq_3"

```

Original Q&A

There are 1 best solutions below

**Bumbble Comm** · Accepted Answer

I think code review may offer better responses.

But here's a thought worth exploring: consider unit testing these functions you are nervous about. Let me walk through how I would unit test subIn. Here's your code:

{-
subIn p t v = p(t/v) = the result of replacing each free occurrence of
v in p by an occurrence of t.
-}
subIn :: Formula -> Term -> String -> Formula
subIn (Pred name terms) t v = Pred name (map (\t' -> subInTerm t' t v) terms) where
  {-
  subInTerm t' t v = t'(t/v) = the result of replacing each occurrence of
  v in t' by an occurrence of t.
  -}
  subInTerm :: Term -> Term -> String -> Term
  subInTerm (Var v') t v = if v == v' then t else Var v'
  subInTerm (Func name terms) t v = Func name (map (\t' -> subInTerm t' t v) terms)
subIn (Not p) t v = Not (subIn p t v)
subIn (Imp p q) t v = Imp (subIn p t v) (subIn q t v)
subIn (Forall v' p) t v = if v == v' then Forall v' p else Forall v' (subIn p t v)

Using HUnit, I would make the following observations:

Observation 1: Refactor subInTerm. OK, you've got two functions intertwined. This makes it harder to test, so I'd suggest refactoring subInTerm out as a separate function. (You could choose not to, but that would force us to write more tests to check subIn (Pred P terms) t v works as expected.)

Observation 2. subInTerm (Var v') t v has a single if expression and a single condition. We should have one unit test for the case when the condition is true, another unit test when the condition is false.

I would then suggest:

import Test.HUnit

subInMatchingVarTest1 = TestCase (assertEqual
  "subInTerm x y z = x,"
  (subInTerm (Var "x") (Var "y") "z")
  (Var "x"))

subInMatchingVarTest2 = TestCase (assertEqual
  "subInTerm x y x = y,"
  (subInTerm (Var "x") (Var "y") "x")
  (Var "y"))

{- more test cases to come -}

subInTermTests = TestList [
  TestLabel "VarMismatch" subInMatchingVarTest1,
  TestLabel "VarMatch" subInMatchingVarTest2,
  {-- more cases to come... --}
]

subInTerm Case 2: Functions. The other case for subInTerm simply replaces the arguments of a function, otherwise leaving the function untouched. The arguments of a Func can either be a Var or a Func, but by structural induction we only need to worry about the case when Func's parameters are Var instances. (If a Func parameter is another Func, then by structural induction things will work out fine provided the base cases hold; i.e., if a Func of Var arguments substitutes fine.)

Thus I'd add another few unit tests (although one suffices, paranoia drives me to add a few more):

subInMatchingFuncTest1 = TestCase (assertEqual
  "subInTerm f (x y z) w x = f (w y z),"
  (subInTerm (Func "f" [Var "x", Var "y", Var "z"]) (Var "w") "x")
  (Func "f" [Var "w", Var "y", Var "z"]))

subInMatchingFuncTest2 = TestCase (assertEqual
  "subInTerm f (x y z) w y = f (x w z),"
  (subInTerm (Func "f" [Var "x", Var "y", Var "z"]) (Var "w") "y")
  (Func "f" [Var "x", Var "w", Var "z"]))

subInMatchingFuncTest3 = TestCase (assertEqual
  "subInTerm f (x y z) w z = f (x y w),"
  (subInTerm (Func "f" [Var "x", Var "y", Var "z"]) (Var "w") "z")
  (Func "f" [Var "x", Var "y", Var "w"]))

subInTermTests = TestList [
  TestLabel "VarMismatch" subInMatchingVarTest1,
  TestLabel "VarMatch" subInMatchingVarTest2,
  subInMatchingFuncTest1,
  subInMatchingFuncTest2,
  subInMatchingFuncTest3
]

Remark 2.1. You may want to also consider a unit test like to handle a case like subInTerm (Func "f" [(Var "x")]) (Var "y") "f", just to make sure you don't end up with Func "y" [(Var "x")] or something similar.

When generating test cases, think about how you can break your code and then try to create a test case for each situation.

Observation 3. Of the three remaining cases for subIn, only subIn (Forall v' p) t v needs testing (the Not and Imp cases are satisfied by structural induction). There are two subcases to consider for subIn (Forall ...) ...:

subIn (Forall (Var "x") p) t "x", i.e., when the bound variable v' in the Forall coincides with the variable x being substituted; and
subIn (Forall (Var "y") p) t "x" when the bound variable v' differs from the variable x being substituted in.

subInForallTest1 = TestCase (assertEqual
  "subIn (Forall x P[t]) t' x = Forall x P[t],"
  (subIn (Forall (Var "x") Pred "P" [Func "f" [(Var "x"),
                                               (Var "y"),
                                               (Var "z")]])
    (Var "w")
    "x")
  (Forall (Var "x") Pred "P" [Func "f" [(Var "x"), (Var "y"), (Var "z")]]))

subInForallTest2 = TestCase (assertEqual
  "subIn (Forall x P[t]) t' y = Forall x P[subIn t t' y],"
  (subIn (Forall (Var "x") Pred "P" [Func "f" [(Var "x"),
                                               (Var "y"),
                                               (Var "z")]])
    (Var "w")
    "y")
  (Forall (Var "x") Pred "P" [Func "f" [(Var "x"), (Var "w"), (Var "z")]]))

subInForallTest3 = TestCase (assertEqual 
  "subIn (Forall x P[t]) t' z = Forall x P[subIn t t' z]," 
  (subIn (Forall (Var "x") Pred "P" [Func "f" [(Var "x"),
                                               (Var "y"),
                                               (Var "z")]])
    (Var "w")
    "z")
  (Forall (Var "x") Pred "P" [Func "f" [(Var "x"), (Var "y"), (Var "w")]]))

{- other tests placed here -}

subInTests = TestList [subInForallTest1,
  subInForallTest2,
  subInForallTest3]

Now, given that we tested the subInTerm function, the subIn (Pred ...) case would be handled. But I encourage a little caution, and you may want to add some tests to handle this case (handle special cases like subIn (Pred (Func "f" [])) ... should be invariant, and also "generic" cases which would cover a huge class of situations like subIn (Pred "P" [(Var "x")]) (Var "w") "x" and subIn (Pred "P" [(Var "y")]) (Var "w") "x").

Concluding remarks. The strategy should be to test each possible "flow", and partition the "input space" into equivalence classes, then test each equivalence class with some finite number of representative cases. We can cut down on the number of tests by using structural induction (in Haskell, at least). But this is the generic strategy to test the code.

Testing doesn't prove the absence of bugs, however. Whenever you encounter a bug, you should add a unit test which reflects the bug you've encountered. Run unit tests before pushing commits (I'm assuming you're using git, right?), or set up Jenkins or another "continuous integration" tool to run tests when a commit has been pushed (and email out warnings when the tests have failed).

If you really want greater confidence, consider using QuickCheck, which would automatically generate test cases to see if specified properties hold. Again, this doesn't prove the absence of bugs, but it increases the plausibility that there are no serious bugs.

Haskell proof checker for FOL with equality

There are 1 best solutions below

Related Questions in LOGIC

Related Questions in FIRST-ORDER-LOGIC

Related Questions in AUTOMATED-THEOREM-PROVING

Trending Questions

Popular # Hahtags

Popular Questions