URI:
        _______               __                   _______
       |   |   |.---.-..----.|  |--..-----..----. |    |  |.-----..--.--.--..-----.
       |       ||  _  ||  __||    < |  -__||   _| |       ||  -__||  |  |  ||__ --|
       |___|___||___._||____||__|__||_____||__|   |__|____||_____||________||_____|
                                                             on Gopher (inofficial)
  HTML Visit Hacker News on the Web
       
       
       COMMENT PAGE FOR:
  HTML   Lean 4: How the theorem prover works and why it's the new competitive edge in AI
       
       
        emih wrote 10 hours 12 min ago:
        Machine learning is definitely enabling writing _proofs_ within a proof
        assistant, and I'm sure it will help to make formal verification more
        viable in the future.
        
        Where it cannot (fully) replace humans, is writing the _theorems_
        themselves. A human has to check that the theorem being proven is
        actually what you were trying to prove, and this is not safe from LLM
         hallucinations. If you ask an LLM whether a bridge is safe, and it
         writes `Theorem bridge_is_safe : 1 + 1 = 2.` and proves that
         theorem, that does _not_ mean the bridge is safe...
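
         To make that concrete, here is a minimal Lean 4 sketch of the
         failure mode (the names and the load bound are made up for
         illustration): the second declaration type-checks, but its
         statement has nothing to do with the property its name advertises.

           -- The property we actually care about (hypothetical).
           def bridgeIsSafe (load : Nat) : Prop := load ≤ 1000

           -- A theorem whose *name* sounds relevant but whose
           -- *statement* isn't. Lean accepts the proof; it just
           -- doesn't mean the bridge is safe.
           theorem bridge_is_safe : 1 + 1 = 2 := rfl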
        
        The article then also makes some wild extrapolations:
        
        > We could imagine an LLM assistant for finance that provides an answer
        only if it can generate a formal proof that it adheres to accounting
        rules or legal constraints.
        
        I guess it's true because you could imagine this, hypothetically. But
        it's not going to happen, because you cannot formalize a financial or
        legal statement in a proof assistant. It's a fundamentally informal,
        real-world thing, and proof assistants are fundamentally for proving
        formal, abstract things.
       
          bwestergard wrote 10 hours 1 min ago:
          Yes.
          
          Here is another way to think of this. We all understand that the
          value of a lawyer in contract negotiations lies not only in drafting
          a document that, when fed to judge, produces the desired outcome.
          Rather, lawyers help clients (and counterparties) decide on what
          their interests consist in.
          
          Developing software is always something of a principal-agent
          coordination problem, and comes with transaction costs.
          
          Much of the time, most of us labor under the illusion that each of us
          understands our desires and interests better than any other party
          could.
       
        whattheheckheck wrote 11 hours 2 min ago:
         Why is Lean 4 so slow with the main math package (Mathlib)?
       
        upghost wrote 15 hours 4 min ago:
        So I have been doing formal specification with TLA+ using AI assistance
        and it has been very helpful AFTER I REALIZED that quite often it was
         proving things that were either trivial or irrelevant to the problem
         at hand (and not the problem itself), which was difficult to detect
         at a high level.
        
         I realize formal verification with Lean is a slightly different game,
         but if anyone here has any insight: I tend to be extremely nervous
        about a confidently presented AI "proof" because I am sure that the
        proof is proving whatever it is proving, but it's still very hard for
        me to be confident that it is proving what I need it to prove.
        
        Before the dog piling starts, I'm talking specifically about
        distributed systems scenarios where it is just not possible for a human
        to think through all the combinatorics of the liveness and safety
        properties without proof assistance.
        
         I'm open to being wrong on this, but I think the skill of writing a
         proof and understanding the proof is different from being sure it
         actually provides all the guarantees you have in mind.
        
         I feel like closing this gap is make-or-break for using AI-augmented
         proof assistance.
       
          nextos wrote 8 hours 56 min ago:
           As a heavy user of formal methods, I think refinement typing, as
           opposed to theorem proving with Lean or Isabelle, is both easier
           and more amenable to automation that doesn't get into these
           pitfalls.

           It's less powerful, but easier to break down and align with code.
           Dafny and F* are two good showcases. Less power also makes it
           faster to verify and iterate on.
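
           For a rough flavor of the idea (this is Lean's built-in subtype
           notation rather than Dafny or F* syntax, so take it only as an
           analogy): the predicate travels with the value, and ordinary type
           checking discharges part of the verification.

             -- Naturals carrying a proof of positivity.
             abbrev PosNat := { n : Nat // 0 < n }

             -- Building a value means discharging the predicate here.
             def three : PosNat := ⟨3, by decide⟩

             -- Consumers rely on the predicate without re-checking it.
             def safeDiv (a : Nat) (b : PosNat) : Nat := a / b.val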
       
          daxfohl wrote 10 hours 33 min ago:
          Yeah, even for simple things, it's surprisingly hard to write a
          correct spec. Or more to the point, it's surprisingly easy to write
          an incorrect spec and think it's correct, even under scrutiny, and so
          it turns out that you've proved the wrong thing.
          
          There was a post a few months ago demonstrating this for various
          "proved" implementations of leftpad: [1] This isn't to say it's
          useless; sometimes it helps you think about the problem more
          concretely and document it using known standards. But I'm not super
          bullish on "proofs" being the thing that keeps AI in line. First,
          like I said, they're easy to specify incorrectly, and second, they
          become incredibly hard to prove beyond a certain level of complexity.
          But I'll be interested to watch the space evolve.
          
          (Note I'm bullish on AI+Lean for math. It's just the "provably safe
          AI" or "provably correct PRs" that I'm more skeptical of).
          
  HTML    [1]: https://news.ycombinator.com/item?id=45492274
       
            fauigerzigerk wrote 9 hours 58 min ago:
            >But I'm not super bullish on "proofs" being the thing that keeps
            AI in line.
            
            But do we have anything that works better than some form of formal
            specification?
            
             We have to tell the AI what to do and we have to check whether it
             has done that. The only way to achieve that is for a person who
             knows the full context of the business problem, and who feels a
             social/legal/moral obligation not to cheat, to write a formal
             spec.
       
              daxfohl wrote 8 hours 52 min ago:
              Code review, tests, a planning step to make sure it's approaching
              things the right way, enough experience to understand the right
              size problems to give it, metrics that can detect potential
              problems, etc. Same as with a junior engineer.
              
              If you want something fully automated, then I think more
              investment in automating and improving these capabilities is the
              way to go. If you want something fully automated and 100%
              provably bug free, I just don't think that's ever going to be a
              reality.
              
              Formal specs are cryptic beyond even a small level of complexity,
              so it's hard to tell if you're even proving the right thing. And
              proving that an implementation meets those specs blows up even
              faster, to the point that a lot of stuff ends up being formally
              unprovable. It's also extremely fragile: one line code change or
              a small refactor or optimization can completely invalidate
              hundreds of proofs. AI doesn't change any of that.
              
              So that's why I'm not really bullish on that approach. Maybe
              there will be some very specific cases where it becomes useful,
              but for general business logic, I don't see it having useful
              impact.
       
          youknownothing wrote 12 hours 41 min ago:
           I had the same intuition, but you verbalised it better: the
           notion of having a definitive yes/no answer is very attractive, but
           describing what you need in such terms using natural language, which
           is inherently ambiguous... that feels like a fool's errand. That's
           why I keep thinking that LLM usage for serious things will break down
           once we get to the truly complicated things: its non-deterministic
           nature will be an unbreakable barrier. I hope I'm wrong, though.
       
          oggy wrote 13 hours 14 min ago:
          In my experience, finding the "correct" specification for a problem
          is usually very difficult for realistic systems. Generally it's
          unlikely that you'll be able to specify ALL the relevant properties
          formally. I think there's probably some facet of Kolmogorov
          complexity there; some properties probably cannot be significantly
          "compressed" in a way where the specification is significantly
          shorter and clearer than the solution.
          
          But it's still usually possible to distill a few crucial properties
          that can be specified in an "obviously correct" manner. It takes A
          LOT of work (sometimes I'd be stuck for a couple of weeks trying to
           formalize a property). But in my experience the trade-off can be
           worth it. One obvious benefit is avoiding bugs, which can be pricey
           depending on the system. But another benefit is that, even without
           formal verification, having a few clear properties can make it much
           easier to write a correct system, but, crucially, also easier to
           maintain the system as time goes by.
       
            awesomeMilou wrote 5 hours 59 min ago:
            I'm curious since I'm not a mathematician: What do you mean by
            "stuck for a couple of weeks"? I am trying to practice more
            advanced math and have stumbled over lean and such but I can't
            imagine you just sit around for weeks to ponder over a problem,
            right? What do you do all this time?
       
          johnbender wrote 14 hours 9 min ago:
           You have identified the crux of the problem: just like in
           mathematics, writing down the “right” theorem is often half or
           more of the difficulty.
          
          In the case of digital systems it can be much worse because we often
          have to include many assumptions to accommodate the complexity of our
           models. To use an example from your context, usually one is required
           to assume some kind of fairness to get anything to go through for
           systems operating concurrently, but many kinds of fairness are not
           realistic (e.g. strong fairness).
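
           For readers who haven't met these, the usual TLA-style shapes are
           roughly as follows (eliding the ⟨A⟩_v step decoration):

             % weak: eventually-always enabled => taken infinitely often
             WF(A) \triangleq \Diamond\Box\,Enabled(A)
               \Rightarrow \Box\Diamond A

             % strong: enabled infinitely often => taken infinitely often
             SF(A) \triangleq \Box\Diamond\,Enabled(A)
               \Rightarrow \Box\Diamond A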
       
          esafak wrote 14 hours 17 min ago:
          Could you write a blog post about your experience to make it more
          concrete?
       
        lo_zamoyski wrote 15 hours 26 min ago:
        This has been the approach taken by some using LLMs, even in less
        type-heavy situations. Of course, it is part of a broader tradition in
        which search is combined with verification. Genetic programming and
        related areas come to mind. Here, LLMs are search, while Lean is used
        to express constraints.
       
        SteveJS wrote 18 hours 30 min ago:
         I am using Lean as part of the prd.md description handed to a coding
         agent. The definitions in Lean compile and mean exactly what I want
         them to say.  The implementation I want to build is in Rust.
        
         HOWEVER … I hit something I now call a McLuhan vortex error: “When
         a tool, language, or abstraction smuggles in an implied purpose at odds
         with your intended goal.”
        
         Using Lean implies to the coding agent that ‘proven’ is a pervasive
         goal.

         I want to use Lean to be more articulate about the goal. Instead,
         using Lean smuggled in a difficult-to-remove implicit requirement
         that everything everywhere must be proven.
        
         This was obvious because the definitions I made in Lean imply the
         exact opposite of “everything needs to be proven”. When I say
         morphism I mean anything that is a morphism, not only things proven
         to be morphisms.
        
         A coding agent driven by an LLM needs a huge amount of structure to
         use what the math says rather than take on the implication that,
         because it is using a proof system, everything everywhere is better
         if proven.
        
         The initial way I used Lean poisoned the satisficing structure that
         unfolds during a coding pass.
       
          mycall wrote 16 hours 25 min ago:
          Could you put that distinction into the AGENTS.md file so it will
          understand and follow that nuance?
       
            SteveJS wrote 15 hours 7 min ago:
            I have several techniques queued up that attempt to counter it. 
            The distinction in Agents.md is definitely part of it.
            
            Not sure if they will work yet.
       
        Rochus wrote 19 hours 9 min ago:
         Interesting. It's essentially the same idea as in this article: [1].
        In both scenarios, the human is relieved of the burden of writing
        complex formal syntax (whether Event-B or Lean 4). The human specifies
        intent and constraints in natural language, while the LLM handles the
        work of formalization and satisfying the proof engine.
        
        But Lean 4 is significantly more rigid, granular, and foundational than
        e.g. Event-B, and they handle concepts like undefined areas and
        contradictions very differently. While both are "formal methods," they
        were built by different communities for different purposes: Lean is a
        pure mathematician's tool, while Event-B is a systems engineer's tool.
        Event-B is much more flexible, allowing an engineer (or the LLM) to
        sketch the vague, undefined contours of a system and gradually tighten
        the logical constraints through refinement.
        
        LLMs are inherently statistical interpolators. They operate beautifully
        in an Open World (where missing information is just "unknown" and can
        be guessed or left vague) and they use Non-Monotonic Reasoning (where
        new information can invalidate previous conclusions). Lean 4 operates
        strictly on the Closed World Assumption (CWA) and is brutally
        Monotonic. This is why using Lean to model things humans care about
        (business logic, user interfaces, physical environments, dynamic
        regulations) quickly hits a dead end. The physical world is full of
        exceptions, missing data, and contradictions. Lean 4 is essentially a
        return to the rigid, brittle approach of the 1980s expert systems.
        Event-B (or similar methods) provides the logical guardrails, but
        critically, it tolerates under-specification. It doesn't force the LLM
        to solve the Frame Problem or explicitly define the whole universe. It
        just checks the specific boundaries the human cares about.
        
  HTML  [1]: https://substack.com/home/post/p-184486153
       
          sinkasapa wrote 10 hours 27 min ago:
           Lean 4 uses constructive logic. If a closed world assumption
           requires that a statement that is true is also known to be true,
           and that any statement that is not known to be true is therefore
           false, then that does not hold for constructive systems. I only use
           Rocq, but I believe the type theories in Rocq and Lean 4 are
           broadly similar variations on the Calculus of Constructions, though
           there are important differences. In a constructive theory something
           is true if a proof can be constructed, but the lack of a proof does
           not entail that something is false. One needs to prove that
           something is false. In constructive type theory, one can say that
           something is true or false.
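
           A tiny Lean illustration of that last point (the statement is
           arbitrary): falsity is itself a proof obligation, since ¬P unfolds
           to P → False.

             -- "1 = 2 is false" has to be proven like anything else.
             theorem one_ne_two : (1 : Nat) ≠ 2 := by decide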
       
          skybrian wrote 13 hours 37 min ago:
          I think it’s better to think of an LLM as a very good hint engine.
          It’s good at coming up with more possibilities to consider and less
          good at making sure they work, unless it has an external system to
          test ideas on and is trained to use it. In the case of applied math,
          it’s not enough to prove theorems. It also needs to be testing
          against the real world somehow.
       
          YeGoblynQueenne wrote 14 hours 38 min ago:
          >> LLMs are inherently statistical interpolators. They operate
          beautifully in an Open World (where missing information is just
          "unknown" and can be guessed or left vague) and they use
          Non-Monotonic Reasoning (where new information can invalidate
          previous conclusions).
          
          I think LLM reasoning is not so much non-monotonic as unsound, in the
          sense that conclusions do not necessarily follow from the premises.
          New information may change conclusions but how that happens is
           anyone's guess. There's some scholarship on that, e.g. there's a
           series of papers by Subbarao Kambhampati and his students that show
           how reasoning models' "thinking" tokens don't really correspond to
           sound reasoning chains, even if they seem to improve performance
           overall [1].
          
          But it is difficult to tell what reasoning really means in LLMs. I
          believe the charitable interpretation of claims about LLM reasoning
          is that it is supposed to be informal. There is evidence both for and
          against it (e.g. much testing is in fact on formal reasoning
          problems, like math exam questions or Sokoban, but there's tests of
          informal reasoning also, e.g. on the bar exam). However, different
          interpretations are hard to square with the claims that "we don't
          understand reasoning"; not a direct quote, but I'm aware of many
          claims like that by people whose job it is to develop LLMs and that
          were made at the height of activity around reasoning models (which
          seems now to have been superseded by activity around "world models")
          [1].
          
          If LLMs are really capable of informal reasoning (I'm not necessarily
          opposed to that idea) then we really don't understand what that
          reasoning is, but it seems we're a bit stuck because to really
          understand it, we have to, well, formalise it.
          
          That said, non-monotonic reasoning is supposed to be closer to the
          way humans do informal reasoning in the real world, compared to
          classical logic, even though classical logic started entirely as an
          effort to formalise human reasoning; I mean, with Aristotle's
           Syllogisms (literally "reasonings" in Greek).
          
          ________________
          
          [1] Happy to get links if needed.
       
            whattheheckheck wrote 10 hours 57 min ago:
             Reasoning is a pattern embedded within the token patterns, but
             the LLMs are imitating reasoning by learning symbolic reasoning
             patterns.

             The very fact that it memorized the Caesar cipher ROT13 pattern
             is due to it being a Linux command, so there were examples of
             letters shifted by 13. If you asked it to figure out a different
             shift, it struggled.
            
            Now compound that across all intelligent reasoning problems in the
            entirety of human existence and you'll see how we will never have
             enough data to make AGI with this architecture and training
            paradigm.
            
            But we will have higher and higher fidelity maps of symbolic
            reasoning patterns as they suck up all the agent usage data for
            knowledge work tasks. Hopefully your tasks fall out of distribution
            of the median training data scope
       
            Rochus wrote 14 hours 18 min ago:
            My claim was not that an LLM was a formal, mathematically sound
            non-monotonic logic engine, but that the problem space is
            "non-monotonic" and "open world". The fact that an LLM is "unsound"
            and "informal" is the exact reason why my approach is necessary.
            Because LLMs are unsound, informal, and probabilistic, as you say,
            forcing them to interface with Lean 4 is a disaster. Lean 4 demands
            100% mathematical soundness, totality, and closed-world perfection
             at every step. An LLM will just hit a brick wall. Methods like
             Event-B (which I suggest in my article), however, are designed to
             tolerate under-specification. They allow the LLM to provide an
             "unsound" or incomplete sketch, and use the Proof Obligations to
             guide the LLM into a sound state via refinement.
       
          mycall wrote 16 hours 19 min ago:
           So basically you are arguing a Type Theory vs Set Theory problem:
           Foundationalism versus Engineering Refinement.  Since we read here
           of multiple use cases for LLMs on both sides of that CS divide, we
           can expect an eventual convergence of the given approaches; and if
           not that, some formal principles should emerge for when to use
           what.
       
            jojomodding wrote 7 hours 39 min ago:
             Both type theory and set theory are formal logic, so I don't see
             how that's what is being argued. Rather, there are some things
             that are formal-logicky (e.g. set theory) and many other things
             that are not (e.g. biology, where you'll always find some weird
             organism breaking your assumptions).
       
            Rochus wrote 15 hours 43 min ago:
            This discussion started already in the sixties (see e.g. the 1969
            publication by McCarthy and Hayes where they describe the "frame
            problem" as a fundamental obstacle to the attempt to model the
            dynamic world using First-Order Logic and monotonic reasoning). A
            popular attempt to "solve" this problem is the Cyc project.
            Monotonic logic is universally understood as a special, restricted
            case (a subset) of a broader non-monotonic theory.
       
              mycall wrote 15 hours 9 min ago:
               I'm familiar with Cyc but never considered it monotonic
               reasoning, though it definitely makes sense in retrospect. It
               appears Lean Machines [1] is a step ahead, combining both sides
               of the frame problem in one specific tool, although it likely
               leans towards Lean (pun intended).
              
  HTML        [1]: https://github.com/lean-machines-central/lean-machines
       
                Rochus wrote 14 hours 1 min ago:
                Thanks for the hint. The "LeanMachines" project literally seems
                to recreate Event-B constructs (contexts, machines, events, and
                refinement proof obligations) inside the Lean 4 proof assistant
                (using Lean 4 as a "host language").
       
        tokenless wrote 19 hours 23 min ago:
        > Large language models (LLMs) have astounded the world with their
        capabilities, yet they remain plagued by unpredictability and
        hallucinations – confidently outputting incorrect information. In
        high-stakes domains like finance, medicine or autonomous systems, such
        unreliability is unacceptable.
        
         This misses a point that software engineers intimately know,
         especially ones using AI tools:
        
        * Proofs are one QA tool
        
        * Unit tests, integration tests and browser automation are other tools.
        
        * Your code can have bugs because it fails a test above BUT...
        
        * You may have got the requirements wrong!
        
         Working with Claude Code you can have productive loops getting it to
         assist you in writing tests, finding bugs you hadn't spotted and
         generally hardening your code.

         It takes taste, and dev experience definitely helps (as of Jan 26).

         So I think treating proofs as the fix for hallucinations is a bit of
         barking up the wrong tree.
        
        The solution to hallucinations is careful shaping of the agent
        environment around the project to ensure quality.
        
         Proofs may be part of the QA toolkit for AI-coded projects, but
         probably rarely.
       
        kig wrote 19 hours 26 min ago:
        If you want to mess with this at home, I've been vibe coding [1] to
        plug theorem provers into an LLM call loop. It's pretty early dev but
        it does have a logic rap battle mode.
        
  HTML  [1]: https://github.com/kig/formalanswer
       
          nl wrote 16 hours 43 min ago:
          This is pretty interesting!
       
        Gehinnn wrote 19 hours 28 min ago:
         I just completed the formal verification of my bachelor thesis about
         real-time cellular automata with Lean 4, with heavy use of AI.
        
         Over the past year, I went from fully manual mode (occasionally asking
         ChatGPT some Lean questions) to fully automatic mode, where I barely
        do Lean proofs myself now (and just point AI to the original .tex
        files, in German).
        It is hard to believe how much the models and agentic harnesses
        improved over the last year.
        
        I cannot describe how much fun it is to do refactorings with AI on a
        verified Lean project!
        
         Also, it's so easy now to have visualizations and typeset documents
         generated by AI, from dependency visualizations of proofs using the
         Lean reflection API, to visual execution traces of cellular automata.
       
          svara wrote 15 hours 58 min ago:
          Can you give some examples of this? Maybe have something online? I
          would love to learn more about how to do proof driven AI assisted
          development.
       
            Gehinnn wrote 15 hours 27 min ago:
            Here is a session that I just had with AI: [1] (summarized by AI).
            
             And here are some examples of the different philosophies of AI
             proofs and human proofs: [2]

             I use VS Code in a beefy Codespace, with GitHub Copilot (Opus
             4.5). I have a single instruction file telling the AI to always
             run "lake build ./lean-file.lean" to get feedback.
            
            (disclaimer: I work on VS Code)
            
  HTML      [1]: https://gist.github.com/hediet/e3569a7c6b4b7c4f7d4a7db4101...
  HTML      [2]: https://gist.github.com/hediet/e3569a7c6b4b7c4f7d4a7db4101...
       
            nwyin wrote 15 hours 32 min ago:
            it's a bit dated, but Terence Tao has a video of formalizing a
            proof with LLMs from 9 months ago which should be illuminating
            
  HTML      [1]: https://youtu.be/zZr54G7ec7A?si=-l3jIZZzfghoqJtq
       
              Gehinnn wrote 15 hours 23 min ago:
              This is very similar to how I worked with Lean a year ago (of
              course in a much simpler domain) - mostly manual editing,
              sometimes accepting an inline completion or next edit suggestion.
              However, with agentic AI that can run lean via CLI my workflow
              changed completely and I rarely write full proofs anymore (only
              intermediate lemma statements or very high level calc
              statements).
       
                pvillano wrote 10 hours 41 min ago:
                 Do Lean proofs need to be manually reviewed?

                 Or is it that, as long as you formalize your theorem
                 correctly, a valid Lean program is an academically useful
                 proof?
                
                Are there any minimal examples of programs which claim to prove
                the thing without actually proving the thing in a meaningful
                way?
       
                  Gehinnn wrote 6 hours 29 min ago:
                  There have been bugs in Lean that allowed people to prove
                  False, from which you can prove anything (they have been
                  fixed).
                  
                   Otherwise, if you check that no custom axiom has been
                   used (via #print axioms), the proof is valid.

                   It's easy to construct such an example: prove that for
                   all a, b, c and n between 3 and 10^5, a^n = b^n + c^n
                   has no solution. The unmeaningful proof would enumerate
                   all ~10^20 cases and prove them individually. The
                   meaningful (and probably even shorter) proof would
                   derive this from Fermat's Last Theorem after proving
                   that one.
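
                   For example (trivial statement, just to show the check):

                     theorem two_plus_two : 2 + 2 = 4 := rfl

                     -- Lists the axioms (and any sorry) the proof
                     -- depends on; here it reports none.
                     #print axioms two_plus_two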
       
        xvilka wrote 20 hours 14 min ago:
         Lean is a great idea, especially the 4th version, a huge level up from
         the 3rd one, but its core is still deficient[1] in some particular
         scenarios (see an interesting discussion[2] in the Rocq (formerly Coq)
         issue tracker). Not sure if it might hinder automation with AI.
        
  HTML  [1]: https://artagnon.com/logic/leancoq
  HTML  [2]: https://github.com/rocq-prover/rocq/issues/10871
       
          joomy wrote 17 hours 43 min ago:
          The issue was a fun read, thanks for sharing.
       
        zmgsabst wrote 20 hours 24 min ago:
        The real value is in mixed mode:
        
         - Lean supports calling out as a tactic, allowing you to call LLMs or
         other AI as judges (i.e., they return a judgment about a claim; see
         the rough sketch after this list)

         - Lean can combine these judgments from external systems according to
         formal theories (i.e., normal proof mechanics)

         - an LLM engaged in higher-order reasoning can decompose its thinking
         into such logical steps of fuzzy blocks

         - this can be done recursively, e.g., having a step that replaces LLM
         judgments with further logical formulations of fuzzy judgments from
         the LLM
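
         A rough sketch of what the first point could look like with Lean 4
         metaprogramming (the external judge command and its one-word protocol
         are invented here, and admitting the goal on its say-so is
         deliberately unsound; the point is only the plumbing):

           import Lean
           open Lean Elab Tactic

           -- Hypothetical: shell out to some LLM wrapper script and,
           -- if it answers "yes", admit the current goal via sorry.
           elab "llm_judge" claim:str : tactic => do
             let out ← IO.Process.output
               { cmd := "./llm-judge", args := #[claim.getString] }
             if out.stdout.trim == "yes" then
               evalTactic (← `(tactic| sorry))
             else
               throwError "judge rejected the claim: {out.stdout}"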
        
        Something, something, sheaves.
       
        nudpiedo wrote 21 hours 6 min ago:
         I like a lot of the idea behind such theorem provers; however, I
         always have issues with them producing code compatible with other
         languages.

         This happened to me with Idris and many others: I took some time to
         learn the basics, wrote some examples, and then the FFI was a joke or
         the code generators for JavaScript were absolutely useless.
        
        So no way of leveraging an existing ecosystem.
       
          densh wrote 18 hours 43 min ago:
           Apart from prioritizing FFI (like Java/Scala, Erlang/Elixir), the
           other two easy ways to bootstrap an integration of an obscure or
           relatively new programming language are to focus on RPC (FFI through
           the network) or file input-output (parse and produce well-known file
           formats to integrate with other tools at the Bash level).
          
          I find it very surprising that nobody tried to make something like
          gRPC as an interop story for a new language, with an easy way to
          write impure "extensions" in other languages and let your
          pure/formal/dependently typed language implement the rest purely
          through immutable message passing over gRPC boundary. Want file i/o?
          Implement gRPC endpoint in Go, and let your language send read/write
          messages to it without having to deal with antiquated and memory
          unsafe Posix layer.
       
          seanhunter wrote 20 hours 59 min ago:
           Lean has standard C ABI FFI support.
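
           At its simplest it looks roughly like this (identifiers invented;
           the C side has to export a matching symbol and get linked into the
           build):

             -- C side, in a separate file compiled into the build:
             --   uint32_t my_add(uint32_t a, uint32_t b) { return a + b; }

             @[extern "my_add"]
             opaque myAdd : UInt32 → UInt32 → UInt32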
          
  HTML    [1]: https://lean-lang.org/doc/reference/latest/Run-Time-Code/For...
       
            nudpiedo wrote 20 hours 18 min ago:
            Literally the first line of the link:
            
            “The current interface was designed for internal use in Lean and
            should be considered unstable. It will be refined and extended in
            the future.“
            
             My point is that in order to use these theorem provers you really
             gotta be sure you need them, otherwise interaction with an
             external ecosystem might be a dep/compilation nightmare or a
             bridge over TCP just to use libraries.
       
        throwaway2027 wrote 21 hours 23 min ago:
         I think I saw Terence Tao use a formal proof language but I don't
         remember if it was Lean. I'm not familiar with it, but I do agree that
         moving to provable languages could improve AI; but isn't the basis
         just having some immutable, rigorous set of tests, which could be
         replicated in "regular" programming languages?
       
          anon291 wrote 15 hours 16 min ago:
          A theorem prover is a dependently typed functional programming
          language. If you can generate a term with a particular type then the
          theorem is true. There is no testing involved.
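
           Concretely, in Lean 4 (trivial examples): the type after the colon
           is the statement, the term after := is the proof, and the kernel
           only checks that the term has that type.

             -- The proposition is a type; `rfl` is a term inhabiting it.
             example : 2 + 2 = 4 := rfl

             -- A quantified statement is a dependent function type.
             example : ∀ n : Nat, n + 0 = n := fun n => Nat.add_zero n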
       
            nextos wrote 1 hour 52 min ago:
            There are many classical theorem provers that use simple type
            systems, e.g. Isabelle. Mizar is even weakly typed.
       
            cess11 wrote 14 hours 20 min ago:
            I'm not so sure, because Prolog.
       
              anon291 wrote 7 hours 0 min ago:
               Prolog is not a theorem prover. Theorem provers are total
               (i.e. not Turing complete).
       
          gaogao wrote 20 hours 53 min ago:
           Yes, though often the easiest way to replicate it in regular
           programming languages is to translate that language to Lean or
           another ITP, though auto-active verification like Verus is used for
           Rust pretty successfully.

           Python and C, though, have enough nasal demons and undefined
           behavior that it's a huge pain to verify things about them, since
           some random other thread can drive by and modify your memory.
       
          iNic wrote 20 hours 58 min ago:
          You can think of theorem provers as really crazy type checkers. It's
          not just a handful of tests that have to run, but more like a program
          that has to compile.
       
            seanhunter wrote 20 hours 44 min ago:
            Yes exactly.  There is this thing called the “Curry-Howard
            Isomorphism” which (as I understand it) says that propositions in
            formal logic are isomorphic to types.  So the “calculus of
            constructions” is a typed lambda calculus based on this that
            makes it possible for you to state some proposition as a type and
            if you can instantiate that type then what you have done is
            isomorphic to proving the proposition.    Most proof assistants (and
            certainly Lean) are based on this.
            
             So although Lean 4 is a programming language that people can use
             to write “normal” programs, when you use it as a proof
             assistant this is what you are doing: stating propositions and
             then using a combination of a (very extensive) library of previous
             results, your own ingenuity with the builtins of the language,
             and (in my experience anyway) a bunch of brute force to
             instantiate the type, thus proving the proposition.
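
             A small Lean 4 example of what instantiating the type means (the
             proposition is arbitrary): the proof term is literally a function
             that swaps the two halves of a pair.

               example {P Q : Prop} : P ∧ Q → Q ∧ P :=
                 fun h => ⟨h.2, h.1⟩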
       
              lo_zamoyski wrote 15 hours 8 min ago:
              Technically, it isn't an isomorphism (the word is abused very
              often), and there is no fixed, general syntactic correspondence.
              However, in the case of Lean, we can establish a correspondence
              between its dependent type system and intuitionistic higher-order
              predicate logic.
       
          seanhunter wrote 21 hours 1 min ago:
           It was Lean 4.  In fact he has made Lean 4 versions of all of the
           proofs in his Analysis I textbook available here [1]. He also has
           blogged about how he uses Lean for his research.
          
          Edit to add: Looking at that repo, one thing I like (but others may
          find infuriating idk) is that where in the text he leaves certain
          proofs as exercises for the reader, in the repo he turns those into
          “sorry”s, so you can fork the repo and have a go at proving those
           things in Lean yourself.
          
           If you have some proposition which you need to use as the basis of
           further work but you haven’t completed a formal proof of yet, in
           Lean you can just state the proposition with the proof being
           “sorry”.  Lean will then proceed as though that proposition had
           been proved, except that it will give you a warning saying that you
           have a sorry.  For something to count as proved in Lean you have to
           have it done without any “sorry”s.
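
           For instance (the statement is chosen arbitrarily):

             -- Lean accepts this and lets later results build on it, but
             -- warns that the declaration uses sorry, so nothing is
             -- actually proved yet.
             theorem still_todo (n : Nat) : n + 0 = n := by
               sorry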
          
  HTML    [1]: https://github.com/teorth/analysis
  HTML    [2]: https://lean-lang.org/doc/reference/latest/Tactic-Proofs/Tac...
       
       
   DIR <- back to front page