A theorem prover and a counterexample searcher — two complementary AI pipelines for attacking open mathematical problems.
Decomposes a conjecture into a DAG of lemmas. Proves each lemma via recursive decomposition with backtracking. Each proof is independently audited by GPT and Claude.
Designs a computational search algorithm to find counterexamples. Three independent reviewers stress-test the algorithm. Feasibility gate prevents outputting unrunnable code.
GPT + Claude dual provider. API or CLI mode. DuckDuckGo literature search. Per-phase checkpointing with resume. Full I/O logging.
Web search for known approaches (Phase 0), structured into categories (Phase 0.5). GPT decomposes the conjecture into a DAG of lemmas with explicit dependencies (Phase 1). Lemmas at the same dependency level are proved in parallel.
Each lemma: decide "direct proof" or "decompose into sub-lemmas." Decomposition generates 3 strategies; if one fails, backtrack and try the next.
GPT self-audits for internal consistency; Claude cross-audits independently. Both must accept. Failed audits trigger retry with feedback.
Searches for known bounds, prior computational attempts, and construction techniques. Derives mathematical constraints any valid counterexample must satisfy.
Three independent reviewers (correctness, optimality, adversarial) run sequentially. Each can trigger an algorithm revision — the version increments until all pass.
Estimates runtime and search space size. If infeasible, redesigns the algorithm and re-estimates — prevents outputting code that can't finish.