Skip to content

Conversation

@fmontesi
Copy link
Collaborator

@fmontesi fmontesi commented Nov 7, 2025

This PR merges the developments that @ctchou and I carried out in the automata branch, #141, and #142, as discussed on Zulip (#CSLib > Question Lean structure and extends). If this is merged, the automata branch and the two mentioned PRs should be erased.

The design based on Acceptor and structure extension works pretty well. Many definitions could be removed because of the Acceptor typeclass, and some proofs got easier thanks to the combination of structure extension, record updates, and grind.

Co-authored with @ctchou.

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
@fmontesi
Copy link
Collaborator Author

fmontesi commented Nov 7, 2025

bot fix style

@chenson2018
Copy link
Collaborator

bot fix style

Does this work??? Very cool if so. I was surprised to see the bot commenting at all actually, it had been silently failing to post recently... did you do something to fix it?

Copy link
Collaborator

@chenson2018 chenson2018 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just some small style suggestions. The design seems fine to me, but I think you are more in touch with the requirements than I am.

namespace Cslib.Automata

/-- An `Acceptor` is a machine that recognises strings (lists of symbols in an alphabet). -/
class Acceptor (α : Type _) (Symbol : outParam (Type _)) where
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is mildly jarring to half follow the Mathlib convention of lowercase Greek for type variables.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was in doubt myself, it gives some goosebumps. But I couldn't come up with a better name than alpha.. maybe A for automaton or M for machine?

rw [IsRegular.iff_cslib_dfa] at h ⊢
obtain ⟨State, _, ⟨da, acc⟩, rfl⟩ := h
use State, inferInstance, ⟨da, accᶜ⟩
ext ; grind
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
ext ; grind
grind

@ctchou
Copy link
Contributor

ctchou commented Nov 7, 2025

Can we not use "Finite" in DA.Finite and NA.Finite? This conflicts with the use of "Finite" in Finite T (where T is a type), which also appears in our code. The "Finite" in {DA,NA}.Finite refers to the finite-ness of the accepted run, while the "Finite" in Finite T refers to the finite-ness of the cardinality of T. Now we end up having expressions like this:

theorem IsRegular.iff_cslib_dfa {l : Language Symbol} : l.IsRegular ↔ ∃ State : Type, ∃ _ : Finite State, ∃ dfa : Cslib.Automata.DA.Finite State Symbol, Cslib.Automata.Acceptor.language dfa = l := by

in which the two "Finite"s mean totally different things. Even more confusing is that "finite automata" is a well-established terminology where the "finite" means that at least the state space is finite and perhaps the alphabet is finite as well.

That is why I chose the name FinAcc in my PR. FinRun is another possible name.

See the 1st commit of #145:
99fb99e
which renames {DA,NA}.Finite to {DA,NA}.FinAcc.

@ctchou
Copy link
Contributor

ctchou commented Nov 7, 2025

I suggest we merge Acceptor.lean and OmegaAcceptor.lean into a single Accept.lean. They are imported together by both DA.lean and NA.lean, which are at the bottom of the import hierarchy. anyway There is no point splitting them.

@fmontesi
Copy link
Collaborator Author

fmontesi commented Nov 7, 2025

Re FinAcc: agreed, I'll switch.

Re Accept: I need to think about it a bit. My thinking was that they're conceptually separate. Maybe we'll use Acceptor somewhere else, or maybe the different substructures of DA and NA will get bigger and merit their own files.

Re all the patches by @chenson2018 : will integrate.

@ctchou
Copy link
Contributor

ctchou commented Nov 7, 2025

For the Finite -> FinAcc renaming, I've already made a patch in cslib#145 at commit: 99fb99e

fmontesi and others added 4 commits November 7, 2025 19:56
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
@fmontesi
Copy link
Collaborator Author

fmontesi commented Nov 7, 2025

For the Finite -> FinAcc renaming, I've already made a patch in cslib#145 at commit: 99fb99e

That's great, thanks. I'll merge it as soon as I'm done applying all the other suggestions.

@ctchou
Copy link
Contributor

ctchou commented Nov 7, 2025

In the commit:
4c1d61f
I added the attributes language and mem_language to {DA,NA}.FinAcc so that we can write {da,na}.language rather than having to explicitly mention Acceptor. For example, RegularLanguage.lean is improved as follows:

`
@@ -26,7 +25,7 @@ variable {Symbol : Type*}
/-- A characterization of Language.IsRegular using Cslib.DA -/
theorem IsRegular.iff_cslib_dfa {l : Language Symbol} :
l.IsRegular ↔ ∃ State : Type, ∃ _ : Finite State,

  •  ∃ dfa : Cslib.Automata.DA.FinAcc State Symbol, Cslib.Automata.Acceptor.language dfa = l := by
    
  •  ∃ dfa : Cslib.Automata.DA.FinAcc State Symbol, dfa.language = l := by
    
    constructor
    · rintro ⟨State, h_fin, ⟨tr, start, acc⟩, rfl⟩
    let dfa := Cslib.Automata.DA.FinAcc.mk {tr, start} acc
    @@ -40,7 +39,7 @@ theorem IsRegular.iff_cslib_dfa {l : Language Symbol} :
    /-- A characterization of Language.IsRegular using Cslib.NA -/
    theorem IsRegular.iff_cslib_nfa {l : Language Symbol} :
    l.IsRegular ↔ ∃ State : Type, ∃ _ : Finite State,
  •  ∃ nfa : Cslib.Automata.NA.FinAcc State Symbol, Cslib.Automata.Acceptor.language nfa = l := by
    
  •  ∃ nfa : Cslib.Automata.NA.FinAcc State Symbol, nfa.language = l := by
    
    rw [IsRegular.iff_cslib_dfa]; constructor
    · rintro ⟨State, h_fin, ⟨da, acc⟩, rfl⟩
    use State, h_fin, ⟨da.toNA, acc⟩
    `

If you like this, I can make similar changes to {DA,NA}.{Buchi,Muller} as well.

@fmontesi
Copy link
Collaborator Author

fmontesi commented Nov 7, 2025

I thought and tried to do the same. My first impression was that we get better ergonomics with making them abbrev.

But I chose to avoid it for now because it adds redundant definitions to every automaton, which kinda kills the point of having a typeclass. There were discussions about supporting dot notation for instances of a class, which would solve our problem. I'd like to take this discussion up with the Lean developers first. (I'll do that now.)

So I'm just gonna cherry-pick 99fb99e for now (the FinAcc patch). We can always revisit this later.

@fmontesi fmontesi requested a review from chenson2018 November 7, 2025 20:17
@fmontesi
Copy link
Collaborator Author

fmontesi commented Nov 7, 2025

All done.

I've been thinking and searching more about the dot notation thing from the typeclass: if we really want to use dot notation for language, Accepts, etc., couldn't we make a macro to make all these thin abbreviations?

I'm not sure we should though.. we should ask the mathlibbers about their experience with this.

@ctchou
Copy link
Contributor

ctchou commented Nov 7, 2025

In my opinion, such "redundant" definitions and theorems are very much in the style of Lean mathlib. For example, take a look at:
https://leanprover-community.github.io/mathlib4_docs/Mathlib/Computability/Language.html
One could argue that all the mem_... theorems there are "redundant", for they are nothing but a slight re-statement of the definitions. In fact, all the ..._defs are actually theorems, not the real definitions, which are mostly instances very much like in our case.

More importantly, don't you find an expression like:
∃ dfa : Cslib.Automata.DA.FinAcc State Symbol, Cslib.Automata.Acceptor.language dfa = l
just intolerable? One advantage of bundling the accepting states in the acceptor is that we don't have to mention and quantifying over them explicitly in a theorem like the one above. If we have to write Cslib.Automata.Acceptor.language dfs, what advantage have we got?

@ctchou
Copy link
Contributor

ctchou commented Nov 7, 2025

Here's an example from mathlib's Language:
`
instance : Mul (Language α) :=
⟨image2 (· ++ ·)⟩

theorem mul_def (l m : Language α) : l * m = image2 (· ++ ·) l m :=
rfl

theorem mem_mul : x ∈ l * m ↔ ∃ a ∈ l, ∃ b ∈ m, a ++ b = x :=
mem_image2
`
The {DA.NA}.FinAcc.{language,mem_language} follow exactly the same pattern.

@fmontesi
Copy link
Collaborator Author

fmontesi commented Nov 8, 2025

Not all of these examples are direct copies of the definitions/theorems in the class, but that's beyond the point: as I wrote, my problem is not with having the copied definitions for dot notation (btw, these definitions should always just be a reference to the class definitions/theorems), but more with doing so manually. I would just much rather explore a systematic way, e.g., supplying a simple annotation when one creates an instance that produces all the copies needed for dot notation automatically. I'm now discussing this on Zulip.

@fmontesi
Copy link
Collaborator Author

fmontesi commented Nov 8, 2025

Link to the Zulip discussion: #Is there code for X? > Deriving dot notation from class instances

@ctchou
Copy link
Contributor

ctchou commented Nov 8, 2025

Another response to the problem we are facing in this PR is to ask: Why is having an "acceptor" class preferable to the last design in #142, which doesn't have an "acceptor" class? What exactly is being gained from having an "acceptor" class?

BTW, this brings up another question: Why do you need two acceptor classes? Acceptor and OmegaAcceptor are identical, except for the some types. If you want to abstract the notion of "accepting" into an "acceptor" class, why not go all the way and have a single "acceptor" class and two different instantiations of it?

@fmontesi
Copy link
Collaborator Author

fmontesi commented Nov 8, 2025

  1. Code deduplication.
  2. Right you are, I could define
/-- An `Acceptor` is a machine that recognises strings (lists of symbols in an alphabet). -/
class Acceptor (α : Type _) (β : outParam (Type _)) where
  /-- Predicate that establishes whether a string `xs` is accepted. -/
  Accepts (a : α) (b : β) : Prop

Then we'd have things like:

instance : Acceptor (Buchi State Symbol) (ωSequence Symbol) where
  Accepts (a : Buchi State Symbol) (xs : ωSequence Symbol) := ∃ᶠ k in atTop, a.run xs k ∈ a.accept

which looks totally fine to me. I'll get to it, that's even more deduplication. :-)

@ctchou
Copy link
Contributor

ctchou commented Nov 8, 2025

Do we really get code deduplication? If I still have to explicitly make instances of language and mem_language in the style of the Mul example from Language, what code deduplication do I get? I asked the question on Zulip too:
https://leanprover.zulipchat.com/#narrow/channel/217875-Is-there-code-for-X.3F/topic/Deriving.20dot.20notation.20from.20class.20instances/near/554501460

@ctchou
Copy link
Contributor

ctchou commented Nov 8, 2025

Note that the Mul example from Language actually gets code deduplication, because there are many theorems already proved about image2 which can now be instantiated to produce theorems about Mul. (This actually happens in Language.) But do we have any analogues here? Are there nontrivial abstract theorems about Acceptor that can be proved and re-used by its instances? This seems to me rather unlikely, because Acceptor contains only a predicate, a set defined by the predicate, and a theorem relating the predicate and the set. What sophisticated theorems can be proved about them that are not already in mathlib?

@fmontesi
Copy link
Collaborator Author

fmontesi commented Nov 9, 2025

  1. Code deduplication.
  2. Right you are, I could define
/-- An `Acceptor` is a machine that recognises strings (lists of symbols in an alphabet). -/
class Acceptor (α : Type _) (β : outParam (Type _)) where
  /-- Predicate that establishes whether a string `xs` is accepted. -/
  Accepts (a : α) (b : β) : Prop

Then we'd have things like:

instance : Acceptor (Buchi State Symbol) (ωSequence Symbol) where
  Accepts (a : Buchi State Symbol) (xs : ωSequence Symbol) := ∃ᶠ k in atTop, a.run xs k ∈ a.accept

which looks totally fine to me. I'll get to it, that's even more deduplication. :-)

Mmh actually doesn't work quite as expected, as Acceptor.language returns a Language whereas OmegaAcceptor.language returns an OmegaLanguage... I'll keep them separate for the time being.

@ctchou
Copy link
Contributor

ctchou commented Nov 9, 2025

@fmontesi fmontesi deleted the automata branch November 17, 2025 10:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants