More fixes

Fixes
doc: standard library design principles
2026-04-08 21:24:09 +00:00 · 2025-02-06 14:14:31 +01:00 · 2025-02-06 14:00:24 +01:00 · 2025-02-06 13:57:15 +01:00
3 changed files with 97 additions and 4 deletions
--- a/doc/std/README.md
+++ b/doc/std/README.md
@@ -5,5 +5,6 @@ is part of the [Lean Language Reference](https://lean-lang.org/doc/reference/lat

 Here you will find
 * the [standard library vision document](./vision.md), including the call for contributions,
-* the [standard library style guide](./style.md), and
-* the [standard library naming conventions](./naming.md).
+* the [standard library style guide](./style.md),
+* the [standard library naming conventions](./naming.md), and
+* the [standard library API design principles and best practices](./design.md).
--- a/doc/std/design.md
+++ b/doc/std/design.md
@@ -0,0 +1,92 @@
+# Standard library API design principles and best practices
+
+This document collects loosely organized notes on how to design APIs for the Lean standard library. It supplements the [style guide](./style.md) and [naming scheme](./naming.md).
+It will also serve as a place for the standard library team to record decisions for future reference and to promote consistency across the library.
+
+## Binder types
+
+### Heuristics for selecting implicit variables
+
+*TODO*
+
+### Binder types for data-carrying typeclasses
+
+Users will occasionally attempt constructions such as
+
+```lean
+import Std.Data.HashMap
+
+open Std (HashMap)
+
+structure A (α) extends BEq α, Hashable α where
+  foo : HashMap α Nat
+
+variable [BEq α]
+
+def A.add (xs : A α) (x : α) : A α := { xs with foo := xs.foo.insert x 5 }
+```
+
+For this to work, the `HashMap.insert` function needs to resolve `BEq α` via unification rather than typeclass synthesis.
+
+So we formulate the following principle:
+
+> **If a data type `α` depends on a `Type`-valued type class, then all functions and theorems which have a parameter of type
+> `α` should take the type class as an implicit argument.**
+
+## Nested inductive types
+
+Nested inductive datatypes are indispensible in many applications, so we formulate the following principle:
+
+> **Data types which cannot be used in nested inductive types must come with variants which can.**
+
+## Executable definitions must be fast or marked as not intended for execution
+
+It is not unusual to write definitions in Lean that are purely for verification purposes and not intended to be executed, even though they are perfectly executable from the
+compiler's perspective. To avoid unwelcome surprises, the have the following guideline:
+
+> **All public definitions must be computationally efficient when used in compiled code (potentially via `@[csimp]`), unless
+> explicitly marked as not intended for execution in a documentation string (or, even better, through a suggestive name and a documentation string).**
+
+## Orphan rule for instances and attributes
+
+We adopt the following orphan rule for instances and attributes:
+
+> **If `S` is a set of standard library modules such that importing `S` imports both a type `α` and a typeclass `T`, then importing `S` must also import all instances of `T α` that are available in the standard library. Similarly, if `S` is a set of standard library modules such that importing `S` imports both a type `α` and an attribute `a`, then `S` must also import all occurrences of `attribute [a] α` in the standard library.**
+
+This is slightly weaker than "every instance for a type must be provided in the file that defines the type or in the file that defines the class" because it allows developing a concept internally before exposing everything at once in a public file.
+
+The purpose of this rule is to eliminate instances where the answer to the question "why does this not work?" is "you forgot to import a certain file", and also to eliminate cases where just importing additional modules breaks downstream code ("spooky action at a distance").
+
+## Converting between types
+
+### Conventions for `X.toY` and `Y.ofX` functions
+
+*TODO*
+
+### When to introduce coercions
+
+*TODO*
+
+## Proofs in theorem statements
+
+If a function with a proof argument (like `Option.get`) appears on the RHS of a lemma, then rewriting with that lemma may introduce the proof to the goal state without the user
+having easy access to it. Since further tactic manipulation can lead to the user having to reprove the claim thus introduced, we introduce the following guideline:
+
+> **Proofs that appear in theorem statements should be a hypothesis, single lemma application, single tactic application or very short term proof.**
+
+If a more complex proof is required, it should be factored out into a reusable lemma that is available to users.
+
+## Order of equality statements
+
+Whenever `=` or `==` appears in a statement, there is a choice as to how to order the arguments. We adopt the following convention:
+
+> **Put the "more constant" variable on the right-hand side of a comparison statement.**
+
+The idea behind this convention is to be consistent with intuitive usage (for example, most people are more likely to write `l.find? (· == 1)` than `l.find? (1 == ·))`).
+
+This extends to the case where both arguments are variables, by deciding that a search key is "more constant" than the thing being searched, as in the following example:
+
+```lean
+theorem getElem?_insert [EquivBEq α] [LawfulHashable α] {m : HashMap α β} {k a : α} {v : β} :
+    (m.insert k v)[a]? = if k == a then some v else m[a]?
+```
--- a/doc/std/naming.md
+++ b/doc/std/naming.md
@@ -57,13 +57,13 @@ for deciding where to place a theorem, and is, on occasion, a good reason to dup
 New types that are added will usually be placed in the `Std` namespace and in the `Std/` source directory, unless there are good reasons to place
 them elsewhere.

-Inside the  `Std`, all internal declarations should be `private` or else have a name component that clearly marks them as internal, preferably
+Inside the `Std` namespace, all internal declarations should be `private` or else have a name component that clearly marks them as internal, preferably
 `Internal`.


 ## Naming convention for data

-When defining data, i.e., a (possibly 0-ary) function whose codomain is not Sort u, but has type Type u for some u, it should be named in lowerCamelCase. Examples include List.append and List.isPrefixOf.
+When defining data, i.e., a (possibly 0-ary) function whose codomain is not `Sort u`, but has type `Type u` for some `u`, it should be named in lowerCamelCase. Examples include `List.append` and `List.isPrefixOf`.
 If your data is morally fully specified by its type, then use the naming procedure for theorems described below and convert the result to lower camel case.

 If your function returns an `Option`, consider adding `?` as a suffix. If your function may panic, consider adding `!` as a suffix. In many cases, there will be multiple variants of a function; one returning an option, one that may panic and possibly one that takes a proof argument.
Author	SHA1	Message	Date
Markus Himmel	56eadfdcb1	More fixes	2025-02-06 14:14:31 +01:00
Markus Himmel	c0c886817b	Fixes	2025-02-06 14:00:24 +01:00
Markus Himmel	0b56d3c3c5	doc: standard library design principles	2025-02-06 13:57:15 +01:00