fix: remove accidental type monomorphism in Id.run_seqLeft (#12936)

This PR fixes `Id.run_seqLeft` and `Id.run_seqRight` to apply when the two monadic computations have different result types.
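
A minimal sketch of the situation the fix targets, assuming the lemmas now cover operands with distinct result types. The examples below are illustrative and not copied from the PR; they use `Nat` and `String` as arbitrary, hypothetical result types and are closed by `rfl` rather than by the lemmas themselves.

```lean
-- Illustrative only: the two `Id` computations have different result types.
-- A lemma stated for `x y : Id α` could not rewrite goals of this shape.
example (x : Id Nat) (y : Id String) : (x <* y).run = x.run := rfl
example (x : Id Nat) (y : Id String) : (x *> y).run = y.run := rfl
```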
fix: add missing pp-spaces in grind_pattern (#11686)
2026-03-17 18:34:06 +00:00 · 2026-03-17 06:43:51 +00:00 · 2026-03-17 04:15:02 +00:00 · 2026-03-17 04:14:37 +00:00 · 2026-03-17 03:55:21 +00:00 · 2026-03-17 02:37:55 +00:00
8439 changed files with 62227 additions and 14281 deletions
--- a/.claude/CLAUDE.md
+++ b/.claude/CLAUDE.md
@@ -1,29 +1,42 @@
-To build Lean you should use `make -j -C build/release`.
+(In the following, use `sysctl -n hw.logicalcpu` instead of `nproc` on macOS)
+
+To build Lean you should use `make -j$(nproc) -C build/release`.

 ## Running Tests

-See `doc/dev/testing.md` for full documentation. Quick reference:
+See `tests/README.md` for full documentation. Quick reference:

 ```bash
 # Full test suite (use after builds to verify correctness)
-make -j -C build/release test ARGS="-j$(nproc)"
+CTEST_PARALLEL_LEVEL="$(nproc)" CTEST_OUTPUT_ON_FAILURE=1 \
+make -C build/release -j "$(nproc)" test

 # Specific test by name (supports regex via ctest -R)
-make -j -C build/release test ARGS='-R grind_ematch --output-on-failure'
+CTEST_PARALLEL_LEVEL="$(nproc)" CTEST_OUTPUT_ON_FAILURE=1 \
+make -C build/release -j "$(nproc)" test ARGS='-R grind_ematch'

 # Rerun only previously failed tests
-make -j -C build/release test ARGS='--rerun-failed --output-on-failure'
+CTEST_PARALLEL_LEVEL="$(nproc)" CTEST_OUTPUT_ON_FAILURE=1 \
+make -C build/release -j "$(nproc)" test ARGS='--rerun-failed'

-# Single test from tests/lean/run/ (quick check during development)
-cd tests/lean/run && ./test_single.sh example_test.lean
-
-# ctest directly (from stage1 build dir)
-cd build/release/stage1 && ctest -j$(nproc) --output-on-failure --timeout 300
+# Single test from tests/foo/bar/ (quick check during development)
+CTEST_PARALLEL_LEVEL="$(nproc)" CTEST_OUTPUT_ON_FAILURE=1 \
+make -C build/release -j "$(nproc)" test ARGS='-R testname'
 ```

-The full test suite includes `tests/lean/`, `tests/lean/run/`, `tests/lean/interactive/`,
-`tests/compiler/`, `tests/pkg/`, Lake tests, and more. Using `make test` or `ctest` runs
-all of them; `test_single.sh` in `tests/lean/run/` only covers that one directory.
+## Testing stage 2
+
+When requested to test stage 2, build it as follows:
+```
+make -C build/release stage2 -j$(nproc)
+```
+Stage 2 is *not* automatically invalidated by changes to `src/`. This allows for faster iteration
+when fixing a specific file in the stage 2 build. However, to invalidate any files that already
+passed the stage 2 build, as well as for final validation,
+```
+make -C build/release/stage2 clean-stdlib
+```
+must be run manually before building.

 ## New features

@@ -32,8 +45,6 @@ When asked to implement new features:
 * write comprehensive tests first (expecting that these will initially fail)
 * and then iterate on the implementation until the tests pass.

-All new tests should go in `tests/lean/run/`. These tests don't have expected output; we just check there are no errors. You should use `#guard_msgs` to check for specific messages.
-
 ## Success Criteria

 *Never* report success on a task unless you have verified both a clean build without errors, and that the relevant tests pass.
@@ -41,9 +52,13 @@ All new tests should go in `tests/lean/run/`. These tests don't have expected ou
 ## Build System Safety

 **NEVER manually delete build directories** (build/, stage0/, stage1/, etc.) even when builds fail.
-- ONLY use the project's documented build command: `make -j -C build/release`
+- ONLY use the project's documented build command: `make -j$(nproc) -C build/release`
 - If a build is broken, ask the user before attempting any manual cleanup

+## stage0 Is a Copy of src
+
+**Never manually edit files under `stage0/`.** The `stage0/` directory is a snapshot of `src/` produced by `make update-stage0`. To change anything in stage0 (CMakeLists.txt, C++ source, etc.), edit the corresponding file in `src/` and let `update-stage0` propagate it.
+
 ## LSP and IDE Diagnostics

 After rebuilding, LSP diagnostics may be stale until the user interacts with files. Trust command-line test results over IDE diagnostics.
@@ -59,7 +74,7 @@ Follow the commit convention in `doc/dev/commit_convention.md`.
 **Title format:** `<type>: <subject>` where type is one of: `feat`, `fix`, `doc`, `style`, `refactor`, `test`, `chore`, `perf`.
 Subject should use imperative present tense ("add" not "added"), no capitalization, no trailing period.

-**Body format:** The first paragraph must start with "This PR". This paragraph is automatically incorporated into release notes. Use imperative present tense. Include motivation and contrast with previous behavior when relevant.
+**Body format:** The first paragraph must start with "This PR". This paragraph is automatically incorporated into release notes. Use imperative present tense. Include motivation and contrast with previous behavior when relevant. Do NOT use markdown headings (`## Summary`, `## Test plan`, etc.) in PR bodies.

 Example:
 ```
@@ -84,6 +99,27 @@ leading quantifiers are stripped when creating a pattern.

 If you're unsure which label applies, it's fine to omit the label and let reviewers add it.

+## Module System for `src/` Files
+
+Files in `src/Lean/`, `src/Std/`, and `src/lake/Lake/` must have both `module` and `prelude` (CI enforces `^prelude$` on its own line). With `prelude`, nothing is auto-imported — you must explicitly import `Init.*` modules for standard library features. Check existing files in the same directory for the pattern, e.g.:
+
+```lean
+module
+
+prelude
+import Init.While  -- needed for while/repeat
+import Init.Data.String.TakeDrop  -- needed for String.startsWith
+public import Lean.Compiler.NameMangling  -- public if types are used in public signatures
+```
+
+Files outside these directories (e.g. `tests/`, `script/`) use just `module`.
+
 ## CI Log Retrieval

 When CI jobs fail, investigate immediately - don't wait for other jobs to complete. Individual job logs are often available even while other jobs are still running. Try `gh run view <run-id> --log` or `gh run view <run-id> --log-failed`, or use `gh run view <run-id> --job=<job-id>` to target the specific failed job. Sleeping is fine when asked to monitor CI and no failures exist yet, but once any job fails, investigate that failure immediately.
+
+## Copyright Headers
+
+New files require a copyright header. To get the year right, always run `date +%Y` rather than relying on memory. The copyright holder should be the author or their current employer — check other recent files by the same author in the repository to determine the correct entity (e.g., "Lean FRO, LLC", "Amazon.com, Inc. or its affiliates").
+
+Test files (in `tests/`) do not need copyright headers.
--- a/.claude/commands/release.md
+++ b/.claude/commands/release.md
@@ -103,6 +103,15 @@ Every time you run `release_checklist.py`, you MUST:
 This summary should be provided EVERY time you run the checklist, not just after creating new PRs.
 The user needs to see the complete picture of what's waiting for review.

+## Checking PR Status When Asked
+
+When the user asks for "status" or you need to report on PRs between checklist runs:
+- **ALWAYS check actual PR state** using `gh pr view <number> --repo <repo> --json state,mergedAt`
+- Do NOT rely on cached CI results or previous checklist output
+- The user may have merged PRs since your last check
+- Report which PRs are MERGED, which are OPEN with CI status, and which are still pending
+- After discovering merged PRs, rerun `release_checklist.py` to advance the release process
+
 ## Nightly Infrastructure

 The nightly build system uses branches and tags across two repositories:
@@ -112,6 +121,42 @@ The nightly build system uses branches and tags across two repositories:

 When a nightly succeeds with mathlib, all three should point to the same commit. Don't confuse these: branches are in the main lean4 repo, dated tags are in lean4-nightly.

+## CI Failures: Investigate Immediately
+
+**CRITICAL: If the checklist reports `❌ CI: X check(s) failing` for any PR, investigate immediately.**
+
+Do NOT:
+- Report it as "CI in progress" or "some checks pending"
+- Wait for the remaining checks to finish before investigating
+- Assume it's a transient failure without checking
+
+DO:
+1. Run `gh pr checks <number> --repo <owner>/<repo>` to see which specific check failed
+2. Run `gh run view <run-id> --repo <owner>/<repo> --log-failed` to see the failure output
+3. Diagnose the failure and report clearly to the user: what failed and why
+4. Propose a fix if one is obvious (e.g., subverso version mismatch, transient elan install error)
+
+The checklist now distinguishes `❌ X check(s) failing, Y still in progress` from `🔄 Y check(s) in progress`.
+Any `❌` in CI status requires immediate investigation — do not move on.
+
+## Waiting for CI or Merges
+
+Use `gh pr checks --watch` to block until a PR's CI checks complete (no polling needed).
+Run these as background bash commands so you get notified when they finish:
+
+```bash
+# Watch CI, then check merge state
+gh pr checks <number> --repo <owner>/<repo> --watch && gh pr view <number> --repo <owner>/<repo> --json state --jq '.state'
+```
+
+For multiple PRs, launch one background command per PR in parallel. When each completes,
+you'll be notified automatically via a task-notification. Do NOT use sleep-based polling
+loops — `--watch` is event-driven and exits as soon as checks finish.
+
+Note: `gh pr checks --watch` exits as soon as ALL checks complete (pass or fail). If some checks
+fail while others are still running, `--watch` will continue until everything settles, then exit
+with a non-zero code. So a background `--watch` finishing = all checks done; check which failed.
+
 ## Error Handling

 **CRITICAL**: If something goes wrong or a command fails:
--- a/.claude/settings.json
+++ b/.claude/settings.json
@@ -0,0 +1,13 @@
+{
+  "extraKnownMarketplaces": {
+    "leanprover": {
+      "source": {
+        "source": "github",
+        "repo": "leanprover/skills"
+      }
+    }
+  },
+  "enabledPlugins": {
+    "lean@leanprover": true
+  }
+}
--- a/.claude/skills/profiling/SKILL.md
+++ b/.claude/skills/profiling/SKILL.md
@@ -0,0 +1,26 @@
+---
+name: profiling
+description: Profile Lean programs with demangled names using samply and Firefox Profiler. Use when the user asks to profile a Lean binary or investigate performance.
+allowed-tools: Bash, Read, Glob, Grep
+---
+
+# Profiling Lean Programs
+
+Full documentation: `script/PROFILER_README.md`.
+
+## Quick Start
+
+```bash
+script/lean_profile.sh ./build/release/stage1/bin/lean some_file.lean
+```
+
+Requires `samply` (`cargo install samply`) and `python3`.
+
+## Agent Notes
+
+- The pipeline is interactive (serves to browser at the end). When running non-interactively, run the steps manually instead of using the wrapper script.
+- The three steps are: `samply record --save-only`, `symbolicate_profile.py`, then `serve_profile.py`.
+- `lean_demangle.py` works standalone as a stdin filter (like `c++filt`) for quick name lookups.
+- The `--raw` flag on `lean_demangle.py` gives exact demangled names without postprocessing (keeps `._redArg`, `._lam_0` suffixes as-is).
+- Use `PROFILE_KEEP=1` to keep the temp directory for later inspection.
+- The demangled profile is a standard Firefox Profiler JSON. Function names live in `threads[i].stringArray`, indexed by `threads[i].funcTable.name`.
--- a/.claude/skills/zulip-extract/SKILL.md
+++ b/.claude/skills/zulip-extract/SKILL.md
@@ -0,0 +1,17 @@
+---
+name: zulip-extract
+description: Extract Zulip thread HTML dumps into readable plain text. Use when the user provides a Zulip HTML file or asks to parse/read/convert/summarize a Zulip thread.
+---
+
+# Zulip Thread Extractor
+
+Run the bundled script to convert a Zulip HTML page dump into plain text.
+
+## Usage
+```bash
+python3 .claude/skills/zulip-extract/zulip_thread_extract.py input.html output.txt
+```
+
+The script has zero dependencies beyond Python 3 stdlib.
+It extracts sender, timestamp, message content (with code blocks,
+links, quotes, mentions), and reactions.
--- a/.claude/skills/zulip-extract/zulip_thread_extract.py
+++ b/.claude/skills/zulip-extract/zulip_thread_extract.py
@@ -0,0 +1,313 @@
+#!/usr/bin/env python3
+"""
+Convert a Zulip HTML page dump to plain text (the visible message thread).
+
+Zero external dependencies — uses only the Python standard library.
+
+Usage:
+    python3 zulip_thread_extract.py input.html [output.txt]
+"""
+
+import sys
+import re
+from html.parser import HTMLParser
+from html import unescape
+
+
+# ---------------------------------------------------------------------------
+# Minimal DOM built from stdlib HTMLParser
+# ---------------------------------------------------------------------------
+
+class Node:
+    """A lightweight DOM node."""
+    __slots__ = ('tag', 'attrs', 'children', 'parent', 'text')
+
+    def __init__(self, tag='', attrs=None):
+        self.tag = tag
+        self.attrs = dict(attrs) if attrs else {}
+        self.children = []
+        self.parent = None
+        self.text = ''  # for text nodes only (tag == '')
+
+    @property
+    def cls(self):
+        return self.attrs.get('class', '')
+
+    def has_class(self, c):
+        return c in self.cls.split()
+
+    def find_all(self, tag=None, class_=None):
+        """Depth-first search for matching descendants."""
+        for child in self.children:
+            if child.tag == '':
+                continue
+            match = True
+            if tag and child.tag != tag:
+                match = False
+            if class_ and not child.has_class(class_):
+                match = False
+            if match:
+                yield child
+            yield from child.find_all(tag, class_)
+
+    def find(self, tag=None, class_=None):
+        return next(self.find_all(tag, class_), None)
+
+    def get_text(self):
+        if self.tag == '':
+            return self.text
+        return ''.join(c.get_text() for c in self.children)
+
+
+class DOMBuilder(HTMLParser):
+    """Build a minimal DOM tree from HTML."""
+
+    VOID_ELEMENTS = frozenset([
+        'area', 'base', 'br', 'col', 'embed', 'hr', 'img', 'input',
+        'link', 'meta', 'param', 'source', 'track', 'wbr',
+    ])
+
+    def __init__(self):
+        super().__init__()
+        self.root = Node('root')
+        self._cur = self.root
+
+    def handle_starttag(self, tag, attrs):
+        node = Node(tag, attrs)
+        node.parent = self._cur
+        self._cur.children.append(node)
+        if tag not in self.VOID_ELEMENTS:
+            self._cur = node
+
+    def handle_endtag(self, tag):
+        # Walk up to find the matching open tag (tolerates misnesting)
+        n = self._cur
+        while n and n.tag != tag and n.parent:
+            n = n.parent
+        if n and n.parent:
+            self._cur = n.parent
+
+    def handle_data(self, data):
+        t = Node()
+        t.text = data
+        t.parent = self._cur
+        self._cur.children.append(t)
+
+    def handle_entityref(self, name):
+        self.handle_data(unescape(f'&{name};'))
+
+    def handle_charref(self, name):
+        self.handle_data(unescape(f'&#{name};'))
+
+
+def parse_html(path):
+    with open(path, 'r', encoding='utf-8') as f:
+        html = f.read()
+    builder = DOMBuilder()
+    builder.feed(html)
+    return builder.root
+
+
+# ---------------------------------------------------------------------------
+# Content extraction
+# ---------------------------------------------------------------------------
+
+SKIP_CLASSES = {
+    'message_controls', 'message_length_controller',
+    'code-buttons-container', 'copy_codeblock', 'code_external_link',
+    'message_edit_notice', 'edit-notifications',
+}
+
+def should_skip(node):
+    return bool(SKIP_CLASSES & set(node.cls.split()))
+
+
+def extract_content(node):
+    """Recursively convert a message_content node into readable text."""
+    parts = []
+    for child in node.children:
+        # Text node
+        if child.tag == '':
+            parts.append(child.text)
+            continue
+
+        if should_skip(child):
+            continue
+
+        cls_set = set(child.cls.split())
+
+        # Code block wrappers  (div.codehilite / div.zulip-code-block)
+        if child.tag == 'div' and ({'codehilite', 'zulip-code-block'} & cls_set):
+            code = child.find('code')
+            lang = child.attrs.get('data-code-language', '')
+            text = code.get_text() if code else child.get_text()
+            parts.append(f'\n```{lang}\n{text}```\n')
+            continue
+
+        # <pre> (bare code blocks without wrapper div)
+        if child.tag == 'pre':
+            code = child.find('code')
+            text = code.get_text() if code else child.get_text()
+            parts.append(f'\n```\n{text}```\n')
+            continue
+
+        # Inline <code>
+        if child.tag == 'code':
+            parts.append(f'`{child.get_text()}`')
+            continue
+
+        # Paragraph
+        if child.tag == 'p':
+            inner = extract_content(child)
+            parts.append(f'\n{inner}\n')
+            continue
+
+        # Line break
+        if child.tag == 'br':
+            parts.append('\n')
+            continue
+
+        # Links
+        if child.tag == 'a':
+            href = child.attrs.get('href', '')
+            text = child.get_text().strip()
+            if href and not href.startswith('#') and text:
+                parts.append(f'[{text}]({href})')
+            else:
+                parts.append(text)
+            continue
+
+        # Block quotes
+        if child.tag == 'blockquote':
+            bq = extract_content(child).strip()
+            parts.append('\n' + '\n'.join(f'> {l}' for l in bq.split('\n')) + '\n')
+            continue
+
+        # Lists
+        if child.tag in ('ul', 'ol'):
+            for i, li in enumerate(c for c in child.children if c.tag == 'li'):
+                pfx = f'{i+1}.' if child.tag == 'ol' else '-'
+                parts.append(f'\n{pfx} {extract_content(li).strip()}')
+            parts.append('\n')
+            continue
+
+        # User mentions
+        if 'user-mention' in cls_set:
+            parts.append(f'@{child.get_text().strip().lstrip("@")}')
+            continue
+
+        # Emoji
+        if 'emoji' in cls_set:
+            alt = child.attrs.get('alt', '') or child.attrs.get('title', '')
+            if alt:
+                parts.append(alt)
+            continue
+
+        # Recurse into everything else
+        parts.append(extract_content(child))
+
+    return ''.join(parts)
+
+
+# ---------------------------------------------------------------------------
+# Thread extraction
+# ---------------------------------------------------------------------------
+
+def extract_thread(html_path, output_path=None):
+    root = parse_html(html_path)
+
+    # Find the message list
+    msg_list = root.find('div', class_='message-list')
+    if not msg_list:
+        print("ERROR: Could not find message list.", file=sys.stderr)
+        sys.exit(1)
+
+    # Topic header
+    header = msg_list.find('div', class_='message_header')
+    stream_name = topic_name = date_str = ''
+    if header:
+        el = header.find('span', class_='message-header-stream-name')
+        if el: stream_name = el.get_text().strip()
+        el = header.find('span', class_='stream-topic-inner')
+        if el: topic_name = el.get_text().strip()
+        el = header.find('span', class_='recipient_row_date')
+        if el:
+            tr = el.find('span', class_='timerender-content')
+            if tr:
+                date_str = tr.attrs.get('data-tippy-content', '') or tr.get_text().strip()
+
+    # Messages
+    messages = []
+    for row in msg_list.find_all('div', class_='message_row'):
+        if not row.has_class('messagebox-includes-sender'):
+            continue
+
+        msg = {}
+
+        sn = row.find('span', class_='sender_name_text')
+        if sn:
+            un = sn.find('span', class_='user-name')
+            msg['sender'] = (un or sn).get_text().strip()
+
+        tm = row.find('a', class_='message-time')
+        if tm:
+            msg['time'] = tm.get_text().strip()
+
+        cd = row.find('div', class_='message_content')
+        if cd:
+            text = extract_content(cd)
+            text = re.sub(r'\n{3,}', '\n\n', text).strip()
+            msg['content'] = text
+
+        # Reactions
+        reactions = []
+        for rx in row.find_all('div', class_='message_reaction'):
+            em = rx.find('div', class_='emoji_alt_code')
+            if em:
+                reactions.append(em.get_text().strip())
+            else:
+                img = rx.find(tag='img')
+                if img:
+                    reactions.append(img.attrs.get('alt', ''))
+            cnt = rx.find('span', class_='message_reaction_count')
+            if cnt and reactions:
+                c = cnt.get_text().strip()
+                if c and c != '1':
+                    reactions[-1] += f' x{c}'
+        if reactions:
+            msg['reactions'] = reactions
+
+        if msg.get('content') or msg.get('sender'):
+            messages.append(msg)
+
+    # Format
+    lines = [
+        '=' * 70,
+        f'# {stream_name} > {topic_name}',
+    ]
+    if date_str:
+        lines.append(f'# Started: {date_str}')
+    lines += [f'# Messages: {len(messages)}', '=' * 70, '']
+
+    for msg in messages:
+        lines.append(f'--- {msg.get("sender","?")}  [{msg.get("time","")}] ---')
+        lines.append(msg.get('content', ''))
+        if msg.get('reactions'):
+            lines.append(f'  Reactions: {", ".join(msg["reactions"])}')
+        lines.append('')
+
+    result = '\n'.join(lines)
+    if output_path:
+        with open(output_path, 'w', encoding='utf-8') as f:
+            f.write(result)
+        print(f"Written {len(messages)} messages to {output_path}")
+    else:
+        print(result)
+
+
+if __name__ == '__main__':
+    if len(sys.argv) < 2:
+        print(f"Usage: {sys.argv[0]} input.html [output.txt]")
+        sys.exit(1)
+    extract_thread(sys.argv[1], sys.argv[2] if len(sys.argv) > 2 else None)
+
--- a/.gitattributes
+++ b/.gitattributes
@@ -5,9 +5,3 @@ stage0/** binary linguist-generated
 # The following file is often manually edited, so do show it in diffs
 stage0/src/stdlib_flags.h -binary -linguist-generated
 doc/std/grove/GroveStdlib/Generated/** linguist-generated
-# These files should not have line endings translated on Windows, because
-# it throws off parser tests. Later lines override earlier ones, so the
-# runner code is still treated as ordinary text.
-tests/lean/docparse/* eol=lf
-tests/lean/docparse/*.lean eol=auto
-tests/lean/docparse/*.sh eol=auto
--- a/.github/workflows/awaiting-manual.yml
+++ b/.github/workflows/awaiting-manual.yml
@@ -2,16 +2,19 @@ name: Check awaiting-manual label

 on:
  merge_group:
-  pull_request:
+  pull_request_target:
    types: [opened, synchronize, reopened, labeled, unlabeled]

+permissions:
+  pull-requests: read
+
 jobs:
  check-awaiting-manual:
    runs-on: ubuntu-latest
    steps:
      - name: Check awaiting-manual label
        id: check-awaiting-manual-label
-        if: github.event_name == 'pull_request'
+        if: github.event_name == 'pull_request_target'
        uses: actions/github-script@v8
        with:
          script: |
@@ -28,7 +31,7 @@ jobs:
            }
      
      - name: Wait for manual compatibility
-        if: github.event_name == 'pull_request' && steps.check-awaiting-manual-label.outputs.awaiting == 'true'
+        if: github.event_name == 'pull_request_target' && steps.check-awaiting-manual-label.outputs.awaiting == 'true'
        run: |
          echo "::notice title=Awaiting manual::PR is marked 'awaiting-manual' but neither 'breaks-manual' nor 'builds-manual' labels are present."
          echo "This check will remain in progress until the PR is updated with appropriate manual compatibility labels."
--- a/.github/workflows/awaiting-mathlib.yml
+++ b/.github/workflows/awaiting-mathlib.yml
@@ -2,16 +2,19 @@ name: Check awaiting-mathlib label

 on:
  merge_group:
-  pull_request:
+  pull_request_target:
    types: [opened, synchronize, reopened, labeled, unlabeled]

+permissions:
+  pull-requests: read
+
 jobs:
  check-awaiting-mathlib:
    runs-on: ubuntu-latest
    steps:
      - name: Check awaiting-mathlib label
        id: check-awaiting-mathlib-label
-        if: github.event_name == 'pull_request'
+        if: github.event_name == 'pull_request_target'
        uses: actions/github-script@v8
        with:
          script: |
@@ -28,7 +31,7 @@ jobs:
            }
      
      - name: Wait for mathlib compatibility
-        if: github.event_name == 'pull_request' && steps.check-awaiting-mathlib-label.outputs.awaiting == 'true'
+        if: github.event_name == 'pull_request_target' && steps.check-awaiting-mathlib-label.outputs.awaiting == 'true'
        run: |
          echo "::notice title=Awaiting mathlib::PR is marked 'awaiting-mathlib' but neither 'breaks-mathlib' nor 'builds-mathlib' labels are present."
          echo "This check will remain in progress until the PR is updated with appropriate mathlib compatibility labels."
--- a/.github/workflows/build-template.yml
+++ b/.github/workflows/build-template.yml
@@ -49,7 +49,7 @@ jobs:
      LSAN_OPTIONS: max_leaks=10
      # somehow MinGW clang64 (or cmake?) defaults to `g++` even though it doesn't exist
      CXX: c++
-      MACOSX_DEPLOYMENT_TARGET: 10.15
+      MACOSX_DEPLOYMENT_TARGET: 11.0
    steps:
      - name: Install Nix
        uses: DeterminateSystems/nix-installer-action@main
@@ -66,16 +66,10 @@ jobs:
          brew install ccache tree zstd coreutils gmp libuv
        if: runner.os == 'macOS'
      - name: Checkout
-        if: (!endsWith(matrix.os, '-with-cache'))
        uses: actions/checkout@v6
        with:
          # the default is to use a virtual merge commit between the PR and master: just use the PR
          ref: ${{ github.event.pull_request.head.sha }}
-      - name: Namespace Checkout
-        if: endsWith(matrix.os, '-with-cache')
-        uses: namespacelabs/nscloud-checkout-action@v8
-        with:
-          ref: ${{ github.event.pull_request.head.sha }}
      - name: Open Nix shell once
        run: true
        if: runner.os == 'Linux'
@@ -85,7 +79,7 @@ jobs:
      - name: CI Merge Checkout
        run: |
          git fetch --depth=1 origin ${{ github.sha }}
-          git checkout FETCH_HEAD flake.nix flake.lock script/prepare-* tests/lean/run/importStructure.lean
+          git checkout FETCH_HEAD flake.nix flake.lock script/prepare-* tests/elab/importStructure.lean
        if: github.event_name == 'pull_request'
      # (needs to be after "Checkout" so files don't get overridden)
      - name: Setup emsdk
@@ -235,25 +229,21 @@ jobs:
        # prefix `if` above with `always` so it's run even if tests failed
        if: always() && steps.test.conclusion != 'skipped'
      - name: Check Test Binary
-        run: ${{ matrix.binary-check }} tests/compiler/534.lean.out
+        run: ${{ matrix.binary-check }} tests/compile/534.lean.out
        if: (!matrix.cross) && steps.test.conclusion != 'skipped'
      - name: Build Stage 2
        run: |
          make -C build -j$NPROC stage2
-        if: matrix.test-speedcenter
+        if: matrix.test-bench
      - name: Check Stage 3
        run: |
          make -C build -j$NPROC check-stage3
        if: matrix.check-stage3
-      - name: Test Speedcenter Benchmarks
+      - name: Test Benchmarks
        run: |
-          # Necessary for some timing metrics but does not work on Namespace runners
-          # and we just want to test that the benchmarks run at all here
-          #echo -1 | sudo tee /proc/sys/kernel/perf_event_paranoid
-          export BUILD=$PWD/build PATH=$PWD/build/stage1/bin:$PATH
-          cd tests/bench
-          nix shell .#temci -c temci exec --config speedcenter.yaml --included_blocks fast --runs 1
-        if: matrix.test-speedcenter
+          cd tests
+          nix develop -c make -C ../build -j$NPROC bench
+        if: matrix.test-bench
      - name: Check rebootstrap
        run: |
          set -e
--- a/.github/workflows/check-stdlib-flags.yml
+++ b/.github/workflows/check-stdlib-flags.yml
@@ -1,9 +1,12 @@
 name: Check stdlib_flags.h modifications

 on:
-  pull_request:
+  pull_request_target:
    types: [opened, synchronize, reopened, labeled, unlabeled]

+permissions:
+  pull-requests: read
+
 jobs:
  check-stdlib-flags:
    runs-on: ubuntu-latest
--- a/.github/workflows/ci.yml
+++ b/.github/workflows/ci.yml
@@ -166,7 +166,7 @@ jobs:
      # 0: PRs without special label
      # 1: PRs with `merge-ci` label, merge queue checks, master commits
      # 2: nightlies
-      # 3: PRs with `release-ci` label, full releases
+      # 3: PRs with `release-ci` or `lake-ci` label, full releases
      - name: Set check level
        id: set-level
        # We do not use github.event.pull_request.labels.*.name here because
@@ -175,6 +175,7 @@ jobs:
        run: |
          check_level=0
          fast=false
+          lake_ci=false

          if [[ -n "${{ steps.set-release.outputs.RELEASE_TAG }}" || -n "${{ steps.set-release-custom.outputs.RELEASE_TAG }}" ]]; then
            check_level=3
@@ -189,13 +190,19 @@ jobs:
            elif echo "$labels" | grep -q "merge-ci"; then
              check_level=1
            fi
+            if echo "$labels" | grep -q "lake-ci"; then
+              lake_ci=true
+            fi
            if echo "$labels" | grep -q "fast-ci"; then
              fast=true
            fi
          fi

-          echo "check-level=$check_level" >> "$GITHUB_OUTPUT"
-          echo "fast=$fast" >> "$GITHUB_OUTPUT"
+          {
+            echo "check-level=$check_level"
+            echo "fast=$fast"
+            echo "lake-ci=$lake_ci"
+          } >> "$GITHUB_OUTPUT"
        env:
          GH_TOKEN: ${{ github.token }}

@@ -206,6 +213,7 @@ jobs:
          script: |
            const level = ${{ steps.set-level.outputs.check-level }};
            const fast = ${{ steps.set-level.outputs.fast }};
+            const lakeCi = "${{ steps.set-level.outputs.lake-ci }}" == "true";
            console.log(`level: ${level}, fast: ${fast}`);
            // use large runners where available (original repo)
            let large = ${{ github.repository == 'leanprover/lean4' }};
@@ -258,8 +266,8 @@ jobs:
                "check-rebootstrap": level >= 1,
                "check-stage3": level >= 2,
                "test": true,
-                // NOTE: `test-speedcenter` currently seems to be broken on `ubuntu-latest`
-                "test-speedcenter": large && level >= 2,
+                // NOTE: `test-bench` currently seems to be broken on `ubuntu-latest`
+                "test-bench": large && level >= 2,
                // We are not warning-free yet on all platforms, start here
                "CMAKE_OPTIONS": "-DLEAN_EXTRA_CXX_FLAGS=-Werror",
              },
@@ -269,6 +277,8 @@ jobs:
                "enabled": level >= 2,
                "test": true,
                "CMAKE_PRESET": "reldebug",
+                // * `elab_bench/big_do` crashes with exit code 134
+                "CTEST_OPTIONS": "-E 'elab_bench/big_do'",
              },
              {
                "name": "Linux fsanitize",
@@ -377,6 +387,11 @@ jobs:
                job["CMAKE_OPTIONS"] = (job["CMAKE_OPTIONS"] ? job["CMAKE_OPTIONS"] + " " : "") + "-DUSE_LAKE=OFF";
              }
            }
+            if (lakeCi) {
+              for (const job of matrix) {
+                job["CMAKE_OPTIONS"] = (job["CMAKE_OPTIONS"] ? job["CMAKE_OPTIONS"] + " " : "") + "-DLAKE_CI=ON";
+              }
+            }
            console.log(`matrix:\n${JSON.stringify(matrix, null, 2)}`);
            matrix = matrix.filter((job) => job["enabled"]);
            core.setOutput('matrix', matrix.filter((job) => !job["secondary"]));
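
Review note: the `lakeCi` branch reuses the append-with-separator expression already used for `-DUSE_LAKE=OFF`. A minimal sketch of that pattern follows, assuming the `lake-ci` step output arrives as the string `"true"` or `"false"`. The helper name `appendCmakeOption` and the two matrix entries are hypothetical and exist only for illustration; the workflow itself inlines the expression.

```javascript
// Hypothetical helper; the workflow inlines this expression rather than naming it.
function appendCmakeOption(job, flag) {
  const existing = job["CMAKE_OPTIONS"];
  // Add a separating space only when the job already has options set.
  job["CMAKE_OPTIONS"] = (existing ? existing + " " : "") + flag;
  return job;
}

// Illustrative entries, not the real CI matrix.
const matrix = [
  { name: "with-options", CMAKE_OPTIONS: "-DLEAN_EXTRA_CXX_FLAGS=-Werror" },
  { name: "without-options" },
];

// Assumption: the step output is interpolated as a quoted string, so it is
// compared against "true" rather than used as a boolean literal.
const lakeCiOutput = "true";
const lakeCi = lakeCiOutput === "true";

if (lakeCi) {
  for (const job of matrix) {
    appendCmakeOption(job, "-DLAKE_CI=ON");
  }
}

console.log(JSON.stringify(matrix, null, 2));
```

Quoting the interpolated output before comparing means an empty or missing output evaluates to `false` instead of producing a syntax error in the generated script.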
--- a/.github/workflows/labels-from-comments.yml
+++ b/.github/workflows/labels-from-comments.yml
@@ -1,5 +1,5 @@
 # This workflow allows any user to add one of the `awaiting-review`, `awaiting-author`, `WIP`,
-# `release-ci`, or a `changelog-XXX` label by commenting on the PR or issue.
+# `release-ci`, `lake-ci`, or a `changelog-XXX` label by commenting on the PR or issue.
 # If any labels from the set {`awaiting-review`, `awaiting-author`, `WIP`} are added, other labels
 # from that set are removed automatically at the same time.
 # Similarly, if any `changelog-XXX` label is added, other `changelog-YYY` labels are removed.
@@ -12,7 +12,7 @@ on:

 jobs:
  update-label:
-    if: github.event.issue.pull_request != null && (contains(github.event.comment.body, 'awaiting-review') || contains(github.event.comment.body, 'awaiting-author') || contains(github.event.comment.body, 'WIP') || contains(github.event.comment.body, 'release-ci') || contains(github.event.comment.body, 'changelog-'))
+    if: github.event.issue.pull_request != null && (contains(github.event.comment.body, 'awaiting-review') || contains(github.event.comment.body, 'awaiting-author') || contains(github.event.comment.body, 'WIP') || contains(github.event.comment.body, 'release-ci') || contains(github.event.comment.body, 'lake-ci') || contains(github.event.comment.body, 'changelog-'))
    runs-on: ubuntu-latest

    steps:
@@ -28,6 +28,7 @@ jobs:
          const awaitingAuthor = commentLines.includes('awaiting-author');
          const wip = commentLines.includes('WIP');
          const releaseCI = commentLines.includes('release-ci');
+          const lakeCI = commentLines.includes('lake-ci');
          const changelogMatch = commentLines.find(line => line.startsWith('changelog-'));

          if (awaitingReview || awaitingAuthor || wip) {
@@ -49,6 +50,9 @@ jobs:
          if (releaseCI) {
            await github.rest.issues.addLabels({ owner, repo, issue_number, labels: ['release-ci'] });
          }
+          if (lakeCI) {
+            await github.rest.issues.addLabels({ owner, repo, issue_number, labels: ['lake-ci'] });
+          }

          if (changelogMatch) {
            const changelogLabel = changelogMatch.trim();
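
Review note: the job-level `if` uses `contains(...)` on the whole comment body, while the script checks `commentLines.includes('lake-ci')`. The sketch below illustrates the difference, assuming `commentLines` is the comment body split on line breaks; that definition sits outside the hunks shown here, and `parseLabelRequests` is a hypothetical name.

```javascript
// Assumption: commentLines is the comment body split into lines.
// The real definition is not part of the hunks shown in this diff.
function parseLabelRequests(body) {
  const commentLines = body.split(/\r?\n/);
  return {
    releaseCI: commentLines.includes("release-ci"),
    lakeCI: commentLines.includes("lake-ci"),
    changelog: commentLines.find((line) => line.startsWith("changelog-")),
  };
}

// A line consisting only of the label name requests the label.
const wholeLine = parseLabelRequests("Looks good.\nlake-ci");

// A mention inside a sentence passes a substring filter but does not
// match any array element exactly, so no label is requested.
const midSentence = parseLabelRequests("Should we run lake-ci here?");

console.log(wholeLine.lakeCI, midSentence.lakeCI);
```

Under that assumption, a comment that merely mentions `lake-ci` would start the job but add no label, which matches how the existing `release-ci` check behaves.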
--- a/.github/workflows/pr-body.yml
+++ b/.github/workflows/pr-body.yml
@@ -2,17 +2,23 @@ name: Check PR body for changelog convention

 on:
  merge_group:
-  pull_request:
+  pull_request_target:
    types: [opened, synchronize, reopened, edited, labeled, converted_to_draft, ready_for_review]

+permissions:
+  pull-requests: read
+
 jobs:
  check-pr-body:
    runs-on: ubuntu-latest
    steps:
      - name: Check PR body
-        if: github.event_name == 'pull_request'
+        if: github.event_name == 'pull_request_target'
        uses: actions/github-script@v8
        with:
+          # Safety note: this uses pull_request_target, so the workflow has elevated privileges.
+          # The PR title and body are only used in regex tests (read-only string matching),
+          # never interpolated into shell commands, eval'd, or written to GITHUB_ENV/GITHUB_OUTPUT.
          script: |
            const { title, body, labels, draft } = context.payload.pull_request;
            if (!draft && /^(feat|fix):/.test(title) && !labels.some(label => label.name == "changelog-no")) {
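
Review note: the guard above only performs string matching on the PR title and label names, which is what the safety comment relies on. A minimal sketch of the same predicate as a standalone function follows; `needsChangelogCheck` is a hypothetical name and the sample payloads are illustrative.

```javascript
// Hypothetical standalone version of the guard in the workflow script.
// It only reads the title and label names; nothing is executed or interpolated.
function needsChangelogCheck({ title, labels, draft }) {
  return !draft
    && /^(feat|fix):/.test(title)
    && !labels.some((label) => label.name === "changelog-no");
}

// Illustrative payloads shaped like context.payload.pull_request.
const fixPr = { title: "fix: add missing pp-spaces in grind_pattern", labels: [], draft: false };
const optedOut = { title: "feat: something", labels: [{ name: "changelog-no" }], draft: false };
const draftPr = { title: "fix: work in progress", labels: [], draft: true };
const chorePr = { title: "chore: bump dependency", labels: [], draft: false };

console.log([fixPr, optedOut, draftPr, chorePr].map(needsChangelogCheck));
```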
--- a/.github/workflows/restart-on-label.yml
+++ b/.github/workflows/restart-on-label.yml
@@ -7,7 +7,7 @@ on:
 jobs:
  restart-on-label:
    runs-on: ubuntu-latest
-    if: contains(github.event.label.name, 'merge-ci') || contains(github.event.label.name, 'release-ci')
+    if: contains(github.event.label.name, 'merge-ci') || contains(github.event.label.name, 'release-ci') || contains(github.event.label.name, 'lake-ci')
    steps:
    - run: |
        # Finding latest CI workflow run on current pull request
--- a/.gitignore
+++ b/.gitignore
@@ -1,7 +1,6 @@
 *~
 \#*
 .#*
-*.lock
 .lake
 lake-manifest.json
 /build
@@ -18,8 +17,12 @@ compile_commands.json
 *.idea
 tasks.json
 settings.json
+!.claude/settings.json
 .gdb_history
 .vscode/*
+!.vscode/settings.json
+!.vscode/tasks.json
+!.vscode/extensions.json
 script/__pycache__
 *.produced.out
 CMakeSettings.json
--- a/.vscode/extensions.json
+++ b/.vscode/extensions.json
@@ -0,0 +1,5 @@
+{
+	"recommendations": [
+		"leanprover.lean4"
+	]
+}
--- a/.vscode/settings.json
+++ b/.vscode/settings.json
@@ -0,0 +1,12 @@
+{
+	"files.insertFinalNewline": true,
+	"files.trimTrailingWhitespace": true,
+	// These require the CMake Tools extension (ms-vscode.cmake-tools).
+	"cmake.buildDirectory": "${workspaceFolder}/build/release",
+	"cmake.generator": "Unix Makefiles",
+	"[lean4]": {
+		"editor.rulers": [
+			100
+		]
+	}
+}
--- a/.vscode/tasks.json
+++ b/.vscode/tasks.json
@@ -0,0 +1,34 @@
+{
+	"version": "2.0.0",
+	"tasks": [
+		{
+			"label": "build",
+			"type": "shell",
+			"command": "make -C build/release -j$(nproc 2>/dev/null || sysctl -n hw.logicalcpu 2>/dev/null || echo 4)",
+			"problemMatcher": [],
+			"group": {
+				"kind": "build",
+				"isDefault": true
+			}
+		},
+		{
+			"label": "build-old",
+			"type": "shell",
+			"command": "make -C build/release -j$(nproc 2>/dev/null || sysctl -n hw.logicalcpu 2>/dev/null || echo 4) LAKE_EXTRA_ARGS=--old",
+			"problemMatcher": [],
+			"group": {
+				"kind": "build"
+			}
+		},
+		{
+			"label": "test",
+			"type": "shell",
+			"command": "NPROC=$(nproc 2>/dev/null || sysctl -n hw.logicalcpu 2>/dev/null || echo 4); CTEST_OUTPUT_ON_FAILURE=1 make -C build/release test -j$NPROC ARGS=\"-j$NPROC\"",
+			"problemMatcher": [],
+			"group": {
+				"kind": "test",
+				"isDefault": true
+			}
+		}
+	]
+}
--- a/CMakeLists.txt
+++ b/CMakeLists.txt
@@ -1,4 +1,8 @@
-cmake_minimum_required(VERSION 3.11)
+cmake_minimum_required(VERSION 3.21)
+
+if(NOT CMAKE_GENERATOR MATCHES "Makefiles")
+  message(FATAL_ERROR "Only makefile generators are supported")
+endif()

 option(USE_MIMALLOC "use mimalloc" ON)

@@ -37,7 +41,7 @@ if(NOT (DEFINED STAGE0_CMAKE_EXECUTABLE_SUFFIX))
  set(STAGE0_CMAKE_EXECUTABLE_SUFFIX "${CMAKE_EXECUTABLE_SUFFIX}")
 endif()

-# Don't do anything with cadical on wasm
+# Don't do anything with cadical/leantar on wasm
 if(NOT CMAKE_SYSTEM_NAME MATCHES "Emscripten")
  find_program(CADICAL cadical)
  if(NOT CADICAL)
@@ -70,16 +74,47 @@ if(NOT CMAKE_SYSTEM_NAME MATCHES "Emscripten")
      BUILD_IN_SOURCE ON
      INSTALL_COMMAND ""
    )
-    set(
-      CADICAL
-      ${CMAKE_BINARY_DIR}/cadical/cadical${CMAKE_EXECUTABLE_SUFFIX}
-      CACHE FILEPATH
-      "path to cadical binary"
-      FORCE
-    )
+    set(CADICAL ${CMAKE_BINARY_DIR}/cadical/cadical${CMAKE_EXECUTABLE_SUFFIX})
    list(APPEND EXTRA_DEPENDS cadical)
  endif()
-  list(APPEND CL_ARGS -DCADICAL=${CADICAL})
+  find_program(LEANTAR leantar)
+  if(NOT LEANTAR)
+    set(LEANTAR_VERSION v0.1.19)
+    if(CMAKE_SYSTEM_NAME MATCHES "Windows")
+      set(LEANTAR_ARCHIVE_SUFFIX .zip)
+      set(LEANTAR_TARGET x86_64-pc-windows-msvc)
+    else()
+      set(LEANTAR_ARCHIVE_SUFFIX .tar.gz)
+      if(CMAKE_SYSTEM_PROCESSOR MATCHES "arm64")
+        set(LEANTAR_TARGET_ARCH aarch64)
+      else()
+        set(LEANTAR_TARGET_ARCH x86_64)
+      endif()
+      if(CMAKE_SYSTEM_NAME MATCHES "Darwin")
+        set(LEANTAR_TARGET_OS apple-darwin)
+      else()
+        set(LEANTAR_TARGET_OS unknown-linux-musl)
+      endif()
+      set(LEANTAR_TARGET ${LEANTAR_TARGET_ARCH}-${LEANTAR_TARGET_OS})
+    endif()
+    set(
+      LEANTAR
+      ${CMAKE_BINARY_DIR}/leantar/leantar-${LEANTAR_VERSION}-${LEANTAR_TARGET}/leantar${CMAKE_EXECUTABLE_SUFFIX}
+    )
+    if(NOT EXISTS "${LEANTAR}")
+      file(
+        DOWNLOAD
+          https://github.com/digama0/leangz/releases/download/${LEANTAR_VERSION}/leantar-${LEANTAR_VERSION}-${LEANTAR_TARGET}${LEANTAR_ARCHIVE_SUFFIX}
+        ${CMAKE_BINARY_DIR}/leantar${LEANTAR_ARCHIVE_SUFFIX}
+      )
+      file(
+        ARCHIVE_EXTRACT
+        INPUT ${CMAKE_BINARY_DIR}/leantar${LEANTAR_ARCHIVE_SUFFIX}
+        DESTINATION ${CMAKE_BINARY_DIR}/leantar
+      )
+    endif()
+  endif()
+  list(APPEND CL_ARGS -DCADICAL=${CADICAL} -DLEANTAR=${LEANTAR})
 endif()
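
Review note: the `leantar` fallback picks a release archive from the system name and processor. The sketch below mirrors that branch logic in JavaScript purely for illustration; `leantarDownload` is a hypothetical function, and CMake's `MATCHES` is approximated with an unanchored regular expression test.

```javascript
// JavaScript mirror of the CMake branch logic, for illustration only.
const LEANTAR_VERSION = "v0.1.19";

function leantarDownload(systemName, systemProcessor) {
  let target;
  let archiveSuffix;
  if (/Windows/.test(systemName)) {
    // Windows always uses the x86_64 MSVC build, packaged as a zip.
    archiveSuffix = ".zip";
    target = "x86_64-pc-windows-msvc";
  } else {
    archiveSuffix = ".tar.gz";
    // Only a processor string containing "arm64" selects aarch64.
    const arch = /arm64/.test(systemProcessor) ? "aarch64" : "x86_64";
    const os = /Darwin/.test(systemName) ? "apple-darwin" : "unknown-linux-musl";
    target = `${arch}-${os}`;
  }
  const stem = `leantar-${LEANTAR_VERSION}-${target}`;
  return {
    target,
    url: `https://github.com/digama0/leangz/releases/download/${LEANTAR_VERSION}/${stem}${archiveSuffix}`,
  };
}

console.log(leantarDownload("Darwin", "arm64").url);
```

Under this logic a processor string of `aarch64`, which some Linux ARM hosts may report, does not match `arm64` and therefore resolves to the x86_64 archive. Whether that matters depends on which platforms the build is expected to support.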

 if(USE_MIMALLOC)
@@ -153,6 +188,7 @@ ExternalProject_Add(
  INSTALL_COMMAND ""
  DEPENDS stage2
  EXCLUDE_FROM_ALL ON
+  STEP_TARGETS configure
 )

 # targets forwarded to appropriate stages
@@ -163,6 +199,25 @@ add_custom_target(update-stage0-commit COMMAND $(MAKE) -C stage1 update-stage0-c

 add_custom_target(test COMMAND $(MAKE) -C stage1 test DEPENDS stage1)

+add_custom_target(
+  bench
+  COMMAND $(MAKE) -C stage2
+  COMMAND $(MAKE) -C stage2 -j1 bench
+  DEPENDS stage2
+)
+add_custom_target(
+  bench-part1
+  COMMAND $(MAKE) -C stage2
+  COMMAND $(MAKE) -C stage2 -j1 bench-part1
+  DEPENDS stage2
+)
+add_custom_target(
+  bench-part2
+  COMMAND $(MAKE) -C stage2
+  COMMAND $(MAKE) -C stage2 -j1 bench-part2
+  DEPENDS stage2
+)
+
 add_custom_target(clean-stdlib COMMAND $(MAKE) -C stage1 clean-stdlib DEPENDS stage1)

 install(CODE "execute_process(COMMAND make -C stage1 install)")
--- a/CMakePresets.json
+++ b/CMakePresets.json
@@ -41,7 +41,7 @@
        "SMALL_ALLOCATOR": "OFF",
        "USE_MIMALLOC": "OFF",
        "BSYMBOLIC": "OFF",
-        "LEAN_TEST_VARS": "MAIN_STACK_SIZE=16000 LSAN_OPTIONS=max_leaks=10"
+        "LEAN_TEST_VARS": "MAIN_STACK_SIZE=16000 TEST_STACK_SIZE=16000 LSAN_OPTIONS=max_leaks=10"
      },
      "generator": "Unix Makefiles",
      "binaryDir": "${sourceDir}/build/sanitize"
--- a/CONTRIBUTING.md
+++ b/CONTRIBUTING.md
@@ -7,7 +7,7 @@ Helpful links
 -------

 * [Development Setup](./doc/dev/index.md)
-* [Testing](./doc/dev/testing.md)
+* [Testing](./tests/README.md)
 * [Commit convention](./doc/dev/commit_convention.md)

 Before You Submit a Pull Request (PR):
--- a/206
+++ b/206
@@ -1370,4 +1370,208 @@ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
 AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
 LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
 OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
-SOFTWARE.
+SOFTWARE.
+==============================================================================
+leantar is by Mario Carneiro and distributed under the Apache 2.0 License:
+==============================================================================
+                                 Apache License
+                           Version 2.0, January 2004
+                        http://www.apache.org/licenses/
+
+   TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
+
+   1. Definitions.
+
+      "License" shall mean the terms and conditions for use, reproduction,
+      and distribution as defined by Sections 1 through 9 of this document.
+
+      "Licensor" shall mean the copyright owner or entity authorized by
+      the copyright owner that is granting the License.
+
+      "Legal Entity" shall mean the union of the acting entity and all
+      other entities that control, are controlled by, or are under common
+      control with that entity. For the purposes of this definition,
+      "control" means (i) the power, direct or indirect, to cause the
+      direction or management of such entity, whether by contract or
+      otherwise, or (ii) ownership of fifty percent (50%) or more of the
+      outstanding shares, or (iii) beneficial ownership of such entity.
+
+      "You" (or "Your") shall mean an individual or Legal Entity
+      exercising permissions granted by this License.
+
+      "Source" form shall mean the preferred form for making modifications,
+      including but not limited to software source code, documentation
+      source, and configuration files.
+
+      "Object" form shall mean any form resulting from mechanical
+      transformation or translation of a Source form, including but
+      not limited to compiled object code, generated documentation,
+      and conversions to other media types.
+
+      "Work" shall mean the work of authorship, whether in Source or
+      Object form, made available under the License, as indicated by a
+      copyright notice that is included in or attached to the work
+      (an example is provided in the Appendix below).
+
+      "Derivative Works" shall mean any work, whether in Source or Object
+      form, that is based on (or derived from) the Work and for which the
+      editorial revisions, annotations, elaborations, or other modifications
+      represent, as a whole, an original work of authorship. For the purposes
+      of this License, Derivative Works shall not include works that remain
+      separable from, or merely link (or bind by name) to the interfaces of,
+      the Work and Derivative Works thereof.
+
+      "Contribution" shall mean any work of authorship, including
+      the original version of the Work and any modifications or additions
+      to that Work or Derivative Works thereof, that is intentionally
+      submitted to Licensor for inclusion in the Work by the copyright owner
+      or by an individual or Legal Entity authorized to submit on behalf of
+      the copyright owner. For the purposes of this definition, "submitted"
+      means any form of electronic, verbal, or written communication sent
+      to the Licensor or its representatives, including but not limited to
+      communication on electronic mailing lists, source code control systems,
+      and issue tracking systems that are managed by, or on behalf of, the
+      Licensor for the purpose of discussing and improving the Work, but
+      excluding communication that is conspicuously marked or otherwise
+      designated in writing by the copyright owner as "Not a Contribution."
+
+      "Contributor" shall mean Licensor and any individual or Legal Entity
+      on behalf of whom a Contribution has been received by Licensor and
+      subsequently incorporated within the Work.
+
+   2. Grant of Copyright License. Subject to the terms and conditions of
+      this License, each Contributor hereby grants to You a perpetual,
+      worldwide, non-exclusive, no-charge, royalty-free, irrevocable
+      copyright license to reproduce, prepare Derivative Works of,
+      publicly display, publicly perform, sublicense, and distribute the
+      Work and such Derivative Works in Source or Object form.
+
+   3. Grant of Patent License. Subject to the terms and conditions of
+      this License, each Contributor hereby grants to You a perpetual,
+      worldwide, non-exclusive, no-charge, royalty-free, irrevocable
+      (except as stated in this section) patent license to make, have made,
+      use, offer to sell, sell, import, and otherwise transfer the Work,
+      where such license applies only to those patent claims licensable
+      by such Contributor that are necessarily infringed by their
+      Contribution(s) alone or by combination of their Contribution(s)
+      with the Work to which such Contribution(s) was submitted. If You
+      institute patent litigation against any entity (including a
+      cross-claim or counterclaim in a lawsuit) alleging that the Work
+      or a Contribution incorporated within the Work constitutes direct
+      or contributory patent infringement, then any patent licenses
+      granted to You under this License for that Work shall terminate
+      as of the date such litigation is filed.
+
+   4. Redistribution. You may reproduce and distribute copies of the
+      Work or Derivative Works thereof in any medium, with or without
+      modifications, and in Source or Object form, provided that You
+      meet the following conditions:
+
+      (a) You must give any other recipients of the Work or
+          Derivative Works a copy of this License; and
+
+      (b) You must cause any modified files to carry prominent notices
+          stating that You changed the files; and
+
+      (c) You must retain, in the Source form of any Derivative Works
+          that You distribute, all copyright, patent, trademark, and
+          attribution notices from the Source form of the Work,
+          excluding those notices that do not pertain to any part of
+          the Derivative Works; and
+
+      (d) If the Work includes a "NOTICE" text file as part of its
+          distribution, then any Derivative Works that You distribute must
+          include a readable copy of the attribution notices contained
+          within such NOTICE file, excluding those notices that do not
+          pertain to any part of the Derivative Works, in at least one
+          of the following places: within a NOTICE text file distributed
+          as part of the Derivative Works; within the Source form or
+          documentation, if provided along with the Derivative Works; or,
+          within a display generated by the Derivative Works, if and
+          wherever such third-party notices normally appear. The contents
+          of the NOTICE file are for informational purposes only and
+          do not modify the License. You may add Your own attribution
+          notices within Derivative Works that You distribute, alongside
+          or as an addendum to the NOTICE text from the Work, provided
+          that such additional attribution notices cannot be construed
+          as modifying the License.
+
+      You may add Your own copyright statement to Your modifications and
+      may provide additional or different license terms and conditions
+      for use, reproduction, or distribution of Your modifications, or
+      for any such Derivative Works as a whole, provided Your use,
+      reproduction, and distribution of the Work otherwise complies with
+      the conditions stated in this License.
+
+   5. Submission of Contributions. Unless You explicitly state otherwise,
+      any Contribution intentionally submitted for inclusion in the Work
+      by You to the Licensor shall be under the terms and conditions of
+      this License, without any additional terms or conditions.
+      Notwithstanding the above, nothing herein shall supersede or modify
+      the terms of any separate license agreement you may have executed
+      with Licensor regarding such Contributions.
+
+   6. Trademarks. This License does not grant permission to use the trade
+      names, trademarks, service marks, or product names of the Licensor,
+      except as required for reasonable and customary use in describing the
+      origin of the Work and reproducing the content of the NOTICE file.
+
+   7. Disclaimer of Warranty. Unless required by applicable law or
+      agreed to in writing, Licensor provides the Work (and each
+      Contributor provides its Contributions) on an "AS IS" BASIS,
+      WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
+      implied, including, without limitation, any warranties or conditions
+      of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
+      PARTICULAR PURPOSE. You are solely responsible for determining the
+      appropriateness of using or redistributing the Work and assume any
+      risks associated with Your exercise of permissions under this License.
+
+   8. Limitation of Liability. In no event and under no legal theory,
+      whether in tort (including negligence), contract, or otherwise,
+      unless required by applicable law (such as deliberate and grossly
+      negligent acts) or agreed to in writing, shall any Contributor be
+      liable to You for damages, including any direct, indirect, special,
+      incidental, or consequential damages of any character arising as a
+      result of this License or out of the use or inability to use the
+      Work (including but not limited to damages for loss of goodwill,
+      work stoppage, computer failure or malfunction, or any and all
+      other commercial damages or losses), even if such Contributor
+      has been advised of the possibility of such damages.
+
+   9. Accepting Warranty or Additional Liability. While redistributing
+      the Work or Derivative Works thereof, You may choose to offer,
+      and charge a fee for, acceptance of support, warranty, indemnity,
+      or other liability obligations and/or rights consistent with this
+      License. However, in accepting such obligations, You may act only
+      on Your own behalf and on Your sole responsibility, not on behalf
+      of any other Contributor, and only if You agree to indemnify,
+      defend, and hold each Contributor harmless for any liability
+      incurred by, or claims asserted against, such Contributor by reason
+      of your accepting any such warranty or additional liability.
+
+   END OF TERMS AND CONDITIONS
+
+   APPENDIX: How to apply the Apache License to your work.
+
+      To apply the Apache License to your work, attach the following
+      boilerplate notice, with the fields enclosed by brackets "[]"
+      replaced with your own identifying information. (Don't include
+      the brackets!)  The text should be enclosed in the appropriate
+      comment syntax for the file format. We also recommend that a
+      file or class name and description of purpose be included on the
+      same "printed page" as the copyright notice for easier
+      identification within third-party archives.
+
+   Copyright [yyyy] [name of copyright owner]
+
+   Licensed under the Apache License, Version 2.0 (the "License");
+   you may not use this file except in compliance with the License.
+   You may obtain a copy of the License at
+
+       http://www.apache.org/licenses/LICENSE-2.0
+
+   Unless required by applicable law or agreed to in writing, software
+   distributed under the License is distributed on an "AS IS" BASIS,
+   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+   See the License for the specific language governing permissions and
+   limitations under the License.
--- a/doc/.gitignore
+++ b/doc/.gitignore
@@ -1 +0,0 @@
-out
--- a/doc/dev/index.md
+++ b/doc/dev/index.md
@@ -1,7 +1,9 @@
 # Development Workflow

 If you want to make changes to Lean itself, start by [building Lean](../make/index.md) from a clean checkout to make sure that everything is set up correctly.
-After that, read on below to find out how to set up your editor for changing the Lean source code, followed by further sections of the development manual where applicable such as on the [test suite](testing.md) and [commit convention](commit_convention.md).
+After that, read on below to find out how to set up your editor for changing the Lean source code,
+followed by further sections of the development manual where applicable
+such as on the [test suite](../../tests/README.md) and [commit convention](commit_convention.md).

 If you are planning to make any changes that may affect the compilation of Lean itself, e.g. changes to the parser, elaborator, or compiler, you should first read about the [bootstrapping pipeline](bootstrap.md).
 You should not edit the `stage0` directory except using the commands described in that section when necessary.
@@ -61,10 +63,10 @@ you can then put `my_name/lean4:my-tag` in your `lean-toolchain` file in a proje

 ### VS Code

-There is a `lean.code-workspace` file that correctly sets up VS Code with workspace roots for the stage0/stage1 setup described above as well as with other settings.
-You should always load it when working on Lean, such as by invoking
+There is a `.vscode/` directory that correctly sets up VS Code with settings, tasks, and recommended extensions.
+Simply open the repository folder in VS Code, such as by invoking
 ```
-code lean.code-workspace
+code .
 ```
 on the command line.

--- a/doc/dev/release_checklist.md
+++ b/doc/dev/release_checklist.md
@@ -70,6 +70,9 @@ We'll use `v4.6.0` as the intended release version as a running example.
        The `release_steps.py` script handles this automatically by looking up the latest
        ProofWidgets4 tag compatible with the target toolchain.
      - Push the PR branch to the main Mathlib repository rather than a fork, or CI may not work reliably
+      - The "Verify Transient and Automated Commits" CI check on toolchain bump PRs can be ignored —
+        it often fails on automated commits (`x:` prefixed) from the nightly-testing history that can't be
+        reproduced in CI. This does not block merging.
    - `repl`:
      There are two copies of `lean-toolchain`/`lakefile.lean`:
      in the root, and in `test/Mathlib/`. Edit both, and run `lake update` in both directories.
@@ -150,6 +153,9 @@ We'll use `v4.7.0-rc1` as the intended release version in this example.
    * The repository does not need any changes to move to the new version.
    * Note that sometimes there are *unreviewed* but necessary changes on the `nightly-testing` branch of the repository.
      If so, you will need to merge these into the `bump_to_v4.7.0-rc1` branch manually.
+    * The `nightly-testing` branch may also contain temporary fix scripts (e.g. `fix_backward_defeq.py`,
+      `fix_deprecations.py`) that were used to adapt to breaking changes during the nightly cycle.
+      These should be reviewed and removed if no longer needed, as they can interfere with CI checks.
  - For each of the repositories listed in `script/release_repos.yml`,
    - Run `script/release_steps.py v4.7.0-rc1 <repo>` (e.g. replacing `<repo>` with `batteries`), which will walk you through the following steps:
      - Create a new branch off `master`/`main` (as specified in the `branch` field), called `bump_to_v4.7.0-rc1`.
--- a/doc/dev/testing.md
+++ b/doc/dev/testing.md
@@ -1,138 +0,0 @@
-# Test Suite
-
-After [building Lean](../make/index.md) you can run all the tests using
-```
-cd build/release
-make test ARGS=-j4
-```
-Change the 4 to the maximum number of parallel tests you want to
-allow. The best choice is the number of CPU cores on your machine as
-the tests are mostly CPU bound.  You can find the number of processors
-on linux using `nproc` and on Windows it is the `NUMBER_OF_PROCESSORS`
-environment variable.
-
-You can run tests after [building a specific stage](bootstrap.md) by
-adding the `-C stageN` argument. The default when run as above is stage 1.  The
-Lean tests will automatically use that stage's corresponding Lean
-executables
-
-Running `make test` will not pick up new test files; run
-```bash
-cmake build/release/stage1
-```
-to update the list of tests.
-
-You can also use `ctest` directly if you are in the right folder.  So
-to run stage1 tests with a 300 second timeout run this:
-
-```bash
-cd build/release/stage1
-ctest -j 4 --output-on-failure --timeout 300
-```
-Useful `ctest` flags are `-R <name of test>` to run a single test, and
-`--rerun-failed` to run all tests that failed during the last run.
-You can also pass `ctest` flags via `make test ARGS="--rerun-failed"`.
-
-To get verbose output from ctest pass the `--verbose` command line
-option. Test output is normally suppressed and only summary
-information is displayed. This option will show all test output.
-
-## Test Suite Organization
-
-All these tests are included by [src/shell/CMakeLists.txt](https://github.com/leanprover/lean4/blob/master/src/shell/CMakeLists.txt):
-
- [`tests/lean`](https://github.com/leanprover/lean4/tree/master/tests/lean/): contains tests that come equipped with a
-  .lean.expected.out file. The driver script [`test_single.sh`](https://github.com/leanprover/lean4/tree/master/tests/lean/test_single.sh) runs
-  each test and checks the actual output (*.produced.out) with the
-  checked in expected output.
-
- [`tests/lean/run`](https://github.com/leanprover/lean4/tree/master/tests/lean/run/): contains tests that are run through the lean
-  command line one file at a time. These tests only look for error
-  codes and do not check the expected output even though output is
-  produced, it is ignored.
-
-  **Note:** Tests in this directory run with `-Dlinter.all=false` to reduce noise.
-  If your test needs to verify linter behavior (e.g., deprecation warnings),
-  explicitly enable the relevant linter with `set_option linter.<name> true`.
-
- [`tests/lean/interactive`](https://github.com/leanprover/lean4/tree/master/tests/lean/interactive/): are designed to test server requests at a
-  given position in the input file. Each .lean file contains comments
-  that indicate how to simulate a client request at that position.
-  using a `--^` point to the line position. Example:
-    ```lean,ignore
-    open Foo in
-    theorem tst2 (h : a ≤ b) : a + 2 ≤ b + 2 :=
-    Bla.
-      --^ completion
-    ```
-    In this example, the test driver [`test_single.sh`](https://github.com/leanprover/lean4/tree/master/tests/lean/interactive/test_single.sh) will simulate an
-    auto-completion request at `Bla.`. The expected output is stored in
-    a .lean.expected.out in the json format that is part of the
-    [Language Server
-    Protocol](https://microsoft.github.io/language-server-protocol/).
-
-    This can also be used to test the following additional requests:
-    ```
-    --^ textDocument/hover
-    --^ textDocument/typeDefinition
-    --^ textDocument/definition
-    --^ $/lean/plainGoal
-    --^ $/lean/plainTermGoal
-    --^ insert: ...
-    --^ collectDiagnostics
-    ```
-
- [`tests/lean/server`](https://github.com/leanprover/lean4/tree/master/tests/lean/server/): Tests more of the Lean `--server` protocol.
-  There are just a few of them, and it uses .log files containing
-  JSON.
-
- [`tests/compiler`](https://github.com/leanprover/lean4/tree/master/tests/compiler/): contains tests that will run the Lean compiler and
-  build an executable that is executed and the output is compared to
-  the .lean.expected.out file. This test also contains a subfolder
-  [`foreign`](https://github.com/leanprover/lean4/tree/master/tests/compiler/foreign/) which shows how to extend Lean using C++.
-
- [`tests/lean/trust0`](https://github.com/leanprover/lean4/tree/master/tests/lean/trust0): tests that run Lean in a mode that Lean doesn't
-  even trust the .olean files (i.e., trust 0).
-
- [`tests/bench`](https://github.com/leanprover/lean4/tree/master/tests/bench/): contains performance tests.
-
- [`tests/plugin`](https://github.com/leanprover/lean4/tree/master/tests/plugin/): tests that compiled Lean code can be loaded into
-  `lean` via the `--plugin` command line option.
-
-## Writing Good Tests
-
-Every test file should contain:
-* an initial `/-! -/` module docstring summarizing the test's purpose
-* a module docstring for each test section that describes what is tested
-  and, if not 100% clear, why that is the desirable behavior
-
-At the time of writing, most tests do not follow these new guidelines yet.
-For an example of a conforming test, see [`tests/lean/1971.lean`](https://github.com/leanprover/lean4/tree/master/tests/lean/1971.lean).
-
-## Fixing Tests
-
-When the Lean source code or the standard library are modified, some of the
-tests break because the produced output is slightly different, and we have
-to reflect the changes in the `.lean.expected.out` files.
-We should not blindly copy the new produced output since we may accidentally
-miss a bug introduced by recent changes.
-The test suite contains commands that allow us to see what changed in a convenient way.
-First, we must install [meld](http://meldmerge.org/). On Ubuntu, we can do it by simply executing
-
-```
-sudo apt-get install meld
-```
-
-Now, suppose `bad_class.lean` test is broken. We can see the problem by going to [`tests/lean`](https://github.com/leanprover/lean4/tree/master/tests/lean) directory and
-executing
-
-```
-./test_single.sh -i bad_class.lean
-```
-
-When the `-i` option is provided, `meld` is automatically invoked
-whenever there is discrepancy between the produced and expected
-outputs. `meld` can also be used to repair the problems.
-
-In Emacs, we can also execute `M-x lean4-diff-test-file` to check/diff the file of the current buffer.
-To mass-copy all `.produced.out` files to the respective `.expected.out` file, use `tests/lean/copy-produced`.
--- a/doc/examples/.gitignore
+++ b/doc/examples/.gitignore
@@ -0,0 +1,2 @@
+*.out.produced
+*.exit.produced
--- a/doc/examples/bintree.lean.out.expected
+++ b/doc/examples/bintree.lean.out.expected
@@ -0,0 +1,2 @@
+Tree.node (Tree.node (Tree.leaf) 1 "one" (Tree.leaf)) 2 "two" (Tree.node (Tree.leaf) 3 "three" (Tree.leaf))
+[(1, "one"), (2, "two"), (3, "three")]
--- a/doc/examples/compiler/run_test.sh
+++ b/doc/examples/compiler/run_test.sh
@@ -0,0 +1,4 @@
+leanmake --always-make bin
+
+capture ./build/bin/test hello world
+check_out_contains "[hello, world]"
--- a/doc/examples/compiler/test.lean.out.expected
+++ b/doc/examples/compiler/test.lean.out.expected
@@ -0,0 +1 @@
+[hello, world]
--- a/doc/examples/interp.lean.out.expected
+++ b/doc/examples/interp.lean.out.expected
@@ -0,0 +1,3 @@
+30
+interp.lean:146:4: warning: declaration uses `sorry`
+3628800
--- a/doc/examples/palindromes.lean.out.expected
+++ b/doc/examples/palindromes.lean.out.expected
@@ -0,0 +1,2 @@
+true
+false
--- a/doc/examples/phoas.lean.out.expected
+++ b/doc/examples/phoas.lean.out.expected
@@ -0,0 +1,2 @@
+"(((fun x_1 => (fun x_2 => (x_1 + x_2))) 1) 2)"
+"((((fun x_1 => (fun x_2 => (x_1 + x_2))) 1) 2) + 5)"
--- a/doc/examples/run_test.sh
+++ b/doc/examples/run_test.sh
@@ -0,0 +1,4 @@
+capture_only "$1" \
+  lean -Dlinter.all=false "$1"
+check_exit_is_success
+check_out_file
--- a/doc/examples/test_single.sh
+++ b/doc/examples/test_single.sh
@@ -1,4 +0,0 @@
-#!/usr/bin/env bash
-source ../../tests/common.sh
-
-exec_check_raw lean -Dlinter.all=false "$f"
--- a/doc/std/grove/lean-toolchain
+++ b/doc/std/grove/lean-toolchain
@@ -1 +1 @@
-lean4
+../../../build/release/stage1
--- a/2
+++ b/2
@@ -1 +1 @@
-lean4
+build/release/stage1
--- a/lean.code-workspace
+++ b/lean.code-workspace
@@ -1,72 +0,0 @@
-{
-	"folders": [
-		{
-			"path": "."
-		},
-		{
-			"path": "src"
-		},
-		{
-			"path": "tests"
-		},
-		{
-			"path": "script"
-		}
-	],
-	"settings": {
-		// Open terminal at root, not current workspace folder
-		// (there is not way to directly refer to the root folder included as `.` above)
-		"terminal.integrated.cwd": "${workspaceFolder:src}/..",
-		"files.insertFinalNewline": true,
-		"files.trimTrailingWhitespace": true,
-		"cmake.buildDirectory": "${workspaceFolder}/build/release",
-		"cmake.generator": "Unix Makefiles",
-		"[markdown]": {
-			"rewrap.wrappingColumn": 70
-		},
-		"[lean4]": {
-			"editor.rulers": [
-				100
-			]
-		}
-	},
-	"tasks": {
-		"version": "2.0.0",
-		"tasks": [
-			{
-				"label": "build",
-				"type": "shell",
-				"command": "make -C build/release -j$(nproc 2>/dev/null || sysctl -n hw.logicalcpu 2>/dev/null || echo 4)",
-				"problemMatcher": [],
-				"group": {
-					"kind": "build",
-					"isDefault": true
-				}
-			},
-			{
-				"label": "build-old",
-				"type": "shell",
-				"command": "make -C build/release -j$(nproc 2>/dev/null || sysctl -n hw.logicalcpu 2>/dev/null || echo 4) LAKE_EXTRA_ARGS=--old",
-				"problemMatcher": [],
-				"group": {
-					"kind": "build"
-				}
-			},
-			{
-				"label": "test",
-				"type": "shell",
-				"command": "NPROC=$(nproc 2>/dev/null || sysctl -n hw.logicalcpu 2>/dev/null || echo 4); CTEST_OUTPUT_ON_FAILURE=1 make -C build/release test -j$NPROC ARGS=\"-j$NPROC\"",
-				"problemMatcher": [],
-				"group": {
-					"kind": "test",
-					"isDefault": true
-				}
-			}
-		]
-	},
-	"extensions": {
-		"recommendations": [
-			"leanprover.lean4"
-		]
-	}
-}
--- a/releases_drafts/environment.md
+++ b/releases_drafts/environment.md
@@ -1,6 +0,0 @@
-**Breaking Changes**
-
-* The functions `Lean.Environment.importModules` and `Lean.Environment.finalizeImport` have been extended with a new parameter `loadExts : Bool := false` that enables environment extension state loading.
-  Their previous behavior corresponds to setting the flag to `true` but is only safe to do in combination with `enableInitializersExecution`; see also the `importModules` docstring.
-  The new default value `false` ensures the functions can be used correctly multiple times within the same process when environment extension access is not needed.
-  The wrapper function `Lean.Environment.withImportModules` now always calls `importModules` with `loadExts := false` as it is incompatible with extension loading.
--- a/releases_drafts/module-system.md
+++ b/releases_drafts/module-system.md
@@ -1,54 +0,0 @@
-This release introduces the Lean module system, which allows files to
-control the visibility of their contents for other files. In previous
-releases, this feature was available as a preview when the option
-`experimental.module` was set to `true`; it is now a fully supported
-feature of Lean.
-
-# Benefits
-
-Because modules reduce the amount of information exposed to other
-code, they speed up rebuilds because irrelevant changes can be
-ignored, they make it possible to be deliberate about API evolution by
-hiding details that may change from clients, they help proofs be
-checked faster by avoiding accidentally unfolding definitions, and
-they lead to smaller executable files through improved dead code
-elimination.
-
-# Visibility
-
-A source file is a module if it begins with the `module` keyword.  By
-default, declarations in a module are private; the `public` modifier
-exports them. Proofs of theorems and bodies of definitions are private
-by default even when their signatures are public; the bodies of
-definitions can be made public by adding the `@[expose]`
-attribute. Theorems and opaque constants never expose their bodies.
-
-`public section` and `@[expose] section` change the default visibility
-of declarations in the section.
-
-# Imports
-
-Modules may only import other modules. By default, `import` adds the
-public information of the imported module to the private scope of the
-current module. Adding the `public` modifier to an import places the
-imported modules's public information in the public scope of the
-current module, exposing it in turn to the current module's clients.
-
-Within a package, `import all` can be used to import another module's
-private scope into the current module; this can be used to separate
-lemmas or tests from definition modules without exposing details to
-downstream clients.
-
-# Meta Code
-
-Code used in metaprograms must be marked `meta`. This ensures that the
-code is compiled and available for execution when it is needed during
-elaboration. Meta code may only reference other meta code. A whole
-module can be made available in the meta phase using `meta import`;
-this allows code to be shared across phases by importing the module in
-each phase. Code that is reachable from public metaprograms must be
-imported via `public meta import`, while local metaprograms can use
-plain `meta import` for their dependencies.
-
-
-The module system is described in detail in [the Lean language reference](https://lean-reference-manual-review.netlify.app/find/?domain=Verso.Genre.Manual.section&name=files).
--- a/script/benchReelabRss.lean
+++ b/script/benchReelabRss.lean
@@ -83,7 +83,7 @@ def main (args : List String) : IO Unit := do
      lastRSS? := some rss

    let avgRSSDelta := totalRSSDelta / (n - 2)
-    IO.println s!"avg-reelab-rss-delta: {avgRSSDelta}"
+    IO.println s!"measurement: avg-reelab-rss-delta {avgRSSDelta*1024} b"

    let _ ← Ipc.collectDiagnostics requestNo uri versionNo
    (← Ipc.stdin).writeLspMessage (Message.notification "exit" none)
--- a/script/benchReelabWatchdogRss.lean
+++ b/script/benchReelabWatchdogRss.lean
@@ -82,7 +82,7 @@ def main (args : List String) : IO Unit := do
      lastRSS? := some rss

    let avgRSSDelta := totalRSSDelta / (n - 2)
-    IO.println s!"avg-reelab-rss-delta: {avgRSSDelta}"
+    IO.println s!"measurement: avg-reelab-rss-delta {avgRSSDelta*1024} b"

    let _ ← Ipc.collectDiagnostics requestNo uri versionNo
    Ipc.shutdown requestNo
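
Note on the two `benchReelab*Rss.lean` hunks above: the output line changes from `avg-reelab-rss-delta: <value>` to `measurement: avg-reelab-rss-delta <value> b`, with the value multiplied by 1024. A consumer of the new format might split it as in the minimal sketch below. The function name `parse_measurement` is hypothetical, and the reading of the fields as `name value unit` (with `b` meaning bytes) is an assumption drawn only from the format string in the diff, not from any documented benchmark protocol.

```python
def parse_measurement(line):
    """Split a 'measurement: <name> <value> <unit>' line into its fields.

    Returns None for lines that do not carry the 'measurement: ' prefix,
    such as the old 'avg-reelab-rss-delta: <value>' format.
    """
    prefix = "measurement: "
    if not line.startswith(prefix):
        return None
    # Assumption: exactly three whitespace-separated fields follow the prefix.
    name, value, unit = line[len(prefix):].split()
    return name, int(value), unit


new_style = parse_measurement("measurement: avg-reelab-rss-delta 2048 b")
old_style = parse_measurement("avg-reelab-rss-delta: 2")
```

Returning `None` for unprefixed lines lets a caller skip unrelated output without raising.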
--- a/script/fmt
+++ b/script/fmt
@@ -9,5 +9,5 @@ find -regex '.*/CMakeLists\.txt\(\.in\)?\|.*\.cmake\(\.in\)?' \
  ! -path "./stage0/*" \
  -exec \
    uvx gersemi --in-place --line-length 120 --indent 2 \
-    --definitions src/cmake/Modules/ src/CMakeLists.txt \
+    --definitions src/cmake/Modules/ src/CMakeLists.txt tests/CMakeLists.txt \
    -- {} +
--- a/script/gen_constants_cpp.py
+++ b/script/gen_constants_cpp.py
@@ -1,4 +1,4 @@
-#!/usr/bin/env python
+#!/usr/bin/env python3
 # -*- coding: utf-8 -*-
 #
 # Copyright (c) 2015 Microsoft Corporation. All rights reserved.
--- a/script/gen_tokens_cpp.py
+++ b/script/gen_tokens_cpp.py
@@ -1,4 +1,4 @@
-#!/usr/bin/env python
+#!/usr/bin/env python3
 # -*- coding: utf-8 -*-
 #
 # Copyright (c) 2015 Microsoft Corporation. All rights reserved.
--- a/script/lean-toolchain
+++ b/script/lean-toolchain
@@ -1 +1 @@
-lean4
+../build/release/stage1
--- a/script/lean_profile.sh
+++ b/script/lean_profile.sh
@@ -1,4 +1,4 @@
-#!/bin/bash
+#!/usr/bin/env bash
 # Profile a Lean binary with demangled names.
 #
 # Usage:
--- a/script/lib/update-stage0
+++ b/script/lib/update-stage0
@@ -1,7 +1,7 @@
 #!/usr/bin/env bash
 set -euo pipefail

-rm -r stage0 || true
+rm -rf stage0 || true
 # don't copy untracked files
 # `:!` is git glob flavor for exclude patterns
 for f in $(git ls-files src ':!:src/lake/*' ':!:src/Leanc.lean'); do
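
Note on the `script/profiler/lean_demangle.py` change that follows: it replaces the in-Python demangler with a wrapper that keeps one child process alive and exchanges one line per request over stdin/stdout. The pattern can be sketched in isolation as below. This is an illustration under stated assumptions, not the code from the PR: the child here is a stand-in Python one-liner that upper-cases its input (a hypothetical placeholder for `lean --run lean_demangle_cli.lean`), and the `LineWorker` class name is invented for the example.

```python
import subprocess
import sys

# Stand-in child process: echoes each input line upper-cased and flushes
# after every reply. Hypothetical placeholder for the real Lean CLI tool.
_CHILD = (
    "import sys\n"
    "for line in sys.stdin:\n"
    "    print(line.rstrip('\\n').upper(), flush=True)\n"
)


class LineWorker:
    """One persistent subprocess answering one output line per input line."""

    def __init__(self):
        self._proc = subprocess.Popen(
            [sys.executable, "-u", "-c", _CHILD],
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
            text=True,
            bufsize=1,  # line buffered on the parent side
        )

    def ask(self, line):
        # Write one request line, flush, then block on exactly one reply line.
        self._proc.stdin.write(line + "\n")
        self._proc.stdin.flush()
        return self._proc.stdout.readline().rstrip("\n")

    def close(self):
        # Closing stdin signals EOF, which lets the child exit on its own.
        self._proc.stdin.close()
        self._proc.wait(timeout=5)


worker = LineWorker()
replies = [worker.ask(s) for s in ("l_Foo", "l_Bar")]
worker.close()
```

The strict one-line-in, one-line-out protocol is what makes the blocking `readline()` safe: if the child buffered its output or emitted several lines per request, the parent would stall or fall out of step with its requests.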
--- a/script/profiler/lean_demangle.py
+++ b/script/profiler/lean_demangle.py
@@ -1,9 +1,11 @@
 #!/usr/bin/env python3
 """
-Lean name demangler.
+Lean name demangler — thin wrapper around the Lean CLI tool.

-Demangles C symbol names produced by the Lean 4 compiler back into
-readable Lean hierarchical names.
+Spawns ``lean --run lean_demangle_cli.lean`` as a persistent subprocess
+and communicates via stdin/stdout pipes. This ensures a single source
+of truth for demangling logic (the Lean implementation in
+``Lean.Compiler.NameDemangling``).

 Usage as a filter (like c++filt):
    echo "l_Lean_Meta_Sym_main" | python lean_demangle.py
@@ -13,767 +15,68 @@ Usage as a module:
    print(demangle_lean_name("l_Lean_Meta_Sym_main"))
 """

+import atexit
+import os
+import subprocess
 import sys

-
-# ---------------------------------------------------------------------------
-# String.mangle / unmangle
-# ---------------------------------------------------------------------------
-
-def _is_ascii_alnum(ch):
-    """Check if ch is an ASCII letter or digit (matching Lean's isAlpha/isDigit)."""
-    return ('a' <= ch <= 'z') or ('A' <= ch <= 'Z') or ('0' <= ch <= '9')
-
-
-def mangle_string(s):
-    """Port of Lean's String.mangle: escape a single string for C identifiers."""
-    result = []
-    for ch in s:
-        if _is_ascii_alnum(ch):
-            result.append(ch)
-        elif ch == '_':
-            result.append('__')
-        else:
-            code = ord(ch)
-            if code < 0x100:
-                result.append('_x' + format(code, '02x'))
-            elif code < 0x10000:
-                result.append('_u' + format(code, '04x'))
-            else:
-                result.append('_U' + format(code, '08x'))
-    return ''.join(result)
-
-
-def _parse_hex(s, pos, n):
-    """Parse n lowercase hex digits at pos. Returns (new_pos, value) or None."""
-    if pos + n > len(s):
-        return None
-    val = 0
-    for i in range(n):
-        c = s[pos + i]
-        if '0' <= c <= '9':
-            val = (val << 4) | (ord(c) - ord('0'))
-        elif 'a' <= c <= 'f':
-            val = (val << 4) | (ord(c) - ord('a') + 10)
-        else:
-            return None
-    return (pos + n, val)
-
-
-# ---------------------------------------------------------------------------
-# Name mangling (for round-trip verification)
-# ---------------------------------------------------------------------------
-
-def _check_disambiguation(m):
-    """Port of Lean's checkDisambiguation: does mangled string m need a '00' prefix?"""
-    pos = 0
-    while pos < len(m):
-        ch = m[pos]
-        if ch == '_':
-            pos += 1
-            continue
-        if ch == 'x':
-            return _parse_hex(m, pos + 1, 2) is not None
-        if ch == 'u':
-            return _parse_hex(m, pos + 1, 4) is not None
-        if ch == 'U':
-            return _parse_hex(m, pos + 1, 8) is not None
-        if '0' <= ch <= '9':
-            return True
-        return False
-    # all underscores or empty
-    return True
-
-
-def _need_disambiguation(prev_component, mangled_next):
-    """Port of Lean's needDisambiguation."""
-    # Check if previous component (as a string) ends with '_'
-    prev_ends_underscore = (isinstance(prev_component, str) and
-                            len(prev_component) > 0 and
-                            prev_component[-1] == '_')
-    return prev_ends_underscore or _check_disambiguation(mangled_next)
-
-
-def mangle_name(components, prefix="l_"):
-    """
-    Mangle a list of name components (str or int) into a C symbol.
-    Port of Lean's Name.mangle.
-    """
-    if not components:
-        return prefix
-
-    parts = []
-    prev = None
-    for i, comp in enumerate(components):
-        if isinstance(comp, int):
-            if i == 0:
-                parts.append(str(comp) + '_')
-            else:
-                parts.append('_' + str(comp) + '_')
-        else:
-            m = mangle_string(comp)
-            if i == 0:
-                if _check_disambiguation(m):
-                    parts.append('00' + m)
-                else:
-                    parts.append(m)
-            else:
-                if _need_disambiguation(prev, m):
-                    parts.append('_00' + m)
-                else:
-                    parts.append('_' + m)
-        prev = comp
-
-    return prefix + ''.join(parts)
-
-
-# ---------------------------------------------------------------------------
-# Name demangling
-# ---------------------------------------------------------------------------
-
-def demangle_body(s):
-    """
-    Demangle a string produced by Name.mangleAux (without prefix).
-    Returns a list of components (str or int).
-
-    This is a faithful port of Lean's Name.demangleAux from NameMangling.lean.
-    """
-    components = []
-    length = len(s)
-
-    def emit(comp):
-        components.append(comp)
-
-    def decode_num(pos, n):
-        """Parse remaining digits, emit numeric component, continue."""
-        while pos < length:
-            ch = s[pos]
-            if '0' <= ch <= '9':
-                n = n * 10 + (ord(ch) - ord('0'))
-                pos += 1
-            else:
-                # Expect '_' (trailing underscore of numeric encoding)
-                pos += 1  # skip '_'
-                emit(n)
-                if pos >= length:
-                    return pos
-                # Skip separator '_' and go to name_start
-                pos += 1
-                return name_start(pos)
-        # End of string
-        emit(n)
-        return pos
-
-    def name_start(pos):
-        """Start parsing a new name component."""
-        if pos >= length:
-            return pos
-        ch = s[pos]
-        pos += 1
-        if '0' <= ch <= '9':
-            # Check for '00' disambiguation
-            if ch == '0' and pos < length and s[pos] == '0':
-                pos += 1
-                return demangle_main(pos, "", 0)
-            else:
-                return decode_num(pos, ord(ch) - ord('0'))
-        elif ch == '_':
-            return demangle_main(pos, "", 1)
-        else:
-            return demangle_main(pos, ch, 0)
-
-    def demangle_main(pos, acc, ucount):
-        """Main demangling loop."""
-        while pos < length:
-            ch = s[pos]
-            pos += 1
-
-            if ch == '_':
-                ucount += 1
-                continue
-
-            if ucount % 2 == 0:
-                # Even underscores: literal underscores in component name
-                acc += '_' * (ucount // 2) + ch
-                ucount = 0
-                continue
-
-            # Odd ucount: separator or escape
-            if '0' <= ch <= '9':
-                # End current str component, start number
-                emit(acc + '_' * (ucount // 2))
-                if ch == '0' and pos < length and s[pos] == '0':
-                    pos += 1
-                    return demangle_main(pos, "", 0)
-                else:
-                    return decode_num(pos, ord(ch) - ord('0'))
-
-            # Try hex escapes
-            if ch == 'x':
-                result = _parse_hex(s, pos, 2)
-                if result is not None:
-                    new_pos, val = result
-                    acc += '_' * (ucount // 2) + chr(val)
-                    pos = new_pos
-                    ucount = 0
-                    continue
-
-            if ch == 'u':
-                result = _parse_hex(s, pos, 4)
-                if result is not None:
-                    new_pos, val = result
-                    acc += '_' * (ucount // 2) + chr(val)
-                    pos = new_pos
-                    ucount = 0
-                    continue
-
-            if ch == 'U':
-                result = _parse_hex(s, pos, 8)
-                if result is not None:
-                    new_pos, val = result
-                    acc += '_' * (ucount // 2) + chr(val)
-                    pos = new_pos
-                    ucount = 0
-                    continue
-
-            # Name separator
-            emit(acc)
-            acc = '_' * (ucount // 2) + ch
-            ucount = 0
-
-        # End of string
-        acc += '_' * (ucount // 2)
-        if acc:
-            emit(acc)
-        return pos
-
-    name_start(0)
-    return components
-
-
-# ---------------------------------------------------------------------------
-# Prefix handling for lp_ (package prefix)
-# ---------------------------------------------------------------------------
-
-def _is_valid_string_mangle(s):
-    """Check if s is a valid output of String.mangle (no trailing bare _)."""
-    pos = 0
-    length = len(s)
-    while pos < length:
-        ch = s[pos]
-        if _is_ascii_alnum(ch):
-            pos += 1
-        elif ch == '_':
-            if pos + 1 >= length:
-                return False  # trailing bare _
-            nch = s[pos + 1]
-            if nch == '_':
-                pos += 2
-            elif nch == 'x' and _parse_hex(s, pos + 2, 2) is not None:
-                pos = _parse_hex(s, pos + 2, 2)[0]
-            elif nch == 'u' and _parse_hex(s, pos + 2, 4) is not None:
-                pos = _parse_hex(s, pos + 2, 4)[0]
-            elif nch == 'U' and _parse_hex(s, pos + 2, 8) is not None:
-                pos = _parse_hex(s, pos + 2, 8)[0]
-            else:
-                return False
-        else:
-            return False
-    return True
-
-
-def _skip_string_mangle(s, pos):
-    """
-    Skip past a String.mangle output in s starting at pos.
-    Returns the position after the mangled string (where we expect the separator '_').
-    This is a greedy scan.
-    """
-    length = len(s)
-    while pos < length:
-        ch = s[pos]
-        if _is_ascii_alnum(ch):
-            pos += 1
-        elif ch == '_':
-            if pos + 1 < length:
-                nch = s[pos + 1]
-                if nch == '_':
-                    pos += 2
-                elif nch == 'x' and _parse_hex(s, pos + 2, 2) is not None:
-                    pos = _parse_hex(s, pos + 2, 2)[0]
-                elif nch == 'u' and _parse_hex(s, pos + 2, 4) is not None:
-                    pos = _parse_hex(s, pos + 2, 4)[0]
-                elif nch == 'U' and _parse_hex(s, pos + 2, 8) is not None:
-                    pos = _parse_hex(s, pos + 2, 8)[0]
-                else:
-                    return pos  # bare '_': separator
-            else:
-                return pos
-        else:
-            return pos
-    return pos
-
-
-def _find_lp_body(s):
-    """
-    Given s = everything after 'lp_' in a symbol, find where the declaration
-    body (Name.mangleAux output) starts.
-    Returns the start index of the body within s, or None.
-
-    Strategy: try all candidate split points where the package part is a valid
-    String.mangle output and the body round-trips. Prefer the longest valid
-    package name (most specific match).
-    """
-    length = len(s)
-
-    # Collect candidate split positions: every '_' that could be the separator
-    candidates = []
-    pos = 0
-    while pos < length:
-        if s[pos] == '_':
-            candidates.append(pos)
-        pos += 1
-
-    # Try each candidate; collect all valid splits
-    valid_splits = []
-    for split_pos in candidates:
-        pkg_part = s[:split_pos]
-        if not pkg_part:
-            continue
-        if not _is_valid_string_mangle(pkg_part):
-            continue
-        body = s[split_pos + 1:]
-        if not body:
-            continue
-        components = demangle_body(body)
-        if not components:
-            continue
-        remangled = mangle_name(components, prefix="")
-        if remangled == body:
-            first = components[0]
-            # Score: prefer first component starting with uppercase
-            has_upper = isinstance(first, str) and first and first[0].isupper()
-            valid_splits.append((split_pos, has_upper))
-
-    if valid_splits:
-        # Among splits where first decl component starts uppercase, pick longest pkg.
-        # Otherwise pick shortest pkg.
-        upper_splits = [s for s in valid_splits if s[1]]
-        if upper_splits:
-            best = max(upper_splits, key=lambda x: x[0])
-        else:
-            best = min(valid_splits, key=lambda x: x[0])
-        return best[0] + 1
-
-    # Fallback: greedy String.mangle scan
-    greedy_pos = _skip_string_mangle(s, 0)
-    if greedy_pos < length and s[greedy_pos] == '_':
-        return greedy_pos + 1
-
-    return None
-
-
-# ---------------------------------------------------------------------------
-# Format name components for display
-# ---------------------------------------------------------------------------
-
-def format_name(components):
-    """Format a list of name components as a dot-separated string."""
-    return '.'.join(str(c) for c in components)
-
-
-# ---------------------------------------------------------------------------
-# Human-friendly postprocessing
-# ---------------------------------------------------------------------------
-
-# Compiler-generated suffix components — exact match
-_SUFFIX_FLAGS_EXACT = {
-    '_redArg':  'arity\u2193',
-    '_boxed':   'boxed',
-    '_impl':    'impl',
-}
-
-# Compiler-generated suffix prefixes — match with optional _N index
-# e.g., _lam, _lam_0, _lam_3, _lambda_0, _closed_2
-_SUFFIX_FLAGS_PREFIX = {
-    '_lam':     '\u03bb',
-    '_lambda':  '\u03bb',
-    '_elam':    '\u03bb',
-    '_jp':      'jp',
-    '_closed':  'closed',
-}
-
-
-def _match_suffix(component):
-    """
-    Check if a string component is a compiler-generated suffix.
-    Returns the flag label or None.
-
-    Handles both exact matches (_redArg, _boxed) and indexed suffixes
-    (_lam_0, _lambda_2, _closed_0) produced by appendIndexAfter.
-    """
-    if not isinstance(component, str):
-        return None
-    if component in _SUFFIX_FLAGS_EXACT:
-        return _SUFFIX_FLAGS_EXACT[component]
-    if component in _SUFFIX_FLAGS_PREFIX:
-        return _SUFFIX_FLAGS_PREFIX[component]
-    # Check for indexed suffix: prefix + _N
-    for prefix, label in _SUFFIX_FLAGS_PREFIX.items():
-        if component.startswith(prefix + '_'):
-            rest = component[len(prefix) + 1:]
-            if rest.isdigit():
-                return label
-    return None
-
-
-def _strip_private(components):
-    """Strip _private.Module.0. prefix. Returns (stripped_parts, is_private)."""
-    if (len(components) >= 3 and isinstance(components[0], str) and
-            components[0] == '_private'):
-        for i in range(1, len(components)):
-            if components[i] == 0:
-                if i + 1 < len(components):
-                    return components[i + 1:], True
-                break
-    return components, False
-
-
-def _strip_spec_suffixes(components):
-    """Strip trailing spec_N components (from appendIndexAfter)."""
-    parts = list(components)
-    while parts and isinstance(parts[-1], str) and parts[-1].startswith('spec_'):
-        rest = parts[-1][5:]
-        if rest.isdigit():
-            parts.pop()
-        else:
-            break
-    return parts
-
-
-def _is_spec_index(component):
-    """Check if a component is a spec_N index (from appendIndexAfter)."""
-    return (isinstance(component, str) and
-            component.startswith('spec_') and component[5:].isdigit())
-
-
-def _parse_spec_entries(rest):
-    """Parse _at_..._spec pairs into separate spec context entries.
-
-    Given components starting from the first _at_, returns:
-    - entries: list of component lists, one per _at_..._spec block
-    - remaining: components after the last _spec N (trailing suffixes)
-    """
-    entries = []
-    current_ctx = None
-    remaining = []
-    skip_next = False
-
-    for p in rest:
-        if skip_next:
-            skip_next = False
-            continue
-        if isinstance(p, str) and p == '_at_':
-            if current_ctx is not None:
-                entries.append(current_ctx)
-            current_ctx = []
-            continue
-        if isinstance(p, str) and p == '_spec':
-            if current_ctx is not None:
-                entries.append(current_ctx)
-                current_ctx = None
-            skip_next = True
-            continue
-        if isinstance(p, str) and p.startswith('_spec'):
-            if current_ctx is not None:
-                entries.append(current_ctx)
-                current_ctx = None
-            continue
-        if current_ctx is not None:
-            current_ctx.append(p)
-        else:
-            remaining.append(p)
-
-    if current_ctx is not None:
-        entries.append(current_ctx)
-
-    return entries, remaining
-
-
-def _process_spec_context(components):
-    """Process a spec context into a clean name and its flags.
-
-    Returns (name_parts, flags) where name_parts are the cleaned components
-    and flags is a deduplicated list of flag labels from compiler suffixes.
-    """
-    parts = list(components)
-    parts, _ = _strip_private(parts)
-
-    name_parts = []
-    ctx_flags = []
-    seen = set()
-
-    for p in parts:
-        flag = _match_suffix(p)
-        if flag is not None:
-            if flag not in seen:
-                ctx_flags.append(flag)
-                seen.add(flag)
-        elif _is_spec_index(p):
-            pass
-        else:
-            name_parts.append(p)
-
-    return name_parts, ctx_flags
-
-
-def postprocess_name(components):
-    """
-    Transform raw demangled components into a human-friendly display string.
-
-    Applies:
-    - Private name cleanup: _private.Module.0.Name.foo -> Name.foo [private]
-    - Hygienic name cleanup: strips _@.module._hygCtx._hyg.N
-    - Suffix folding: _redArg, _boxed, _lam_0, etc. -> [flags]
-    - Specialization: f._at_.g._spec.N -> f spec at g
-      Shown after base [flags], with context flags: spec at g[ctx_flags]
-    """
-    if not components:
-        return ""
-
-    parts = list(components)
-    flags = []
-    spec_entries = []
-
-    # --- Strip _private prefix ---
-    parts, is_private = _strip_private(parts)
-
-    # --- Strip hygienic suffixes: everything from _@ onward ---
-    at_idx = None
-    for i, p in enumerate(parts):
-        if isinstance(p, str) and p.startswith('_@'):
-            at_idx = i
-            break
-    if at_idx is not None:
-        parts = parts[:at_idx]
-
-    # --- Handle specialization: _at_ ... _spec N ---
-    at_positions = [i for i, p in enumerate(parts)
-                    if isinstance(p, str) and p == '_at_']
-    if at_positions:
-        first_at = at_positions[0]
-        base = parts[:first_at]
-        rest = parts[first_at:]
-
-        entries, remaining = _parse_spec_entries(rest)
-        for ctx_components in entries:
-            ctx_name, ctx_flags = _process_spec_context(ctx_components)
-            if ctx_name or ctx_flags:
-                spec_entries.append((ctx_name, ctx_flags))
-
-        parts = base + remaining
-
-    # --- Collect suffix flags from the end ---
-    while parts:
-        last = parts[-1]
-        flag = _match_suffix(last)
-        if flag is not None:
-            flags.append(flag)
-            parts.pop()
-        elif isinstance(last, int) and len(parts) >= 2:
-            prev_flag = _match_suffix(parts[-2])
-            if prev_flag is not None:
-                flags.append(prev_flag)
-                parts.pop()  # remove the number
-                parts.pop()  # remove the suffix
-            else:
-                break
-        else:
-            break
-
-    if is_private:
-        flags.append('private')
-
-    # --- Format result ---
-    name = '.'.join(str(c) for c in parts) if parts else '?'
-    result = name
-    if flags:
-        flag_str = ', '.join(flags)
-        result += f' [{flag_str}]'
-
-    for ctx_name, ctx_flags in spec_entries:
-        ctx_str = '.'.join(str(c) for c in ctx_name) if ctx_name else '?'
-        if ctx_flags:
-            ctx_flag_str = ', '.join(ctx_flags)
-            result += f' spec at {ctx_str}[{ctx_flag_str}]'
-        else:
-            result += f' spec at {ctx_str}'
-
-    return result
-
-
-# ---------------------------------------------------------------------------
-# Main demangling entry point
-# ---------------------------------------------------------------------------
-
-def demangle_lean_name_raw(mangled):
-    """
-    Demangle a Lean C symbol, preserving all internal name components.
-
-    Returns the exact demangled name with all compiler-generated suffixes
-    intact. Use demangle_lean_name() for human-friendly output.
-    """
-    try:
-        return _demangle_lean_name_inner(mangled, human_friendly=False)
-    except Exception:
-        return mangled
+_process = None
+_script_dir = os.path.dirname(os.path.abspath(__file__))
+_cli_script = os.path.join(_script_dir, "lean_demangle_cli.lean")
+
+
+def _get_process():
+    """Get or create the persistent Lean demangler subprocess."""
+    global _process
+    if _process is not None and _process.poll() is None:
+        return _process
+
+    lean = os.environ.get("LEAN", "lean")
+    _process = subprocess.Popen(
+        [lean, "--run", _cli_script],
+        stdin=subprocess.PIPE,
+        stdout=subprocess.PIPE,
+        stderr=subprocess.DEVNULL,
+        text=True,
+        bufsize=1,  # line buffered
+    )
+    atexit.register(_cleanup)
+    return _process
+
+
+def _cleanup():
+    global _process
+    if _process is not None:
+        try:
+            _process.stdin.close()
+            _process.wait(timeout=5)
+        except Exception:
+            _process.kill()
+        _process = None


 def demangle_lean_name(mangled):
    """
    Demangle a C symbol name produced by the Lean 4 compiler.

-    Returns a human-friendly demangled name with compiler suffixes folded
-    into readable flags. Use demangle_lean_name_raw() to preserve all
-    internal components.
+    Returns a human-friendly demangled name, or the original string
+    if it is not a Lean symbol.
    """
    try:
-        return _demangle_lean_name_inner(mangled, human_friendly=True)
+        proc = _get_process()
+        proc.stdin.write(mangled + "\n")
+        proc.stdin.flush()
+        result = proc.stdout.readline().rstrip("\n")
+        return result if result else mangled
    except Exception:
        return mangled


-def _demangle_lean_name_inner(mangled, human_friendly=True):
-    """Inner demangle that may raise on malformed input."""
-
-    if mangled == "_lean_main":
-        return "[lean] main"
-
-    # Handle lean_ runtime functions
-    if human_friendly and mangled.startswith("lean_apply_"):
-        rest = mangled[11:]
-        if rest.isdigit():
-            return f"<apply/{rest}>"
-
-    # Strip .cold.N suffix (LLVM linker cold function clones)
-    cold_suffix = ""
-    core = mangled
-    dot_pos = core.find('.cold.')
-    if dot_pos >= 0:
-        cold_suffix = " " + core[dot_pos:]
-        core = core[:dot_pos]
-    elif core.endswith('.cold'):
-        cold_suffix = " .cold"
-        core = core[:-5]
-
-    result = _demangle_core(core, human_friendly)
-    if result is None:
-        return mangled
-    return result + cold_suffix
-
-
-def _demangle_core(mangled, human_friendly=True):
-    """Demangle a symbol without .cold suffix. Returns None if not a Lean name."""
-
-    fmt = postprocess_name if human_friendly else format_name
-
-    # _init_ prefix
-    if mangled.startswith("_init_"):
-        rest = mangled[6:]
-        body, pkg_display = _strip_lean_prefix(rest)
-        if body is None:
-            return None
-        components = demangle_body(body)
-        if not components:
-            return None
-        name = fmt(components)
-        if pkg_display:
-            return f"[init] {name} ({pkg_display})"
-        return f"[init] {name}"
-
-    # initialize_ prefix (module init functions)
-    if mangled.startswith("initialize_"):
-        rest = mangled[11:]
-        # With package: initialize_lp_{pkg}_{body} or initialize_l_{body}
-        body, pkg_display = _strip_lean_prefix(rest)
-        if body is not None:
-            components = demangle_body(body)
-            if components:
-                name = fmt(components)
-                if pkg_display:
-                    return f"[module_init] {name} ({pkg_display})"
-                return f"[module_init] {name}"
-        # Without package: initialize_{Name.mangleAux(moduleName)}
-        if rest:
-            components = demangle_body(rest)
-            if components:
-                return f"[module_init] {fmt(components)}"
-        return None
-
-    # l_ or lp_ prefix
-    body, pkg_display = _strip_lean_prefix(mangled)
-    if body is None:
-        return None
-    components = demangle_body(body)
-    if not components:
-        return None
-    name = fmt(components)
-    if pkg_display:
-        return f"{name} ({pkg_display})"
-    return name
-
-
-def _strip_lean_prefix(s):
-    """
-    Strip the l_ or lp_ prefix from a mangled symbol.
-    Returns (body, pkg_display) where body is the Name.mangleAux output
-    and pkg_display is None or a string describing the package.
-    Returns (None, None) if the string doesn't have a recognized prefix.
-    """
-    if s.startswith("l_"):
-        return (s[2:], None)
-
-    if s.startswith("lp_"):
-        after_lp = s[3:]
-        body_start = _find_lp_body(after_lp)
-        if body_start is not None:
-            pkg_mangled = after_lp[:body_start - 1]
-            # Unmangle the package name
-            pkg_components = demangle_body(pkg_mangled)
-            if pkg_components and len(pkg_components) == 1 and isinstance(pkg_components[0], str):
-                pkg_display = pkg_components[0]
-            else:
-                pkg_display = pkg_mangled
-            return (after_lp[body_start:], pkg_display)
-        # Fallback: treat everything after lp_ as body
-        return (after_lp, "?")
-
-    return (None, None)
-
-
-# ---------------------------------------------------------------------------
-# CLI
-# ---------------------------------------------------------------------------
-
 def main():
-    """Filter stdin or arguments, demangling Lean names."""
-    import argparse
-    parser = argparse.ArgumentParser(
-        description="Demangle Lean 4 C symbol names (like c++filt for Lean)")
-    parser.add_argument('names', nargs='*',
-                        help='Names to demangle (reads stdin if none given)')
-    parser.add_argument('--raw', action='store_true',
-                        help='Output exact demangled names without postprocessing')
-    args = parser.parse_args()
-
-    demangle = demangle_lean_name_raw if args.raw else demangle_lean_name
-
-    if args.names:
-        for name in args.names:
-            print(demangle(name))
-    else:
-        for line in sys.stdin:
-            print(demangle(line.rstrip('\n')))
+    """Filter stdin, demangling Lean names."""
+    for line in sys.stdin:
+        print(demangle_lean_name(line.rstrip("\n")))


-if __name__ == '__main__':
+if __name__ == "__main__":
    main()
--- a/script/profiler/lean_demangle_cli.lean
+++ b/script/profiler/lean_demangle_cli.lean
@@ -0,0 +1,32 @@
+/-
+Copyright (c) 2026 Lean FRO, LLC. All rights reserved.
+Released under Apache 2.0 license as described in the file LICENSE.
+Authors: Kim Morrison
+-/
+module
+
+import Lean.Compiler.NameDemangling
+
+/-!
+Lean name demangler CLI tool. Reads mangled symbol names from stdin (one per
+line) and writes demangled names to stdout. Non-Lean symbols pass through
+unchanged. Like `c++filt` but for Lean names.
+
+Usage:
+    echo "l_Lean_Meta_foo" | lean --run lean_demangle_cli.lean
+    cat symbols.txt | lean --run lean_demangle_cli.lean
+-/
+
+open Lean.Name.Demangle
+
+def main : IO Unit := do
+  let stdin ← IO.getStdin
+  let stdout ← IO.getStdout
+  repeat do
+    let line ← stdin.getLine
+    if line.isEmpty then break
+    let sym := line.trimRight
+    match demangleSymbol sym with
+    | some s => stdout.putStrLn s
+    | none => stdout.putStrLn sym
+    stdout.flush
--- a/script/profiler/test_demangle.py
+++ b/script/profiler/test_demangle.py
@@ -1,670 +0,0 @@
-#!/usr/bin/env python3
-"""Tests for the Lean name demangler."""
-
-import unittest
-import json
-import gzip
-import tempfile
-import os
-
-from lean_demangle import (
-    mangle_string, mangle_name, demangle_body, format_name,
-    demangle_lean_name, demangle_lean_name_raw, postprocess_name,
-    _parse_hex, _check_disambiguation,
-)
-
-
-class TestStringMangle(unittest.TestCase):
-    """Test String.mangle (character-level escaping)."""
-
-    def test_alphanumeric(self):
-        self.assertEqual(mangle_string("hello"), "hello")
-        self.assertEqual(mangle_string("abc123"), "abc123")
-
-    def test_underscore(self):
-        self.assertEqual(mangle_string("a_b"), "a__b")
-        self.assertEqual(mangle_string("_"), "__")
-        self.assertEqual(mangle_string("__"), "____")
-
-    def test_special_chars(self):
-        self.assertEqual(mangle_string("."), "_x2e")
-        self.assertEqual(mangle_string("a.b"), "a_x2eb")
-
-    def test_unicode(self):
-        self.assertEqual(mangle_string("\u03bb"), "_u03bb")
-        self.assertEqual(mangle_string("\U0001d55c"), "_U0001d55c")
-
-    def test_empty(self):
-        self.assertEqual(mangle_string(""), "")
-
-
-class TestNameMangle(unittest.TestCase):
-    """Test Name.mangle (hierarchical name mangling)."""
-
-    def test_simple(self):
-        self.assertEqual(mangle_name(["Lean", "Meta", "Sym", "main"]),
-                         "l_Lean_Meta_Sym_main")
-
-    def test_single_component(self):
-        self.assertEqual(mangle_name(["main"]), "l_main")
-
-    def test_numeric_component(self):
-        self.assertEqual(
-            mangle_name(["_private", "Lean", "Meta", "Basic", 0,
-                         "Lean", "Meta", "withMVarContextImp"]),
-            "l___private_Lean_Meta_Basic_0__Lean_Meta_withMVarContextImp")
-
-    def test_component_with_underscore(self):
-        self.assertEqual(mangle_name(["a_b"]), "l_a__b")
-        self.assertEqual(mangle_name(["a_b", "c"]), "l_a__b_c")
-
-    def test_disambiguation_digit_start(self):
-        self.assertEqual(mangle_name(["0foo"]), "l_000foo")
-
-    def test_disambiguation_escape_start(self):
-        self.assertEqual(mangle_name(["a", "x27"]), "l_a_00x27")
-
-    def test_numeric_root(self):
-        self.assertEqual(mangle_name([42]), "l_42_")
-        self.assertEqual(mangle_name([42, "foo"]), "l_42__foo")
-
-    def test_component_ending_with_underscore(self):
-        self.assertEqual(mangle_name(["a_", "b"]), "l_a___00b")
-
-    def test_custom_prefix(self):
-        self.assertEqual(mangle_name(["foo"], prefix="lp_pkg_"),
-                         "lp_pkg_foo")
-
-
-class TestDemangleBody(unittest.TestCase):
-    """Test demangle_body (the core Name.demangleAux algorithm)."""
-
-    def test_simple(self):
-        self.assertEqual(demangle_body("Lean_Meta_Sym_main"),
-                         ["Lean", "Meta", "Sym", "main"])
-
-    def test_single(self):
-        self.assertEqual(demangle_body("main"), ["main"])
-
-    def test_empty(self):
-        self.assertEqual(demangle_body(""), [])
-
-    def test_underscore_in_component(self):
-        self.assertEqual(demangle_body("a__b"), ["a_b"])
-        self.assertEqual(demangle_body("a__b_c"), ["a_b", "c"])
-
-    def test_numeric_component(self):
-        self.assertEqual(demangle_body("foo_42__bar"), ["foo", 42, "bar"])
-
-    def test_numeric_root(self):
-        self.assertEqual(demangle_body("42_"), [42])
-
-    def test_numeric_at_end(self):
-        self.assertEqual(demangle_body("foo_42_"), ["foo", 42])
-
-    def test_disambiguation_00(self):
-        self.assertEqual(demangle_body("a_00x27"), ["a", "x27"])
-
-    def test_disambiguation_00_at_root(self):
-        self.assertEqual(demangle_body("000foo"), ["0foo"])
-
-    def test_hex_escape_x(self):
-        self.assertEqual(demangle_body("a_x2eb"), ["a.b"])
-
-    def test_hex_escape_u(self):
-        self.assertEqual(demangle_body("_u03bb"), ["\u03bb"])
-
-    def test_hex_escape_U(self):
-        self.assertEqual(demangle_body("_U0001d55c"), ["\U0001d55c"])
-
-    def test_private_name(self):
-        body = "__private_Lean_Meta_Basic_0__Lean_Meta_withMVarContextImp"
-        self.assertEqual(demangle_body(body),
-                         ["_private", "Lean", "Meta", "Basic", 0,
-                          "Lean", "Meta", "withMVarContextImp"])
-
-    def test_boxed_suffix(self):
-        body = "foo___boxed"
-        self.assertEqual(demangle_body(body), ["foo", "_boxed"])
-
-    def test_redArg_suffix(self):
-        body = "foo_bar___redArg"
-        self.assertEqual(demangle_body(body), ["foo", "bar", "_redArg"])
-
-    def test_component_ending_underscore_disambiguation(self):
-        self.assertEqual(demangle_body("a___00b"), ["a_", "b"])
-
-
-class TestRoundTrip(unittest.TestCase):
-    """Test that mangle(demangle(x)) == x for various names."""
-
-    def _check_roundtrip(self, components):
-        mangled = mangle_name(components, prefix="")
-        demangled = demangle_body(mangled)
-        self.assertEqual(demangled, components,
-                         f"Round-trip failed: {components} -> '{mangled}' -> {demangled}")
-        mangled_with_prefix = mangle_name(components, prefix="l_")
-        self.assertTrue(mangled_with_prefix.startswith("l_"))
-        body = mangled_with_prefix[2:]
-        demangled2 = demangle_body(body)
-        self.assertEqual(demangled2, components)
-
-    def test_simple_names(self):
-        self._check_roundtrip(["Lean", "Meta", "main"])
-        self._check_roundtrip(["a"])
-        self._check_roundtrip(["Foo", "Bar", "baz"])
-
-    def test_numeric(self):
-        self._check_roundtrip(["foo", 0, "bar"])
-        self._check_roundtrip([42])
-        self._check_roundtrip(["a", 1, "b", 2, "c"])
-
-    def test_underscores(self):
-        self._check_roundtrip(["_private"])
-        self._check_roundtrip(["a_b", "c_d"])
-        self._check_roundtrip(["_at_", "_spec"])
-
-    def test_private_name(self):
-        self._check_roundtrip(["_private", "Lean", "Meta", "Basic", 0,
-                                "Lean", "Meta", "withMVarContextImp"])
-
-    def test_boxed(self):
-        self._check_roundtrip(["Lean", "Meta", "foo", "_boxed"])
-
-    def test_redArg(self):
-        self._check_roundtrip(["Lean", "Meta", "foo", "_redArg"])
-
-    def test_specialization(self):
-        self._check_roundtrip(["List", "map", "_at_", "Foo", "bar", "_spec", 3])
-
-    def test_lambda(self):
-        self._check_roundtrip(["Foo", "bar", "_lambda", 0])
-        self._check_roundtrip(["Foo", "bar", "_lambda", 2])
-
-    def test_closed(self):
-        self._check_roundtrip(["myConst", "_closed", 0])
-
-    def test_special_chars(self):
-        self._check_roundtrip(["a.b"])
-        self._check_roundtrip(["\u03bb"])
-        self._check_roundtrip(["a", "b\u2192c"])
-
-    def test_disambiguation_cases(self):
-        self._check_roundtrip(["a", "x27"])
-        self._check_roundtrip(["0foo"])
-        self._check_roundtrip(["a_", "b"])
-
-    def test_complex_real_names(self):
-        """Names modeled after real Lean compiler output."""
-        self._check_roundtrip(
-            ["Lean", "MVarId", "withContext", "_at_",
-             "_private", "Lean", "Meta", "Sym", 0,
-             "Lean", "Meta", "Sym", "BackwardRule", "apply",
-             "_spec", 2, "_redArg", "_lambda", 0, "_boxed"])
-
-
-class TestDemangleRaw(unittest.TestCase):
-    """Test demangle_lean_name_raw (exact demangling, no postprocessing)."""
-
-    def test_l_prefix(self):
-        self.assertEqual(
-            demangle_lean_name_raw("l_Lean_Meta_Sym_main"),
-            "Lean.Meta.Sym.main")
-
-    def test_l_prefix_private(self):
-        result = demangle_lean_name_raw(
-            "l___private_Lean_Meta_Basic_0__Lean_Meta_withMVarContextImp")
-        self.assertEqual(result,
-                         "_private.Lean.Meta.Basic.0.Lean.Meta.withMVarContextImp")
-
-    def test_l_prefix_boxed(self):
-        result = demangle_lean_name_raw("l_foo___boxed")
-        self.assertEqual(result, "foo._boxed")
-
-    def test_l_prefix_redArg(self):
-        result = demangle_lean_name_raw(
-            "l___private_Lean_Meta_Basic_0__Lean_Meta_withMVarContextImp___redArg")
-        self.assertEqual(
-            result,
-            "_private.Lean.Meta.Basic.0.Lean.Meta.withMVarContextImp._redArg")
-
-    def test_lean_main(self):
-        self.assertEqual(demangle_lean_name_raw("_lean_main"), "[lean] main")
-
-    def test_non_lean_names(self):
-        self.assertEqual(demangle_lean_name_raw("printf"), "printf")
-        self.assertEqual(demangle_lean_name_raw("malloc"), "malloc")
-        self.assertEqual(demangle_lean_name_raw("lean_apply_5"), "lean_apply_5")
-        self.assertEqual(demangle_lean_name_raw(""), "")
-
-    def test_init_prefix(self):
-        result = demangle_lean_name_raw("_init_l_Lean_Meta_foo")
-        self.assertEqual(result, "[init] Lean.Meta.foo")
-
-    def test_lp_prefix_simple(self):
-        mangled = mangle_name(["Lean", "Meta", "foo"], prefix="lp_std_")
-        self.assertEqual(mangled, "lp_std_Lean_Meta_foo")
-        result = demangle_lean_name_raw(mangled)
-        self.assertEqual(result, "Lean.Meta.foo (std)")
-
-    def test_lp_prefix_underscore_pkg(self):
-        pkg_mangled = mangle_string("my_pkg")
-        self.assertEqual(pkg_mangled, "my__pkg")
-        mangled = mangle_name(["Lean", "Meta", "foo"],
-                              prefix=f"lp_{pkg_mangled}_")
-        self.assertEqual(mangled, "lp_my__pkg_Lean_Meta_foo")
-        result = demangle_lean_name_raw(mangled)
-        self.assertEqual(result, "Lean.Meta.foo (my_pkg)")
-
-    def test_lp_prefix_private_decl(self):
-        mangled = mangle_name(
-            ["_private", "X", 0, "Y", "foo"], prefix="lp_pkg_")
-        self.assertEqual(mangled, "lp_pkg___private_X_0__Y_foo")
-        result = demangle_lean_name_raw(mangled)
-        self.assertEqual(result, "_private.X.0.Y.foo (pkg)")
-
-    def test_complex_specialization(self):
-        components = [
-            "Lean", "MVarId", "withContext", "_at_",
-            "_private", "Lean", "Meta", "Sym", 0,
-            "Lean", "Meta", "Sym", "BackwardRule", "apply",
-            "_spec", 2, "_redArg", "_lambda", 0, "_boxed"
-        ]
-        mangled = mangle_name(components)
-        result = demangle_lean_name_raw(mangled)
-        expected = format_name(components)
-        self.assertEqual(result, expected)
-
-    def test_cold_suffix(self):
-        result = demangle_lean_name_raw("l_Lean_Meta_foo___redArg.cold.1")
-        self.assertEqual(result, "Lean.Meta.foo._redArg .cold.1")
-
-    def test_cold_suffix_plain(self):
-        result = demangle_lean_name_raw("l_Lean_Meta_foo.cold")
-        self.assertEqual(result, "Lean.Meta.foo .cold")
-
-    def test_initialize_no_pkg(self):
-        result = demangle_lean_name_raw("initialize_Init_Control_Basic")
-        self.assertEqual(result, "[module_init] Init.Control.Basic")
-
-    def test_initialize_with_l_prefix(self):
-        result = demangle_lean_name_raw("initialize_l_Lean_Meta_foo")
-        self.assertEqual(result, "[module_init] Lean.Meta.foo")
-
-    def test_never_crashes(self):
-        """Demangling should never raise, just return the original."""
-        weird_inputs = [
-            "", "l_", "lp_", "lp_x", "_init_", "initialize_",
-            "l_____", "lp____", "l_00", "l_0",
-            "some random string", "l_ space",
-        ]
-        for inp in weird_inputs:
-            result = demangle_lean_name_raw(inp)
-            self.assertIsInstance(result, str)
-
-
-class TestPostprocess(unittest.TestCase):
-    """Test postprocess_name (human-friendly suffix folding, etc.)."""
-
-    def test_no_change(self):
-        self.assertEqual(postprocess_name(["Lean", "Meta", "main"]),
-                         "Lean.Meta.main")
-
-    def test_boxed(self):
-        self.assertEqual(postprocess_name(["foo", "_boxed"]),
-                         "foo [boxed]")
-
-    def test_redArg(self):
-        self.assertEqual(postprocess_name(["foo", "bar", "_redArg"]),
-                         "foo.bar [arity\u2193]")
-
-    def test_lambda_separate(self):
-        # _lam as separate component + numeric index
-        self.assertEqual(postprocess_name(["foo", "_lam", 0]),
-                         "foo [\u03bb]")
-
-    def test_lambda_indexed(self):
-        # _lam_0 as single string (appendIndexAfter)
-        self.assertEqual(postprocess_name(["foo", "_lam_0"]),
-                         "foo [\u03bb]")
-        self.assertEqual(postprocess_name(["foo", "_lambda_2"]),
-                         "foo [\u03bb]")
-
-    def test_lambda_boxed(self):
-        # _lam_0 followed by _boxed
-        self.assertEqual(
-            postprocess_name(["Lean", "Meta", "Simp", "simpLambda",
-                              "_lam_0", "_boxed"]),
-            "Lean.Meta.Simp.simpLambda [boxed, \u03bb]")
-
-    def test_closed(self):
-        self.assertEqual(postprocess_name(["myConst", "_closed", 3]),
-                         "myConst [closed]")
-
-    def test_closed_indexed(self):
-        self.assertEqual(postprocess_name(["myConst", "_closed_0"]),
-                         "myConst [closed]")
-
-    def test_multiple_suffixes(self):
-        self.assertEqual(postprocess_name(["foo", "_redArg", "_boxed"]),
-                         "foo [boxed, arity\u2193]")
-
-    def test_redArg_lam(self):
-        # _redArg followed by _lam_0 (issue #4)
-        self.assertEqual(
-            postprocess_name(["Lean", "profileitIOUnsafe",
-                              "_redArg", "_lam_0"]),
-            "Lean.profileitIOUnsafe [\u03bb, arity\u2193]")
-
-    def test_private_name(self):
-        self.assertEqual(
-            postprocess_name(["_private", "Lean", "Meta", "Basic", 0,
-                              "Lean", "Meta", "withMVarContextImp"]),
-            "Lean.Meta.withMVarContextImp [private]")
-
-    def test_private_with_suffix(self):
-        self.assertEqual(
-            postprocess_name(["_private", "Lean", "Meta", "Basic", 0,
-                              "Lean", "Meta", "foo", "_redArg"]),
-            "Lean.Meta.foo [arity\u2193, private]")
-
-    def test_hygienic_strip(self):
-        self.assertEqual(
-            postprocess_name(["Lean", "Meta", "foo", "_@", "Lean", "Meta",
-                              "_hyg", 42]),
-            "Lean.Meta.foo")
-
-    def test_specialization(self):
-        self.assertEqual(
-            postprocess_name(["List", "map", "_at_", "Foo", "bar",
-                              "_spec", 3]),
-            "List.map spec at Foo.bar")
-
-    def test_specialization_with_suffix(self):
-        # Base suffix _boxed appears in [flags] before spec at
-        self.assertEqual(
-            postprocess_name(["Lean", "MVarId", "withContext", "_at_",
-                              "Foo", "bar", "_spec", 2, "_boxed"]),
-            "Lean.MVarId.withContext [boxed] spec at Foo.bar")
-
-    def test_spec_context_with_flags(self):
-        # Compiler suffixes in spec context become context flags
-        self.assertEqual(
-            postprocess_name(["Lean", "Meta", "foo", "_at_",
-                              "Lean", "Meta", "bar", "_elam_1", "_redArg",
-                              "_spec", 2]),
-            "Lean.Meta.foo spec at Lean.Meta.bar[\u03bb, arity\u2193]")
-
-    def test_spec_context_flags_dedup(self):
-        # Duplicate flag labels are deduplicated
-        self.assertEqual(
-            postprocess_name(["f", "_at_",
-                              "g", "_lam_0", "_elam_1", "_redArg",
-                              "_spec", 1]),
-            "f spec at g[\u03bb, arity\u2193]")
-
-    def test_multiple_at(self):
-        # Multiple _at_ entries become separate spec at clauses
-        self.assertEqual(
-            postprocess_name(["f", "_at_", "g", "_spec", 1,
-                              "_at_", "h", "_spec", 2]),
-            "f spec at g spec at h")
-
-    def test_multiple_at_with_flags(self):
-        # Multiple spec at with flags on base and contexts
-        self.assertEqual(
-            postprocess_name(["f", "_at_", "g", "_redArg", "_spec", 1,
-                              "_at_", "h", "_lam_0", "_spec", 2,
-                              "_boxed"]),
-            "f [boxed] spec at g[arity\u2193] spec at h[\u03bb]")
-
-    def test_base_flags_before_spec(self):
-        # Base trailing suffixes appear in [flags] before spec at
-        self.assertEqual(
-            postprocess_name(["f", "_at_", "g", "_spec", 1, "_lam_0"]),
-            "f [\u03bb] spec at g")
-
-    def test_spec_context_strip_spec_suffixes(self):
-        # spec_0 in context should be stripped
-        self.assertEqual(
-            postprocess_name(["Lean", "Meta", "transformWithCache", "visit",
-                              "_at_",
-                              "_private", "Lean", "Meta", "Transform", 0,
-                              "Lean", "Meta", "transform",
-                              "Lean", "Meta", "Sym", "unfoldReducible",
-                              "spec_0", "spec_0",
-                              "_spec", 1]),
-            "Lean.Meta.transformWithCache.visit "
-            "spec at Lean.Meta.transform.Lean.Meta.Sym.unfoldReducible")
-
-    def test_spec_context_strip_private(self):
-        # _private in spec context should be stripped
-        self.assertEqual(
-            postprocess_name(["Array", "mapMUnsafe", "map", "_at_",
-                              "_private", "Lean", "Meta", "Transform", 0,
-                              "Lean", "Meta", "transformWithCache", "visit",
-                              "_spec", 1]),
-            "Array.mapMUnsafe.map "
-            "spec at Lean.Meta.transformWithCache.visit")
-
-    def test_empty(self):
-        self.assertEqual(postprocess_name([]), "")
-
-
-class TestDemangleHumanFriendly(unittest.TestCase):
-    """Test demangle_lean_name (human-friendly output)."""
-
-    def test_simple(self):
-        self.assertEqual(demangle_lean_name("l_Lean_Meta_main"),
-                         "Lean.Meta.main")
-
-    def test_boxed(self):
-        self.assertEqual(demangle_lean_name("l_foo___boxed"),
-                         "foo [boxed]")
-
-    def test_redArg(self):
-        self.assertEqual(demangle_lean_name("l_foo___redArg"),
-                         "foo [arity\u2193]")
-
-    def test_private(self):
-        self.assertEqual(
-            demangle_lean_name(
-                "l___private_Lean_Meta_Basic_0__Lean_Meta_foo"),
-            "Lean.Meta.foo [private]")
-
-    def test_private_with_redArg(self):
-        self.assertEqual(
-            demangle_lean_name(
-                "l___private_Lean_Meta_Basic_0__Lean_Meta_foo___redArg"),
-            "Lean.Meta.foo [arity\u2193, private]")
-
-    def test_cold_with_suffix(self):
-        self.assertEqual(
-            demangle_lean_name("l_Lean_Meta_foo___redArg.cold.1"),
-            "Lean.Meta.foo [arity\u2193] .cold.1")
-
-    def test_lean_apply(self):
-        self.assertEqual(demangle_lean_name("lean_apply_5"), "<apply/5>")
-        self.assertEqual(demangle_lean_name("lean_apply_12"), "<apply/12>")
-
-    def test_lean_apply_raw_unchanged(self):
-        self.assertEqual(demangle_lean_name_raw("lean_apply_5"),
-                         "lean_apply_5")
-
-    def test_init_private(self):
-        self.assertEqual(
-            demangle_lean_name(
-                "_init_l___private_X_0__Y_foo"),
-            "[init] Y.foo [private]")
-
-    def test_complex_specialization(self):
-        components = [
-            "Lean", "MVarId", "withContext", "_at_",
-            "_private", "Lean", "Meta", "Sym", 0,
-            "Lean", "Meta", "Sym", "BackwardRule", "apply",
-            "_spec", 2, "_redArg", "_lambda", 0, "_boxed"
-        ]
-        mangled = mangle_name(components)
-        result = demangle_lean_name(mangled)
-        # Base: Lean.MVarId.withContext with trailing _redArg, _lambda 0, _boxed
-        # Spec context: Lean.Meta.Sym.BackwardRule.apply (private stripped)
-        self.assertEqual(
-            result,
-            "Lean.MVarId.withContext [boxed, \u03bb, arity\u2193] "
-            "spec at Lean.Meta.Sym.BackwardRule.apply")
-
-    def test_non_lean_unchanged(self):
-        self.assertEqual(demangle_lean_name("printf"), "printf")
-        self.assertEqual(demangle_lean_name("malloc"), "malloc")
-        self.assertEqual(demangle_lean_name(""), "")
-
-
-class TestDemangleProfile(unittest.TestCase):
-    """Test the profile rewriter."""
-
-    def _make_profile_shared(self, strings):
-        """Create a profile with shared.stringArray (newer format)."""
-        return {
-            "meta": {"version": 28},
-            "libs": [],
-            "shared": {
-                "stringArray": list(strings),
-            },
-            "threads": [{
-                "name": "main",
-                "pid": "1",
-                "tid": 1,
-                "funcTable": {
-                    "name": list(range(len(strings))),
-                    "isJS": [False] * len(strings),
-                    "relevantForJS": [False] * len(strings),
-                    "resource": [-1] * len(strings),
-                    "fileName": [None] * len(strings),
-                    "lineNumber": [None] * len(strings),
-                    "columnNumber": [None] * len(strings),
-                    "length": len(strings),
-                },
-                "frameTable": {"length": 0},
-                "stackTable": {"length": 0},
-                "samples": {"length": 0},
-                "markers": {"length": 0},
-                "resourceTable": {"length": 0},
-                "nativeSymbols": {"length": 0},
-            }],
-            "pages": [],
-            "counters": [],
-        }
-
-    def _make_profile_per_thread(self, strings):
-        """Create a profile with per-thread stringArray (samply format)."""
-        return {
-            "meta": {"version": 28},
-            "libs": [],
-            "threads": [{
-                "name": "main",
-                "pid": "1",
-                "tid": 1,
-                "stringArray": list(strings),
-                "funcTable": {
-                    "name": list(range(len(strings))),
-                    "isJS": [False] * len(strings),
-                    "relevantForJS": [False] * len(strings),
-                    "resource": [-1] * len(strings),
-                    "fileName": [None] * len(strings),
-                    "lineNumber": [None] * len(strings),
-                    "columnNumber": [None] * len(strings),
-                    "length": len(strings),
-                },
-                "frameTable": {"length": 0},
-                "stackTable": {"length": 0},
-                "samples": {"length": 0},
-                "markers": {"length": 0},
-                "resourceTable": {"length": 0},
-                "nativeSymbols": {"length": 0},
-            }],
-            "pages": [],
-            "counters": [],
-        }
-
-    def test_profile_rewrite_shared(self):
-        from lean_demangle_profile import rewrite_profile
-        strings = [
-            "l_Lean_Meta_Sym_main",
-            "printf",
-            "lean_apply_5",
-            "l___private_Lean_Meta_Basic_0__Lean_Meta_foo",
-        ]
-        profile = self._make_profile_shared(strings)
-        rewrite_profile(profile)
-        sa = profile["shared"]["stringArray"]
-        self.assertEqual(sa[0], "Lean.Meta.Sym.main")
-        self.assertEqual(sa[1], "printf")
-        self.assertEqual(sa[2], "<apply/5>")
-        self.assertEqual(sa[3], "Lean.Meta.foo [private]")
-
-    def test_profile_rewrite_per_thread(self):
-        from lean_demangle_profile import rewrite_profile
-        strings = [
-            "l_Lean_Meta_Sym_main",
-            "printf",
-            "lean_apply_5",
-            "l___private_Lean_Meta_Basic_0__Lean_Meta_foo",
-        ]
-        profile = self._make_profile_per_thread(strings)
-        count = rewrite_profile(profile)
-        sa = profile["threads"][0]["stringArray"]
-        self.assertEqual(sa[0], "Lean.Meta.Sym.main")
-        self.assertEqual(sa[1], "printf")
-        self.assertEqual(sa[2], "<apply/5>")
-        self.assertEqual(sa[3], "Lean.Meta.foo [private]")
-        self.assertEqual(count, 3)
-
-    def test_profile_json_roundtrip(self):
-        from lean_demangle_profile import process_profile_file
-        strings = ["l_Lean_Meta_main", "malloc"]
-        profile = self._make_profile_shared(strings)
-
-        with tempfile.NamedTemporaryFile(mode='w', suffix='.json',
-                                         delete=False) as f:
-            json.dump(profile, f)
-            inpath = f.name
-
-        outpath = inpath.replace('.json', '-demangled.json')
-        try:
-            process_profile_file(inpath, outpath)
-            with open(outpath) as f:
-                result = json.load(f)
-            self.assertEqual(result["shared"]["stringArray"][0],
-                             "Lean.Meta.main")
-            self.assertEqual(result["shared"]["stringArray"][1], "malloc")
-        finally:
-            os.unlink(inpath)
-            if os.path.exists(outpath):
-                os.unlink(outpath)
-
-    def test_profile_gzip_roundtrip(self):
-        from lean_demangle_profile import process_profile_file
-        strings = ["l_Lean_Meta_main", "malloc"]
-        profile = self._make_profile_shared(strings)
-
-        with tempfile.NamedTemporaryFile(suffix='.json.gz',
-                                         delete=False) as f:
-            with gzip.open(f, 'wt') as gz:
-                json.dump(profile, gz)
-            inpath = f.name
-
-        outpath = inpath.replace('.json.gz', '-demangled.json.gz')
-        try:
-            process_profile_file(inpath, outpath)
-            with gzip.open(outpath, 'rt') as f:
-                result = json.load(f)
-            self.assertEqual(result["shared"]["stringArray"][0],
-                             "Lean.Meta.main")
-        finally:
-            os.unlink(inpath)
-            if os.path.exists(outpath):
-                os.unlink(outpath)
-
-
-if __name__ == '__main__':
-    unittest.main()
--- a/script/release_checklist.py
+++ b/script/release_checklist.py
@@ -11,7 +11,7 @@ IMPORTANT: Keep this documentation up-to-date when modifying the script's behavi
 What this script does:
 1. Validates preliminary Lean4 release infrastructure:
   - Checks that the release branch (releases/vX.Y.0) exists
-   - Verifies CMake version settings are correct
+   - Verifies CMake version settings are correct (both src/ and stage0/)
   - Confirms the release tag exists
   - Validates the release page exists on GitHub (created automatically by CI after tag push)
   - Checks the release notes page on lean-lang.org (updated while bumping the `reference-manual` repository)
@@ -326,6 +326,42 @@ def check_cmake_version(repo_url, branch, version_major, version_minor, github_t
    print(f"  ✅ CMake version settings are correct in {cmake_file_path}")
    return True

+def check_stage0_version(repo_url, branch, version_major, version_minor, github_token):
+    """Verify that stage0/src/CMakeLists.txt has the same version as src/CMakeLists.txt.
+
+    The stage0 pre-built binaries stamp .olean headers with their baked-in version.
+    If stage0 has a different version (e.g. from a 'begin development cycle' bump),
+    the release tarball will contain .olean files with the wrong version.
+    """
+    stage0_cmake = "stage0/src/CMakeLists.txt"
+    content = get_branch_content(repo_url, branch, stage0_cmake, github_token)
+    if content is None:
+        print(f"  ❌ Could not retrieve {stage0_cmake} from {branch}")
+        return False
+
+    errors = []
+    for line in content.splitlines():
+        stripped = line.strip()
+        if stripped.startswith("set(LEAN_VERSION_MAJOR "):
+            actual = stripped.split()[1].rstrip(")")  # 2nd token; tolerates a trailing `CACHE STRING ""`
+            if actual != str(version_major):
+                errors.append(f"LEAN_VERSION_MAJOR: expected {version_major}, found {actual}")
+        elif stripped.startswith("set(LEAN_VERSION_MINOR "):
+            actual = stripped.split()[1].rstrip(")")  # 2nd token; tolerates a trailing `CACHE STRING ""`
+            if actual != str(version_minor):
+                errors.append(f"LEAN_VERSION_MINOR: expected {version_minor}, found {actual}")
+
+    if errors:
+        print(f"  ❌ stage0 version mismatch in {stage0_cmake}:")
+        for error in errors:
+            print(f"     {error}")
+        print("     The stage0 compiler stamps .olean headers with its baked-in version.")
+        print("     Run `make update-stage0` to rebuild stage0 with the correct version.")
+        return False
+
+    print(f"  ✅ stage0 version matches in {stage0_cmake}")
+    return True
+
 def extract_org_repo_from_url(repo_url):
    """Extract the 'org/repo' part from a GitHub URL."""
    if repo_url.startswith("https://github.com/"):
@@ -441,7 +477,10 @@ def get_pr_ci_status(repo_url, pr_number, github_token):
    conclusions = [run['conclusion'] for run in check_runs if run.get('status') == 'completed']
    in_progress = [run for run in check_runs if run.get('status') in ['queued', 'in_progress']]

+    failed = sum(1 for c in conclusions if c in ['failure', 'timed_out', 'action_required'])
    if in_progress:
+        if failed > 0:
+            return "failure", f"{failed} check(s) failing, {len(in_progress)} still in progress"
        return "pending", f"{len(in_progress)} check(s) in progress"

    if not conclusions:
@@ -450,7 +489,6 @@ def get_pr_ci_status(repo_url, pr_number, github_token):
    if all(c == 'success' for c in conclusions):
        return "success", f"All {len(conclusions)} checks passed"

-    failed = sum(1 for c in conclusions if c in ['failure', 'timed_out', 'action_required'])
    if failed > 0:
        return "failure", f"{failed} check(s) failed"

@@ -680,6 +718,9 @@ def main():
        # Check CMake version settings
        if not check_cmake_version(lean_repo_url, branch_name, version_major, version_minor, github_token):
            lean4_success = False
+        # Check that stage0 version matches (stage0 stamps .olean headers with its version)
+        if not check_stage0_version(lean_repo_url, branch_name, version_major, version_minor, github_token):
+            lean4_success = False

    # Check for tag and release page
    if not tag_exists(lean_repo_url, toolchain, github_token):
@@ -924,8 +965,8 @@ def main():
            
            print(f"  ✅ Bump branch {bump_branch} exists")
            
-            # For batteries and mathlib4, update the lean-toolchain to the latest nightly
-            if branch_created and name in ["batteries", "mathlib4"]:
+            # Update the lean-toolchain to the latest nightly for newly created bump branches
+            if branch_created:
                latest_nightly = get_latest_nightly_tag(github_token)
                if latest_nightly:
                    nightly_toolchain = f"leanprover/lean4:{latest_nightly}"
@@ -965,14 +1006,15 @@ def main():
        # Find the actual minor version in CMakeLists.txt
        for line in cmake_lines:
            if line.strip().startswith("set(LEAN_VERSION_MINOR "):
-                actual_minor = int(line.split()[-1].rstrip(")"))
+                m = re.search(r'set\(LEAN_VERSION_MINOR\s+(\d+)', line)
+                actual_minor = int(m.group(1)) if m else 0
                version_minor_correct = actual_minor >= next_minor
                break
        else:
            version_minor_correct = False
            
        is_release_correct = any(
-            l.strip().startswith("set(LEAN_VERSION_IS_RELEASE 0)") 
+            re.match(r'set\(LEAN_VERSION_IS_RELEASE\s+0[\s)]', l.strip())
            for l in cmake_lines
        )
        
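Note on the version parsing touched above: once `src/CMakeLists.txt` declares the version as `set(LEAN_VERSION_MINOR 30 CACHE STRING "")`, taking the last whitespace-separated token no longer yields the number. The sketch below illustrates the difference on both line shapes. `parse_cmake_int` is a hypothetical helper written for this note, not a function in `release_checklist.py`, and the regex is only modeled on the one added in this hunk.

```python
import re

# Hypothetical helper (not part of release_checklist.py): extract the integer
# value from a CMake line of the form `set(NAME VALUE ...)`.
def parse_cmake_int(line, name):
    m = re.match(r'\s*set\(\s*' + re.escape(name) + r'\s+(\d+)', line)
    return int(m.group(1)) if m else None

old_style = 'set(LEAN_VERSION_MINOR 29)'
new_style = 'set(LEAN_VERSION_MINOR 30 CACHE STRING "")'

# The regex reads the value regardless of what follows it.
print(parse_cmake_int(old_style, "LEAN_VERSION_MINOR"))
print(parse_cmake_int(new_style, "LEAN_VERSION_MINOR"))

# The last-token approach picks up the empty docstring on the new form.
print(repr(new_style.split()[-1].rstrip(")")))
```

Returning `None` on a non-matching line (rather than a default number) leaves the caller to decide how a missing setting should be reported.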
--- a/script/release_repos.yml
+++ b/script/release_repos.yml
@@ -65,13 +65,6 @@ repositories:
    branch: master
    dependencies: [lean4-unicode-basic]

-  - name: doc-gen4
-    url: https://github.com/leanprover/doc-gen4
-    toolchain-tag: true
-    stable-branch: false
-    branch: main
-    dependencies: [lean4-cli, BibtexQuery]
-
  - name: reference-manual
    url: https://github.com/leanprover/reference-manual
    toolchain-tag: true
@@ -84,8 +77,7 @@ repositories:
    toolchain-tag: false
    stable-branch: false
    branch: main
-    dependencies:
-      - batteries
+    dependencies: []

  - name: aesop
    url: https://github.com/leanprover-community/aesop
@@ -107,10 +99,16 @@ repositories:
      - lean4checker
      - batteries
      - lean4-cli
-      - doc-gen4
      - import-graph
      - plausible

+  - name: doc-gen4
+    url: https://github.com/leanprover/doc-gen4
+    toolchain-tag: true
+    stable-branch: false
+    branch: main
+    dependencies: [lean4-cli, BibtexQuery, mathlib4]
+
  - name: cslib
    url: https://github.com/leanprover/cslib
    toolchain-tag: true
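Note on the reordering above: `doc-gen4` now lists `mathlib4` as a dependency, so it has to be handled after `mathlib4`. The sketch below shows how a dependency-respecting order can be derived from such a `dependencies` mapping. It is only an illustration: the mapping is a small hand-written subset (the `mathlib4` entry is an assumption made for the example), and whether the release scripts rely on file order or compute an order this way is not stated in this diff.

```python
from graphlib import TopologicalSorter

# Illustrative subset of a name -> dependencies mapping. Only the doc-gen4
# entry is copied from the config change; the rest is assumed for the example.
deps = {
    "lean4-cli": [],
    "BibtexQuery": [],
    "batteries": [],
    "mathlib4": ["batteries", "lean4-cli"],
    "doc-gen4": ["lean4-cli", "BibtexQuery", "mathlib4"],
}

# TopologicalSorter takes node -> predecessors, so every dependency is
# emitted before the repositories that depend on it.
order = list(TopologicalSorter(deps).static_order())
print(order)
```

With this input, `mathlib4` always appears before `doc-gen4` in `order`, whichever valid ordering of the remaining entries is produced.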
--- a/script/release_steps.py
+++ b/script/release_steps.py
@@ -479,6 +479,25 @@ def execute_release_steps(repo, version, config):
        print(blue("Updating lakefile.toml..."))
        run_command(f'perl -pi -e \'s/"v4\\.[0-9]+(\\.[0-9]+)?(-rc[0-9]+)?"/"' + version + '"/g\' lakefile.*', cwd=repo_path)
        run_command("lake update", cwd=repo_path, stream_output=True)
+    elif repo_name == "verso":
+        # verso has nested Lake projects in test-projects/ that each have their own
+        # lake-manifest.json with a subverso pin. After updating the root manifest via
+        # `lake update`, sync the de-modulized subverso rev into all sub-manifests.
+        # The sub-projects use an old toolchain (v4.21.0) that doesn't support module/prelude
+        # syntax, so they need the de-modulized version (tagged no-modules/<root-rev>).
+        # The "SubVerso version consistency" CI check accepts either the root or de-modulized rev.
+        run_command("lake update", cwd=repo_path, stream_output=True)
+        print(blue("Syncing de-modulized subverso rev to test-project sub-manifests..."))
+        sync_script = (
+            'ROOT_REV=$(jq -r \'.packages[] | select(.name == "subverso") | .rev\' lake-manifest.json); '
+            'SUBVERSO_URL=$(jq -r \'.packages[] | select(.name == "subverso") | .url\' lake-manifest.json); '
+            'DEMOD_REV=$(git ls-remote "$SUBVERSO_URL" "refs/tags/no-modules/$ROOT_REV" | awk \'{print $1}\'); '
+            'find test-projects -name lake-manifest.json -print0 | while IFS= read -r -d \'\' f; do '
+            'jq --arg rev "$DEMOD_REV" \'.packages |= map(if .name == "subverso" then .rev = $rev else . end)\' "$f" > /tmp/lm_tmp.json && mv /tmp/lm_tmp.json "$f"; '
+            'done'
+        )
+        run_command(sync_script, cwd=repo_path)
+        print(green("Synced de-modulized subverso rev to all test-project sub-manifests"))
    elif dependencies:
        run_command(f'perl -pi -e \'s/"v4\\.[0-9]+(\\.[0-9]+)?(-rc[0-9]+)?"/"' + version + '"/g\' lakefile.*', cwd=repo_path)
        run_command("lake update", cwd=repo_path, stream_output=True)
--- a/src/CMakeLists.txt
+++ b/src/CMakeLists.txt
@@ -1,6 +1,4 @@
-cmake_minimum_required(VERSION 3.10)
-cmake_policy(SET CMP0054 NEW)
-cmake_policy(SET CMP0110 NEW)
+cmake_minimum_required(VERSION 3.21)
 if(NOT CMAKE_GENERATOR MATCHES "Unix Makefiles")
  message(FATAL_ERROR "The only supported CMake generator at the moment is 'Unix Makefiles'")
 endif()
@@ -9,11 +7,17 @@ if(NOT DEFINED STAGE)
 endif()
 include(ExternalProject)
 project(LEAN CXX C)
-set(LEAN_VERSION_MAJOR 4)
-set(LEAN_VERSION_MINOR 29)
-set(LEAN_VERSION_PATCH 0)
-set(LEAN_VERSION_IS_RELEASE 0) # This number is 1 in the release revision, and 0 otherwise.
+set(LEAN_VERSION_MAJOR 4 CACHE STRING "")
+set(LEAN_VERSION_MINOR 30 CACHE STRING "")
+set(LEAN_VERSION_PATCH 0 CACHE STRING "")
+set(LEAN_VERSION_IS_RELEASE 0 CACHE STRING "") # This number is 1 in the release revision, and 0 otherwise.
 set(LEAN_SPECIAL_VERSION_DESC "" CACHE STRING "Additional version description like 'nightly-2018-03-11'")
+# project(LEAN) above implicitly creates empty LEAN_VERSION_{MAJOR,MINOR,PATCH}
+# normal variables (CMake sets <PROJECT>_VERSION_* for the project name). These
+# shadow the cache values. Remove them so ${VAR} falls through to the cache.
+unset(LEAN_VERSION_MAJOR)
+unset(LEAN_VERSION_MINOR)
+unset(LEAN_VERSION_PATCH)
 set(LEAN_VERSION_STRING "${LEAN_VERSION_MAJOR}.${LEAN_VERSION_MINOR}.${LEAN_VERSION_PATCH}")
 if(LEAN_SPECIAL_VERSION_DESC)
  string(APPEND LEAN_VERSION_STRING "-${LEAN_SPECIAL_VERSION_DESC}")
@@ -83,6 +87,8 @@ option(USE_GITHASH "GIT_HASH" ON)
 option(INSTALL_LICENSE "INSTALL_LICENSE" ON)
 # When ON we install a copy of cadical
 option(INSTALL_CADICAL "Install a copy of cadical" ON)
+# When ON we install a copy of leantar
+option(INSTALL_LEANTAR "Install a copy of leantar" ON)

 # FLAGS for disabling optimizations and debugging
 option(FREE_VAR_RANGE_OPT "FREE_VAR_RANGE_OPT" ON)
@@ -753,6 +759,14 @@ if(STAGE GREATER 0 AND CADICAL AND INSTALL_CADICAL)
  add_dependencies(leancpp copy-cadical)
 endif()

+if(STAGE GREATER 0 AND LEANTAR AND INSTALL_LEANTAR)
+  add_custom_target(
+    copy-leantar
+    COMMAND cmake -E copy_if_different "${LEANTAR}" "${CMAKE_BINARY_DIR}/bin/leantar${CMAKE_EXECUTABLE_SUFFIX}"
+  )
+  add_dependencies(leancpp copy-leantar)
+endif()
+
 # MSYS2 bash usually handles Windows paths relatively well, but not when putting them in the PATH
 string(REGEX REPLACE "^([a-zA-Z]):" "/\\1" LEAN_BIN "${CMAKE_BINARY_DIR}/bin")

@@ -909,6 +923,10 @@ if(STAGE GREATER 0 AND CADICAL AND INSTALL_CADICAL)
  install(PROGRAMS "${CADICAL}" DESTINATION bin)
 endif()

+if(STAGE GREATER 0 AND LEANTAR AND INSTALL_LEANTAR)
+  install(PROGRAMS "${LEANTAR}" DESTINATION bin)
+endif()
+
 add_custom_target(
  clean-stdlib
  COMMAND rm -rf "${CMAKE_BINARY_DIR}/lib" || true
--- a/src/Init.lean
+++ b/src/Init.lean
@@ -30,6 +30,7 @@ public import Init.Hints
 public import Init.Conv
 public import Init.Guard
 public import Init.Simproc
+public import Init.CbvSimproc
 public import Init.SizeOfLemmas
 public import Init.BinderPredicates
 public import Init.Ext
--- a/src/Init/CbvSimproc.lean
+++ b/src/Init/CbvSimproc.lean
@@ -0,0 +1,71 @@
+/-
+Copyright (c) 2026 Lean FRO, LLC. All rights reserved.
+Released under Apache 2.0 license as described in the file LICENSE.
+Authors: Wojciech Różowski
+-/
+module
+
+prelude
+public meta import Init.Data.ToString.Name  -- shake: keep (transitive public meta dep, fix)
+public import Init.Tactics
+import Init.Meta.Defs
+
+public section
+
+namespace Lean.Parser
+
+syntax cbvSimprocEval := "cbv_eval"
+
+/--
+A user-defined simplification procedure used by the `cbv` tactic.
+The body must have type `Lean.Meta.Sym.Simp.Simproc` (`Expr → SimpM Result`).
+Procedures are indexed by a discrimination tree pattern and fire at one of three phases:
+`↓` (pre), `cbv_eval` (eval), or `↑` (post, default).
+-/
+syntax (docComment)? attrKind "cbv_simproc " (Tactic.simpPre <|> Tactic.simpPost <|> cbvSimprocEval)? ident " (" term ")" " := " term : command
+
+/--
+A `cbv_simproc` declaration without automatically adding it to the cbv simproc set.
+To activate, use `attribute [cbv_simproc]`.
+-/
+syntax (docComment)? "cbv_simproc_decl " ident " (" term ")" " := " term : command
+
+syntax (docComment)? attrKind "builtin_cbv_simproc " (Tactic.simpPre <|> Tactic.simpPost <|> cbvSimprocEval)? ident " (" term ")" " := " term : command
+
+syntax (docComment)? "builtin_cbv_simproc_decl " ident " (" term ")" " := " term : command
+
+syntax (name := cbvSimprocPattern) "cbv_simproc_pattern% " term " => " ident : command
+
+syntax (name := cbvSimprocPatternBuiltin) "builtin_cbv_simproc_pattern% " term " => " ident : command
+
+namespace Attr
+
+syntax (name := cbvSimprocAttr) "cbv_simproc" (Tactic.simpPre <|> Tactic.simpPost <|> cbvSimprocEval)? : attr
+
+syntax (name := cbvSimprocBuiltinAttr) "builtin_cbv_simproc" (Tactic.simpPre <|> Tactic.simpPost <|> cbvSimprocEval)? : attr
+
+end Attr
+
+macro_rules
+  | `($[$doc?:docComment]? cbv_simproc_decl $n:ident ($pattern:term) := $body) => do
+    let simprocType := `Lean.Meta.Sym.Simp.Simproc
+    `($[$doc?:docComment]? meta def $n:ident : $(mkIdent simprocType) := $body
+      cbv_simproc_pattern% $pattern => $n)
+
+macro_rules
+  | `($[$doc?:docComment]? builtin_cbv_simproc_decl $n:ident ($pattern:term) := $body) => do
+    let simprocType := `Lean.Meta.Sym.Simp.Simproc
+    `($[$doc?:docComment]? def $n:ident : $(mkIdent simprocType) := $body
+      builtin_cbv_simproc_pattern% $pattern => $n)
+
+macro_rules
+  | `($[$doc?:docComment]? $kind:attrKind cbv_simproc $[$phase?]? $n:ident ($pattern:term) := $body) => do
+    `($[$doc?:docComment]? cbv_simproc_decl $n ($pattern) := $body
+      attribute [$kind cbv_simproc $[$phase?]?] $n)
+
+macro_rules
+  | `($[$doc?:docComment]? $kind:attrKind builtin_cbv_simproc $[$phase?]? $n:ident ($pattern:term) := $body) => do
+    `($[$doc?:docComment]? builtin_cbv_simproc_decl $n ($pattern) := $body
+      attribute [$kind builtin_cbv_simproc $[$phase?]?] $n)
+
+end Lean.Parser
--- a/src/Init/Classical.lean
+++ b/src/Init/Classical.lean
@@ -69,9 +69,11 @@ theorem em (p : Prop) : p ∨ ¬p :=
 theorem exists_true_of_nonempty {α : Sort u} : Nonempty α → ∃ _ : α, True
  | ⟨x⟩ => ⟨x, trivial⟩

+@[implicit_reducible]
 noncomputable def inhabited_of_nonempty {α : Sort u} (h : Nonempty α) : Inhabited α :=
  ⟨choice h⟩

+@[implicit_reducible]
 noncomputable def inhabited_of_exists {α : Sort u} {p : α → Prop} (h : ∃ x, p x) : Inhabited α :=
  inhabited_of_nonempty (Exists.elim h (fun w _ => ⟨w⟩))

@@ -81,6 +83,7 @@ noncomputable scoped instance (priority := low) propDecidable (a : Prop) : Decid
    | Or.inl h => ⟨isTrue h⟩
    | Or.inr h => ⟨isFalse h⟩

+@[implicit_reducible]
 noncomputable def decidableInhabited (a : Prop) : Inhabited (Decidable a) where
  default := inferInstance

@@ -142,7 +145,7 @@ is classically true but not constructively. -/

 /-- Transfer decidability of `¬ p` to decidability of `p`. -/
 -- This can not be an instance as it would be tried everywhere.
-@[instance_reducible]
+@[implicit_reducible]
 def decidable_of_decidable_not (p : Prop) [h : Decidable (¬ p)] : Decidable p :=
  match h with
  | isFalse h => isTrue (Classical.not_not.mp h)
--- a/src/Init/Control.lean
+++ b/src/Init/Control.lean
@@ -18,3 +18,4 @@ public import Init.Control.StateCps
 public import Init.Control.ExceptCps
 public import Init.Control.MonadAttach
 public import Init.Control.EState
+public import Init.Control.Do
--- a/src/Init/Control/Do.lean
+++ b/src/Init/Control/Do.lean
@@ -0,0 +1,63 @@
+/-
+Copyright (c) 2025 Lean FRO LLC. All rights reserved.
+Released under Apache 2.0 license as described in the file LICENSE.
+Authors: Sebastian Graf
+-/
+module
+
+prelude
+public import Init.Control.Except
+public import Init.Control.Option
+
+public section
+
+/-!
+This module provides specialized wrappers around `ExceptT` to support the `do` elaborator.
+
+Specifically, the types here are used to tunnel early `return`, `break` and `continue` through
+non-algebraic higher-order effect combinators such as `tryCatch`.
+-/
+
+/-- A wrapper around `ExceptT` signifying early return. -/
+@[expose]
+abbrev EarlyReturnT (ρ m α) := ExceptT ρ m α
+
+/-- Exit a computation by returning a value `r : ρ` early. -/
+@[always_inline, inline, expose]
+abbrev EarlyReturnT.return {ρ m α} [Monad m] (r : ρ) : EarlyReturnT ρ m α :=
+  throw r
+
+/-- A specialization of `Except.casesOn`. -/
+@[always_inline, inline, expose]
+abbrev EarlyReturn.runK {ρ α : Type u} {β : Type v} (x : Except ρ α) (ret : ρ → β) (pure : α → β) : β :=
+  x.casesOn ret pure
+
+/-- A wrapper around `OptionT` signifying `break` in a loop. -/
+@[expose]
+abbrev BreakT := OptionT
+
+/-- Exit a loop body via `break`. -/
+@[always_inline, inline, expose]
+abbrev BreakT.break {m : Type w → Type x} [Monad m] : BreakT m α := failure
+
+/-- A specialization of `Option.casesOn`. -/
+@[always_inline, inline, expose]
+abbrev Break.runK {α : Type u} {β : Type v} (x : Option α) (breakK : Unit → β) (successK : α → β) : β :=
+  -- Note: The matcher below is used in the elaborator targeting `forIn` loops.
+  -- If you change the order of match arms here, you may need to adjust the elaborator.
+  match x with
+  | some a => successK a
+  | none => breakK ()
+
+/-- A wrapper around `OptionT` signifying `continue` in a loop. -/
+@[expose]
+abbrev ContinueT := OptionT
+
+/-- Exit a loop body via `continue`. -/
+@[always_inline, inline, expose]
+abbrev ContinueT.continue {m : Type w → Type x} [Monad m] : ContinueT m α := failure
+
+/-- A specialization of `Option.casesOn`. -/
+@[always_inline, inline, expose]
+abbrev Continue.runK {α : Type u} {β : Type v} (x : Option α) (continueK : Unit → β) (successK : α → β) : β :=
+  x.casesOn continueK (fun a _ => successK a) ()
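Note on the `runK` helpers above: each one eliminates the `Option` produced by a loop body into one of two continuations. A minimal sketch of how they are expected to reduce, assuming the definitions exactly as added in this file (the concrete `Nat` values are only illustrative):

```lean
-- `some a` takes the success continuation.
example : Break.runK (some (1 : Nat)) (fun _ => (0 : Nat)) (fun a => a + 1) = 2 := rfl

-- `none` takes the `break` continuation.
example : Break.runK (none : Option Nat) (fun _ => (0 : Nat)) (fun a => a + 1) = 0 := rfl

-- `Continue.runK` has the same shape, with `none` signalling `continue`.
example : Continue.runK (none : Option Nat) (fun _ => (7 : Nat)) (fun a => a) = 7 := rfl
```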
--- a/src/Init/Control/Id.lean
+++ b/src/Init/Control/Id.lean
@@ -49,6 +49,7 @@ instance : Monad Id where
 /--
 The identity monad has a `bind` operator.
 -/
+@[implicit_reducible]
 def hasBind : Bind Id :=
  inferInstance

@@ -58,7 +59,7 @@ Runs a computation in the identity monad.
 This function is the identity function. Because its parameter has type `Id α`, it causes
 `do`-notation in its arguments to use the `Monad Id` instance.
 -/
-@[always_inline, inline, expose]
+@[always_inline, inline, expose, implicit_reducible]
 protected def run (x : Id α) : α := x

 instance [OfNat α n] : OfNat (Id α) n :=
@@ -79,3 +80,11 @@ instance : LawfulMonadAttach Id where
    exact x.run.2

 end Id
+
+/-- Turn a collection with a pure `ForIn` instance into an array. -/
+def ForIn.toArray {α : Type u} [inst : ForIn Id ρ α] (xs : ρ) : Array α :=
+  ForIn.forIn xs Array.empty (fun a acc => pure (.yield (acc.push a))) |> Id.run
+
+/-- Turn a collection with a pure `ForIn` instance into a list. -/
+def ForIn.toList {α : Type u} [ForIn Id ρ α] (xs : ρ) : List α :=
+  ForIn.toArray xs |>.toList
--- a/src/Init/Control/Lawful/Basic.lean
+++ b/src/Init/Control/Lawful/Basic.lean
@@ -254,8 +254,8 @@ instance : LawfulMonad Id := by
 @[simp, grind =] theorem run_bind (x : Id α) (f : α → Id β) : (x >>= f).run = (f x.run).run := rfl
 @[simp, grind =] theorem run_pure (a : α) : (pure a : Id α).run = a := rfl
 @[simp, grind =] theorem pure_run (a : Id α) : pure a.run = a := rfl
-@[simp] theorem run_seqRight (x y : Id α) : (x *> y).run = y.run := rfl
-@[simp] theorem run_seqLeft (x y : Id α) : (x <* y).run = x.run := rfl
+@[simp] theorem run_seqRight (x : Id α) (y : Id β) : (x *> y).run = y.run := rfl
+@[simp] theorem run_seqLeft (x : Id α) (y : Id β) : (x <* y).run = x.run := rfl
 @[simp] theorem run_seq (f : Id (α → β)) (x : Id α) : (f <*> x).run = f.run x.run := rfl

 end Id
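Note on the generalization above: with the two computations allowed to have different result types, the lemmas now apply to mixed-type sequencing. A minimal sketch, assuming the patched statements of `Id.run_seqLeft` and `Id.run_seqRight` (the choice of `Nat` and `String` is only illustrative):

```lean
-- Before this change both arguments had to share the type `Id α`,
-- so neither lemma could be applied to these goals.
example (x : Id Nat) (y : Id String) : (x <* y).run = x.run :=
  Id.run_seqLeft x y

example (x : Id Nat) (y : Id String) : (x *> y).run = y.run :=
  Id.run_seqRight x y
```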
--- a/src/Init/Control/Lawful/Instances.lean
+++ b/src/Init/Control/Lawful/Instances.lean
@@ -258,7 +258,6 @@ instance [Monad m] [LawfulMonad m] : LawfulMonad (OptionT m) where
  rw [← bind_pure_comp]
  rfl

-set_option backward.isDefEq.respectTransparency false in
 @[simp] theorem run_controlAt [Monad m] [LawfulMonad m] (f : ({β : Type u} → OptionT m β → m (stM m (OptionT m) β)) → m (stM m (OptionT m) α)) :
    OptionT.run (controlAt m f) = f fun x => x.run := by
  simp [controlAt, Option.elimM, Option.elim]
@@ -346,7 +345,6 @@ instance [Monad m] [LawfulMonad m] : LawfulMonad (ReaderT ρ m) where
    ReaderT.run (liftWith f) ctx = (f fun x => x.run ctx) :=
  rfl

-set_option backward.isDefEq.respectTransparency false in
 @[simp] theorem run_controlAt [Monad m] [LawfulMonad m] (f : ({β : Type u} → ReaderT ρ m β → m (stM m (ReaderT ρ m) β)) → m (stM m (ReaderT ρ m) α)) (ctx : ρ) :
    ReaderT.run (controlAt m f) ctx = f fun x => x.run ctx := by
  simp [controlAt]
@@ -441,13 +439,11 @@ instance [Monad m] [LawfulMonad m] : LawfulMonad (StateT σ m) where
 @[simp] theorem run_restoreM [Monad m] [LawfulMonad m] (x : stM m (StateT σ m) α) (s : σ) :
    StateT.run (restoreM x) s = pure x := by
  simp [restoreM, MonadControl.restoreM]
-  rfl

 @[simp] theorem run_liftWith [Monad m] [LawfulMonad m] (f : ({β : Type u} → StateT σ m β → m (stM m (StateT σ m) β)) → m α) (s : σ) :
    StateT.run (liftWith f) s = ((·, s) <$> f fun x => x.run s) := by
  simp [liftWith, MonadControl.liftWith, Function.comp_def]

-set_option backward.isDefEq.respectTransparency false in
 @[simp] theorem run_controlAt [Monad m] [LawfulMonad m] (f : ({β : Type u} → StateT σ m β → m (stM m (StateT σ m) β)) → m (stM m (StateT σ m) α)) (s : σ) :
    StateT.run (controlAt m f) s = f fun x => x.run s := by
  simp [controlAt]
--- a/src/Init/Control/MonadAttach.lean
+++ b/src/Init/Control/MonadAttach.lean
@@ -70,7 +70,7 @@ information to the return value, except a trivial proof of {name}`True`.
 This instance is used whenever no more useful {name}`MonadAttach` instance can be implemented.
 It always has a {name}`WeaklyLawfulMonadAttach`, but usually no {name}`LawfulMonadAttach` instance.
 -/
-@[expose, instance_reducible]
+@[expose, implicit_reducible]
 public protected def MonadAttach.trivial {m : Type u → Type v} [Monad m] : MonadAttach m where
  CanReturn _ _ := True
  attach x := (⟨·, .intro⟩) <$> x
--- a/src/Init/Conv.lean
+++ b/src/Init/Conv.lean
@@ -280,7 +280,7 @@ resulting in `t'`, which becomes the new target subgoal. -/
 syntax (name := convConvSeq) "conv" " => " convSeq : conv

 /-- `· conv` focuses on the main conv goal and tries to solve it using `s`. -/
-macro dot:patternIgnore("· " <|> ". ") s:convSeq : conv => `(conv| {%$dot ($s) })
+macro dot:unicode("· ", ". ") s:convSeq : conv => `(conv| {%$dot ($s) })


 /-- `fail_if_success t` fails if the tactic `t` succeeds. -/
--- a/src/Init/Core.lean
+++ b/src/Init/Core.lean
@@ -1339,10 +1339,10 @@ transitive and contains `r`. `TransGen r a z` if and only if there exists a sequ
 -/
 inductive Relation.TransGen {α : Sort u} (r : α → α → Prop) : α → α → Prop
  /-- If `r a b`, then `TransGen r a b`. This is the base case of the transitive closure. -/
-  | single {a b} : r a b → TransGen r a b
+  | single {a b : α} : r a b → TransGen r a b
  /-- If `TransGen r a b` and `r b c`, then `TransGen r a c`.
  This is the inductive case of the transitive closure. -/
-  | tail {a b c} : TransGen r a b → r b c → TransGen r a c
+  | tail {a b c : α} : TransGen r a b → r b c → TransGen r a c

 /-- The transitive closure is transitive. -/
 theorem Relation.TransGen.trans {α : Sort u} {r : α → α → Prop} {a b c} :
@@ -2593,3 +2593,11 @@ class Trichotomous (r : α → α → Prop) : Prop where
  trichotomous (a b : α) : ¬ r a b → ¬ r b a → a = b

 end Std
+
+@[simp] theorem flip_flip {α : Sort u} {β : Sort v} {φ : Sort w} {f : α → β → φ} :
+    flip (flip f) = f := by
+  apply funext
+  intro a
+  apply funext
+  intro b
+  rw [flip, flip]
--- a/src/Init/Data/Array.lean
+++ b/src/Init/Data/Array.lean
@@ -34,3 +34,4 @@ public import Init.Data.Array.MinMax
 public import Init.Data.Array.Nat
 public import Init.Data.Array.Int
 public import Init.Data.Array.Count
+public import Init.Data.Array.Sort
--- a/src/Init/Data/Array/Basic.lean
+++ b/src/Init/Data/Array/Basic.lean
@@ -6,6 +6,7 @@ Authors: Leonardo de Moura
 module

 prelude
+public import Init.Control.Do
 public import Init.GetElem
 public import Init.Data.List.ToArrayImpl
 import all Init.Data.List.ToArrayImpl
@@ -93,7 +94,7 @@ theorem ext' {xs ys : Array α} (h : xs.toList = ys.toList) : xs = ys := by

 @[simp, grind =] theorem getElem?_toList {xs : Array α} {i : Nat} : xs.toList[i]? = xs[i]? := by
  simp only [getElem?_def, getElem_toList]
-  simp only [Array.size]; rfl
+  simp only [Array.size]

 /-- `a ∈ as` is a predicate which asserts that `a` is in the array `as`. -/
 -- NB: This is defined as a structure rather than a plain def so that a lemma
@@ -147,6 +148,9 @@ end List

 namespace Array

+@[simp, grind =] theorem getElem!_toList [Inhabited α] {xs : Array α} {i : Nat} : xs.toList[i]! = xs[i]! := by
+  rw [List.getElem!_toArray]
+
 theorem size_eq_length_toList {xs : Array α} : xs.size = xs.toList.length := rfl

 /-! ### Externs -/
@@ -170,6 +174,15 @@ This avoids overhead due to unboxing a `Nat` used as an index.
 def uget (xs : @& Array α) (i : USize) (h : i.toNat < xs.size) : α :=
  xs[i.toNat]

+/--
+Version of `Array.uget` that does not increment the reference count of its result.
+
+This is only intended for direct use by the compiler.
+-/
+@[extern "lean_array_uget_borrowed"]
+unsafe opaque ugetBorrowed (xs : @& Array α) (i : USize) (h : i.toNat < xs.size) : α :=
+  xs.uget i h
+
 /--
 Low-level modification operator which is as fast as a C array write. The modification is performed
 in-place when the reference to the array is unique.
@@ -273,7 +286,7 @@ Examples:
 * `#[1, 2].isEmpty = false`
 * `#[()].isEmpty = false`
 -/
-@[expose]
+@[expose, inline]
 def isEmpty (xs : Array α) : Bool :=
  xs.size = 0

@@ -367,6 +380,7 @@ Returns the last element of an array, or panics if the array is empty.
 Safer alternatives include `Array.back`, which requires a proof the array is non-empty, and
 `Array.back?`, which returns an `Option`.
 -/
+@[inline]
 def back! [Inhabited α] (xs : Array α) : α :=
  xs[xs.size - 1]!

@@ -376,6 +390,7 @@ Returns the last element of an array, given a proof that the array is not empty.
 See `Array.back!` for the version that panics if the array is empty, or `Array.back?` for the
 version that returns an option.
 -/
+@[inline]
 def back (xs : Array α) (h : 0 < xs.size := by get_elem_tactic) : α :=
  xs[xs.size - 1]'(Nat.sub_one_lt_of_lt h)

@@ -385,6 +400,7 @@ Returns the last element of an array, or `none` if the array is empty.
 See `Array.back!` for the version that panics if the array is empty, or `Array.back` for the version
 that requires a proof the array is non-empty.
 -/
+@[inline]
 def back? (xs : Array α) : Option α :=
  xs[xs.size - 1]?

@@ -2135,7 +2151,4 @@ protected def repr {α : Type u} [Repr α] (xs : Array α) : Std.Format :=
 instance {α : Type u} [Repr α] : Repr (Array α) where
  reprPrec xs _ := Array.repr xs

-instance [ToString α] : ToString (Array α) where
-  toString xs := String.Internal.append "#" (toString xs.toList)
-
 end Array
--- a/src/Init/Data/Array/Find.lean
+++ b/src/Init/Data/Array/Find.lean
@@ -723,7 +723,6 @@ theorem findFinIdx?_eq_bind_find?_finIdxOf? [BEq α] [LawfulBEq α] {xs : Array
    xs.findFinIdx? p = (xs.find? p).bind (xs.finIdxOf? ·) := by
  cases xs
  simp [List.findFinIdx?_eq_bind_find?_finIdxOf?]
-  rfl

 theorem findIdx_eq_getD_bind_find?_idxOf? [BEq α] [LawfulBEq α] {xs : Array α} {p : α → Bool} :
    xs.findIdx p = ((xs.find? p).bind (xs.idxOf? ·)).getD xs.size := by
--- a/src/Init/Data/Array/Lemmas.lean
+++ b/src/Init/Data/Array/Lemmas.lean
@@ -72,6 +72,9 @@ theorem toArray_eq : List.toArray as = xs ↔ as = xs.toList := by

 /-! ### size -/

+theorem size_singleton {x : α} : #[x].size = 1 := by
+  simp
+
 theorem eq_empty_of_size_eq_zero (h : xs.size = 0) : xs = #[] := by
  cases xs
  simp_all
@@ -896,7 +899,7 @@ theorem all_push {xs : Array α} {a : α} {p : α → Bool} :
 @[simp] theorem getElem_set_ne {xs : Array α} {i : Nat} (h' : i < xs.size) {v : α} {j : Nat}
    (pj : j < xs.size) (h : i ≠ j) :
    (xs.set i v)[j]'(by simp [*]) = xs[j] := by
-  simp only [set, ← getElem_toList, List.getElem_set_ne h]; rfl
+  simp only [set, ← getElem_toList, List.getElem_set_ne h]

 @[simp] theorem getElem?_set_ne {xs : Array α} {i : Nat} (h : i < xs.size) {v : α} {j : Nat}
    (ne : i ≠ j) : (xs.set i v)[j]? = xs[j]? := by
@@ -2855,7 +2858,7 @@ theorem getElem?_extract {xs : Array α} {start stop : Nat} :
  · simp only [length_toList, size_extract, List.length_take, List.length_drop]
    omega
  · intro n h₁ h₂
-    simp; rfl
+    simp

 @[simp] theorem extract_size {xs : Array α} : xs.extract 0 xs.size = xs := by
  apply ext
@@ -3483,6 +3486,21 @@ theorem foldl_eq_foldr_reverse {xs : Array α} {f : β → α → β} {b} :
 theorem foldr_eq_foldl_reverse {xs : Array α} {f : α → β → β} {b} :
    xs.foldr f b = xs.reverse.foldl (fun x y => f y x) b := by simp

+theorem foldl_eq_apply_foldr {xs : Array α} {f : α → α → α}
+    [Std.Associative f] [Std.LawfulRightIdentity f init] :
+    xs.foldl f x = f x (xs.foldr f init) := by
+  simp [← foldl_toList, ← foldr_toList, List.foldl_eq_apply_foldr]
+
+theorem foldr_eq_apply_foldl {xs : Array α} {f : α → α → α}
+    [Std.Associative f] [Std.LawfulLeftIdentity f init] :
+    xs.foldr f x = f (xs.foldl f init) x := by
+  simp [← foldl_toList, ← foldr_toList, List.foldr_eq_apply_foldl]
+
+theorem foldr_eq_foldl {xs : Array α} {f : α → α → α}
+    [Std.Associative f] [Std.LawfulIdentity f init] :
+    xs.foldr f init = xs.foldl f init := by
+  simp [foldl_eq_apply_foldr, Std.LawfulLeftIdentity.left_id]
+
 @[simp] theorem foldr_push_eq_append {as : Array α} {bs : Array β} {f : α → β} (w : start = as.size) :
    as.foldr (fun a xs => Array.push xs (f a)) bs start 0 = bs ++ (as.map f).reverse := by
  subst w
@@ -4335,16 +4353,33 @@ def sum_eq_sum_toList := @sum_toList

 @[simp, grind =]
 theorem sum_append [Zero α] [Add α] [Std.Associative (α := α) (· + ·)]
-    [Std.LeftIdentity (α := α) (· + ·) 0] [Std.LawfulLeftIdentity (α := α) (· + ·) 0]
+    [Std.LawfulLeftIdentity (α := α) (· + ·) 0]
    {as₁ as₂ : Array α} : (as₁ ++ as₂).sum = as₁.sum + as₂.sum := by
  simp [← sum_toList, List.sum_append]

+@[simp, grind =]
+theorem sum_singleton [Add α] [Zero α] [Std.LawfulRightIdentity (· + ·) (0 : α)] {x : α} :
+    #[x].sum = x := by
+  simp [Array.sum_eq_foldr, Std.LawfulRightIdentity.right_id x]
+
+@[simp, grind =]
+theorem sum_push [Add α] [Zero α] [Std.Associative (α := α) (· + ·)]
+    [Std.LawfulIdentity (· + ·) (0 : α)] {xs : Array α} {x : α} :
+    (xs.push x).sum = xs.sum + x := by
+  simp [Array.sum_eq_foldr, Std.LawfulRightIdentity.right_id, Std.LawfulLeftIdentity.left_id,
+    ← Array.foldr_assoc]
+
 @[simp, grind =]
 theorem sum_reverse [Zero α] [Add α] [Std.Associative (α := α) (· + ·)]
    [Std.Commutative (α := α) (· + ·)]
    [Std.LawfulLeftIdentity (α := α) (· + ·) 0] (xs : Array α) : xs.reverse.sum = xs.sum := by
  simp [← sum_toList, List.sum_reverse]

+theorem sum_eq_foldl [Zero α] [Add α] [Std.Associative (α := α) (· + ·)]
+    [Std.LawfulIdentity (· + ·) (0 : α)] {xs : Array α} :
+    xs.sum = xs.foldl (init := 0) (· + ·) := by
+  simp [← sum_toList, List.sum_eq_foldl]
+
 theorem foldl_toList_eq_flatMap {l : List α} {acc : Array β}
    {F : Array β → α → Array β} {G : α → List β}
    (H : ∀ acc a, (F acc a).toList = acc.toList ++ G a) :
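
The new `sum_push` lemma added in this hunk can be exercised with a small sketch. This assumes a toolchain that already contains the change; the example is illustrative and not part of the patch. The instance binders simply mirror those in the statement of `sum_push` above.

```lean
-- Illustrative only: assumes `Array.sum_push` from this diff is available.
example {α : Type} [Add α] [Zero α] [Std.Associative (α := α) (· + ·)]
    [Std.LawfulIdentity (· + ·) (0 : α)] (xs : Array α) (x y : α) :
    ((xs.push x).push y).sum = xs.sum + x + y := by
  -- Peel off the two pushes one at a time; `rw` closes the goal by `rfl`.
  rw [Array.sum_push, Array.sum_push]
```
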
--- a/src/Init/Data/Array/Lex/Lemmas.lean
+++ b/src/Init/Data/Array/Lex/Lemmas.lean
@@ -78,7 +78,7 @@ private theorem cons_lex_cons [BEq α] {lt : α → α → Bool} {a b : α} {xs
  simp only [lex, size_append, List.size_toArray, List.length_cons, List.length_nil, Nat.zero_add,
    Nat.add_min_add_left, Nat.add_lt_add_iff_left, Std.Rco.forIn'_eq_forIn'_toList]
  rw [cons_lex_cons.forIn'_congr_aux (Nat.toList_rco_eq_cons (by omega)) rfl (fun _ _ _ => rfl)]
-  simp only [bind_pure_comp, map_pure, Nat.toList_rco_succ_succ, Nat.add_comm 1]
+  simp only [Nat.toList_rco_succ_succ, Nat.add_comm 1]
  cases h : lt a b
  · cases h' : a == b <;> simp [bne, *]
  · simp [*]
--- a/src/Init/Data/Array/MapIdx.lean
+++ b/src/Init/Data/Array/MapIdx.lean
@@ -72,7 +72,6 @@ theorem mapFinIdx_spec {xs : Array α} {f : (i : Nat) → α → (h : i < xs.siz
  simp only [getElem?_def, size_mapFinIdx, getElem_mapFinIdx]
  split <;> simp_all

-set_option backward.isDefEq.respectTransparency false in
@[simp, grind =] theorem toList_mapFinIdx {xs : Array α} {f : (i : Nat) → α → (h : i < xs.size) → β} :
    (xs.mapFinIdx f).toList = xs.toList.mapFinIdx (fun i a h => f i a (by simpa)) := by
  apply List.ext_getElem <;> simp
@@ -106,7 +105,6 @@ theorem mapIdx_spec {f : Nat → α → β} {xs : Array α}
      xs[i]?.map (f i) := by
  simp [getElem?_def, size_mapIdx, getElem_mapIdx]

-set_option backward.isDefEq.respectTransparency false in
@[simp, grind =] theorem toList_mapIdx {f : Nat → α → β} {xs : Array α} :
    (xs.mapIdx f).toList = xs.toList.mapIdx (fun i a => f i a) := by
  apply List.ext_getElem <;> simp
--- a/src/Init/Data/Array/OfFn.lean
+++ b/src/Init/Data/Array/OfFn.lean
@@ -41,7 +41,6 @@ theorem ofFn_succ {f : Fin (n+1) → α} :
    intro h₃
    simp only [show i = n by omega]

-set_option backward.isDefEq.respectTransparency false in
 theorem ofFn_add {n m} {f : Fin (n + m) → α} :
    ofFn f = (ofFn (fun i => f (i.castLE (Nat.le_add_right n m)))) ++ (ofFn (fun i => f (i.natAdd n))) := by
  induction m with
@@ -108,7 +107,6 @@ theorem ofFnM_succ {n} [Monad m] [LawfulMonad m] {f : Fin (n + 1) → m α} :
      pure (as.push a)) := by
  simp [ofFnM, Fin.foldlM_succ_last]

-set_option backward.isDefEq.respectTransparency false in
 theorem ofFnM_add {n m} [Monad m] [LawfulMonad m] {f : Fin (n + k) → m α} :
    ofFnM f = (do
      let as ← ofFnM fun i : Fin n => f (i.castLE (Nat.le_add_right n k))
--- a/src/Init/Data/Array/Perm.lean
+++ b/src/Init/Data/Array/Perm.lean
@@ -126,6 +126,14 @@ theorem swap_perm {xs : Array α} {i j : Nat} (h₁ : i < xs.size) (h₂ : j < x
  simp only [swap, perm_iff_toList_perm, toList_set]
  apply set_set_perm

+theorem Perm.pairwise_iff {R : α → α → Prop} (S : ∀ {x y}, R x y → R y x) {xs ys : Array α}
+    : ∀ _p : xs.Perm ys, xs.toList.Pairwise R ↔ ys.toList.Pairwise R := by
+  simpa only [perm_iff_toList_perm] using List.Perm.pairwise_iff S
+
+theorem Perm.pairwise {R : α → α → Prop} {xs ys : Array α} (hp : xs ~ ys)
+    (hR : xs.toList.Pairwise R) (hsymm : ∀ {x y}, R x y → R y x) :
+    ys.toList.Pairwise R := (hp.pairwise_iff hsymm).mp hR
+
 namespace Perm

 set_option linter.indexVariables false in
--- a/src/Init/Data/Array/Sort.lean
+++ b/src/Init/Data/Array/Sort.lean
@@ -0,0 +1,10 @@
+/-
+Copyright (c) 2026 Lean FRO. All rights reserved.
+Released under Apache 2.0 license as described in the file LICENSE.
+Authors: Paul Reichert
+-/
+module
+
+prelude
+public import Init.Data.Array.Sort.Basic
+public import Init.Data.Array.Sort.Lemmas
--- a/src/Init/Data/Array/Sort/Basic.lean
+++ b/src/Init/Data/Array/Sort/Basic.lean
@@ -0,0 +1,55 @@
+/-
+Copyright (c) 2026 Lean FRO. All rights reserved.
+Released under Apache 2.0 license as described in the file LICENSE.
+Authors: Paul Reichert
+-/
+module
+
+prelude
+public import Init.Data.Array.Subarray.Split
+public import Init.Data.Slice.Array
+import Init.Omega
+
+public section
+
+private def Array.MergeSort.Internal.merge (xs ys : Array α) (le : α → α → Bool := by exact (· ≤ ·)) :
+    Array α :=
+  if hxs : 0 < xs.size then
+    if hys : 0 < ys.size then
+      go xs[*...*] ys[*...*] (by simp only [Array.size_mkSlice_rii]; omega) (by simp only [Array.size_mkSlice_rii]; omega) (Array.emptyWithCapacity (xs.size + ys.size))
+    else
+      xs
+  else
+    ys
+where
+  go (xs ys : Subarray α) (hxs : 0 < xs.size) (hys : 0 < ys.size) (acc : Array α) : Array α :=
+    let x := xs[0]
+    let y := ys[0]
+    if le x y then
+      if hi : 1 < xs.size then
+        go (xs.drop 1) ys (by simp only [Subarray.size_drop]; omega) hys (acc.push x)
+      else
+        ys.foldl (init := acc.push x) (fun acc y => acc.push y)
+    else
+      if hj : 1 < ys.size then
+        go xs (ys.drop 1) hxs (by simp only [Subarray.size_drop]; omega) (acc.push y)
+      else
+        xs.foldl (init := acc.push y) (fun acc x => acc.push x)
+  termination_by xs.size + ys.size
+
+def Subarray.mergeSort (xs : Subarray α) (le : α → α → Bool := by exact (· ≤ ·)) : Array α :=
+    if h : 1 < xs.size then
+      let splitIdx := (xs.size + 1) / 2 -- We follow the same splitting convention as `List.mergeSort`
+      let left := xs[*...splitIdx]
+      let right := xs[splitIdx...*]
+      Array.MergeSort.Internal.merge (mergeSort left le) (mergeSort right le) le
+    else
+      xs.toArray
+termination_by xs.size
+decreasing_by
+  · simp only [Subarray.size_mkSlice_rio]; omega
+  · simp only [Subarray.size_mkSlice_rci]; omega
+
+@[inline]
+def Array.mergeSort (xs : Array α) (le : α → α → Bool := by exact (· ≤ ·)) : Array α :=
+    xs[*...*].mergeSort le
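
A minimal usage sketch for the definitions in this file, assuming a toolchain that already contains this change and has `Init.Data.Array.Sort` in scope. The concrete arrays and the descending comparator are made up for illustration.

```lean
-- Illustrative only: assumes `Array.mergeSort` from this diff is available.

-- With no comparator, the default `le := (· ≤ ·)` sorts in ascending order.
#eval (#[3, 1, 2] : Array Nat).mergeSort

-- An explicit comparator; flipping the arguments sorts in descending order.
#eval (#[3, 1, 2] : Array Nat).mergeSort (fun a b => decide (b ≤ a))
```

Both calls go through `Subarray.mergeSort` on the full slice `xs[*...*]`, so the split point follows the same `(size + 1) / 2` convention as `List.mergeSort`.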
--- a/src/Init/Data/Array/Sort/Lemmas.lean
+++ b/src/Init/Data/Array/Sort/Lemmas.lean
@@ -0,0 +1,240 @@
+/-
+Copyright (c) 2026 Lean FRO. All rights reserved.
+Released under Apache 2.0 license as described in the file LICENSE.
+Authors: Paul Reichert
+-/
+module
+
+prelude
+public import Init.Data.Array.Sort.Basic
+public import Init.Data.List.Sort.Basic
+public import Init.Data.Array.Perm
+import all Init.Data.Array.Sort.Basic
+import all Init.Data.List.Sort.Basic
+import Init.Data.List.Sort.Lemmas
+import Init.Data.Slice.Array.Lemmas
+import Init.Data.Slice.List.Lemmas
+import Init.Data.Array.Bootstrap
+import Init.Data.Array.Lemmas
+import Init.Data.Array.MapIdx
+import Init.ByCases
+
+public section
+
+private theorem Array.MergeSort.merge.go_eq_listMerge {xs ys : Subarray α} {hxs hys le acc} :
+    (Array.MergeSort.Internal.merge.go le xs ys hxs hys acc).toList = acc.toList ++ List.merge xs.toList ys.toList le := by
+  fun_induction Array.MergeSort.Internal.merge.go le xs ys hxs hys acc
+  · rename_i xs ys _ _ _ _ _ _ _ _
+    rw [List.merge.eq_def]
+    split
+    · have : xs.size = 0 := by simp [← Subarray.length_toList, *]
+      omega
+    · have : ys.size = 0 := by simp [← Subarray.length_toList, *]
+      omega
+    · rename_i x' xs' y' ys' _ _
+      simp +zetaDelta only at *
+      have h₁ : x' = xs[0] := by simp [Subarray.getElem_eq_getElem_toList, *]
+      have h₂ : y' = ys[0] := by simp [Subarray.getElem_eq_getElem_toList, *]
+      cases h₁
+      cases h₂
+      simp [Subarray.toList_drop, *]
+  · rename_i xs ys _ _ _ _ _ _ _
+    rw [List.merge.eq_def]
+    split
+    · have : xs.size = 0 := by simp [← Subarray.length_toList, *]
+      omega
+    · have : ys.size = 0 := by simp [← Subarray.length_toList, *]
+      omega
+    · rename_i x' xs' y' ys' _ _
+      simp +zetaDelta only at *
+      have h₁ : x' = xs[0] := by simp [Subarray.getElem_eq_getElem_toList, *]
+      have h₂ : y' = ys[0] := by simp [Subarray.getElem_eq_getElem_toList, *]
+      cases h₁
+      cases h₂
+      simp [*]
+      have : xs.size = xs'.length + 1 := by simp [← Subarray.length_toList, *]
+      have : xs' = [] := List.eq_nil_of_length_eq_zero (by omega)
+      simp only [this]
+      rw [← Subarray.foldl_toList]
+      simp [*]
+  · rename_i xs ys _ _ _ _ _ _ _ _
+    rw [List.merge.eq_def]
+    split
+    · have : xs.size = 0 := by simp [← Subarray.length_toList, *]
+      omega
+    · have : ys.size = 0 := by simp [← Subarray.length_toList, *]
+      omega
+    · rename_i x' xs' y' ys' _ _
+      simp +zetaDelta only at *
+      have h₁ : x' = xs[0] := by simp [Subarray.getElem_eq_getElem_toList, *]
+      have h₂ : y' = ys[0] := by simp [Subarray.getElem_eq_getElem_toList, *]
+      cases h₁
+      cases h₂
+      simp [Subarray.toList_drop, *]
+  · rename_i xs ys _ _ _ _ _ _ _
+    rw [List.merge.eq_def]
+    split
+    · have : xs.size = 0 := by simp [← Subarray.length_toList, *]
+      omega
+    · have : ys.size = 0 := by simp [← Subarray.length_toList, *]
+      omega
+    · rename_i x' xs' y' ys' _ _
+      simp +zetaDelta only at *
+      have h₁ : x' = xs[0] := by simp [Subarray.getElem_eq_getElem_toList, *]
+      have h₂ : y' = ys[0] := by simp [Subarray.getElem_eq_getElem_toList, *]
+      cases h₁
+      cases h₂
+      simp [*]
+      have : ys.size = ys'.length + 1 := by simp [← Subarray.length_toList, *]
+      have : ys' = [] := List.eq_nil_of_length_eq_zero (by omega)
+      simp [this]
+      rw [← Subarray.foldl_toList]
+      simp [*]
+
+private theorem Array.MergeSort.merge_eq_listMerge {xs ys : Array α} {le} :
+    (Array.MergeSort.Internal.merge xs ys le).toList = List.merge xs.toList ys.toList le := by
+  rw [Array.MergeSort.Internal.merge]
+  split <;> rename_i heq₁
+  · split <;> rename_i heq₂
+    · simp [Array.MergeSort.merge.go_eq_listMerge]
+    · have : ys.toList = [] := by simp_all
+      simp [this]
+  · have : xs.toList = [] := by simp_all
+    simp [this]
+
+private theorem List.mergeSort_eq_merge_mkSlice {xs : List α} :
+    xs.mergeSort le =
+      if 1 < xs.length then
+        merge (xs[*...((xs.length + 1) / 2)].toList.mergeSort le) (xs[((xs.length + 1) / 2)...*].toList.mergeSort le) le
+      else
+        xs := by
+  fun_cases xs.mergeSort le
+  · simp
+  · simp
+  · rename_i x y ys lr hl hr
+    simp [lr]
+
+theorem Subarray.toList_mergeSort {xs : Subarray α} {le : α → α → Bool} :
+    (xs.mergeSort le).toList = xs.toList.mergeSort le := by
+  fun_induction xs.mergeSort le
+  · rw [List.mergeSort_eq_merge_mkSlice]
+    simp +zetaDelta [Array.MergeSort.merge_eq_listMerge, *]
+  · simp [List.mergeSort_eq_merge_mkSlice, *]
+
+@[simp, grind =]
+theorem Subarray.mergeSort_eq_mergeSort_toArray {xs : Subarray α} {le : α → α → Bool} :
+    xs.mergeSort le = xs.toArray.mergeSort le := by
+  simp [← Array.toList_inj, toList_mergeSort, Array.mergeSort]
+
+theorem Subarray.mergeSort_toArray {xs : Subarray α} {le : α → α → Bool} :
+    xs.toArray.mergeSort le = xs.mergeSort le := by
+  simp
+
+theorem Array.toList_mergeSort {xs : Array α} {le : α → α → Bool} :
+    (xs.mergeSort le).toList = xs.toList.mergeSort le := by
+  rw [Array.mergeSort, Subarray.toList_mergeSort, Array.toList_mkSlice_rii]
+
+theorem Array.mergeSort_eq_toArray_mergeSort_toList {xs : Array α} {le : α → α → Bool} :
+    xs.mergeSort le = (xs.toList.mergeSort le).toArray := by
+  simp [← toList_mergeSort]
+
+/-!
+# Basic properties of `Array.mergeSort`.
+
+* `pairwise_mergeSort`: `mergeSort` produces a sorted array.
+* `mergeSort_perm`: `mergeSort` is a permutation of the input array.
+* `mergeSort_of_pairwise`: `mergeSort` does not change a sorted array.
+* `sublist_mergeSort`: if `c` is a sorted sublist of `l.toList`, then `c` is still a sublist of `(mergeSort l le).toList`.
+-/
+
+namespace Array
+
+-- Enable this instance locally so we can write `Pairwise le` instead of `Pairwise (le · ·)` everywhere.
+attribute [local instance] boolRelToRel
+
+@[simp] theorem mergeSort_empty : (#[] : Array α).mergeSort r = #[] := by
+  simp [mergeSort_eq_toArray_mergeSort_toList]
+
+@[simp] theorem mergeSort_singleton {a : α} : #[a].mergeSort r = #[a] := by
+  simp [mergeSort_eq_toArray_mergeSort_toList]
+
+theorem mergeSort_perm {xs : Array α} {le} : (xs.mergeSort le).Perm xs := by
+  simpa [mergeSort_eq_toArray_mergeSort_toList, Array.perm_iff_toList_perm] using List.mergeSort_perm _ _
+
+@[simp] theorem size_mergeSort {xs : Array α} : (mergeSort xs le).size = xs.size := by
+  simp [mergeSort_eq_toArray_mergeSort_toList]
+
+@[simp] theorem mem_mergeSort {a : α} {xs : Array α} : a ∈ mergeSort xs le ↔ a ∈ xs := by
+  simp [mergeSort_eq_toArray_mergeSort_toList]
+
+/--
+The result of `Array.mergeSort` is sorted,
+as long as the comparison function is transitive (`le a b → le b c → le a c`)
+and total in the sense that `le a b || le b a`.
+
+The comparison function need not be antisymmetric, i.e. `le a b` and `le b a` may both hold even when `a ≠ b`.
+-/
+theorem pairwise_mergeSort
+    (trans : ∀ (a b c : α), le a b → le b c → le a c)
+    (total : ∀ (a b : α), le a b || le b a)
+    {xs : Array α} :
+    (mergeSort xs le).toList.Pairwise (le · ·) := by
+  simpa [mergeSort_eq_toArray_mergeSort_toList] using List.pairwise_mergeSort trans total _
+
+/--
+If the input array is already sorted, then `mergeSort` does not change the array.
+-/
+theorem mergeSort_of_pairwise {le : α → α → Bool} {xs : Array α} (_ : xs.toList.Pairwise (le · ·)) :
+    mergeSort xs le = xs := by
+  simpa [mergeSort_eq_toArray_mergeSort_toList, List.toArray_eq_iff] using List.mergeSort_of_pairwise ‹_›
+
+/--
+This merge sort algorithm is stable,
+in the sense that breaking ties in the ordering function using the position in the array
+has no effect on the output.
+
+That is, elements which are equal with respect to the ordering function will remain
+in the same order in the output array as they were in the input array.
+
+See also:
+* `sublist_mergeSort`: if `c <+ l.toList` and `c.Pairwise le`, then `c <+ (mergeSort l le).toList`.
+* `pair_sublist_mergeSort`: if `[a, b] <+ l.toList` and `le a b`, then `[a, b] <+ (mergeSort l le).toList`.
+-/
+theorem mergeSort_zipIdx {xs : Array α} :
+    (mergeSort (xs.zipIdx.map fun (a, i) => (a, i)) (List.zipIdxLE le)).map (·.1) = mergeSort xs le := by
+  simpa [mergeSort_eq_toArray_mergeSort_toList, Array.toList_zipIdx] using List.mergeSort_zipIdx
+
+/--
+Another statement of stability of merge sort.
+If `c` is a sorted sublist of `xs.toList`,
+then `c` is still a sublist of `(mergeSort xs le).toList`.
+-/
+theorem sublist_mergeSort {le : α → α → Bool}
+    (trans : ∀ (a b c : α), le a b → le b c → le a c)
+    (total : ∀ (a b : α), le a b || le b a)
+    {ys : List α} (_ : ys.Pairwise (le · ·)) (_ : List.Sublist ys xs.toList) :
+    List.Sublist ys (mergeSort xs le).toList := by
+  simpa [mergeSort_eq_toArray_mergeSort_toList, Array.toList_zipIdx] using
+    List.sublist_mergeSort trans total ‹_› ‹_›
+
+/--
+Another statement of stability of merge sort.
+If a pair `[a, b]` is a sublist of `xs.toList` and `le a b`,
+then `[a, b]` is still a sublist of `(mergeSort xs le).toList`.
+-/
+theorem pair_sublist_mergeSort
+    (trans : ∀ (a b c : α), le a b → le b c → le a c)
+    (total : ∀ (a b : α), le a b || le b a)
+    (hab : le a b) (h : List.Sublist [a, b] xs.toList) :
+    List.Sublist [a, b] (mergeSort xs le).toList := by
+  simpa [mergeSort_eq_toArray_mergeSort_toList, Array.toList_zipIdx] using
+    List.pair_sublist_mergeSort trans total ‹_› ‹_›
+
+theorem map_mergeSort {r : α → α → Bool} {s : β → β → Bool} {f : α → β}
+    {xs : Array α} (hxs : ∀ a ∈ xs, ∀ b ∈ xs, r a b = s (f a) (f b)) :
+    (xs.mergeSort r).map f = (xs.map f).mergeSort s := by
+  simp only [mergeSort_eq_toArray_mergeSort_toList, List.map_toArray, toList_map, mk.injEq]
+  apply List.map_mergeSort
+  simpa
+
+end Array
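
The lemmas in this file can be used directly, as in the sketch below. It assumes a toolchain that already contains this change; the examples only restate theorems declared above at a concrete element type and are not part of the patch.

```lean
-- Illustrative only: assumes the `Array.mergeSort` lemmas from this diff are available.

-- Sorting preserves the size of the array.
example (xs : Array Nat) (le : Nat → Nat → Bool) :
    (xs.mergeSort le).size = xs.size :=
  Array.size_mergeSort

-- The array sort agrees with `List.mergeSort` on the underlying list.
example (xs : Array Nat) (le : Nat → Nat → Bool) :
    (xs.mergeSort le).toList = xs.toList.mergeSort le :=
  Array.toList_mergeSort

-- The output is a permutation of the input.
example (xs : Array Nat) (le : Nat → Nat → Bool) :
    (xs.mergeSort le).Perm xs :=
  Array.mergeSort_perm
```

Most proofs in this file follow the same pattern: rewrite with `mergeSort_eq_toArray_mergeSort_toList` and reuse the corresponding `List` lemma.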
--- a/src/Init/Data/Array/Subarray.lean
+++ b/src/Init/Data/Array/Subarray.lean
@@ -7,7 +7,7 @@ module

 prelude
 public import Init.Data.Array.Basic
-public import Init.Data.Slice.Basic
+public import Init.Data.Slice.Operations

 public section

@@ -76,15 +76,17 @@ def Subarray.stop_le_array_size (xs : Subarray α) : xs.stop ≤ xs.array.size :

 namespace Subarray

-/--
-Computes the size of the subarray.
-/
-def size (s : Subarray α) : Nat :=
-  s.stop - s.start
+instance : SliceSize (Internal.SubarrayData α) where
+  size s := s.internalRepresentation.stop - s.internalRepresentation.start
+
+@[grind =, suggest_for Subarray.size]
+public theorem size_eq {xs : Subarray α} :
+    xs.size = xs.stop - xs.start := by
+  simp [Std.Slice.size, SliceSize.size, start, stop]

 theorem size_le_array_size {s : Subarray α} : s.size ≤ s.array.size := by
  let ⟨{array, start, stop, start_le_stop, stop_le_array_size}⟩ := s
-  simp only [size, ge_iff_le]
+  simp only [ge_iff_le, size_eq]
  apply Nat.le_trans (Nat.sub_le stop start)
  assumption

--- a/src/Init/Data/BitVec/Bitblast.lean
+++ b/src/Init/Data/BitVec/Bitblast.lean
@@ -2393,4 +2393,412 @@ theorem fastUmulOverflow (x y : BitVec w) :
        simp [← Nat.pow_add, show w + 1 - (k - 1) + k = w + 1 + 1 by omega] at this
        omega

+/-! ### Population Count -/
+
+/-- Extract the `idx`-th bit from `x` and zero-extend it to have length `len`. -/
+def extractAndExtendBit (idx len : Nat) (x : BitVec w) : BitVec len :=
+  BitVec.zeroExtend len (BitVec.extractLsb' idx 1 x)
+
+
+/-- Recursively extract one bit at a time and zero-extend it to width `len`. -/
+def extractAndExtendAux (k len : Nat) (x : BitVec w) (acc : BitVec (k * len)) (hle : k ≤ w) :
+    BitVec (w * len) :=
+  match hwi : w - k with
+  | 0 => acc.cast (by simp [show w = k by omega])
+  | n' + 1 =>
+    let acc' := extractAndExtendBit k len x ++ acc
+    extractAndExtendAux (k + 1) len x (acc'.cast (by simp [Nat.add_mul]; omega)) (by omega)
+termination_by w - k
+
+/-- We instantiate `extractAndExtendAux` with an empty accumulator, zero-extending
+  each bit in `x` to have width `len` and returning a `BitVec (w * len)`. -/
+def extractAndExtend (len : Nat) (x : BitVec w) : BitVec (w * len) :=
+  extractAndExtendAux 0 len x ((0#0).cast (by simp)) (by omega)
+
+/--
+  Construct a layer of the parallel-prefix-sum tree by summing two-by-two all the
+  `w`-long words in `oldLayer`, returning a bitvector containing `(len + 1) / 2`
+  flattened `w`-long words, each resulting from an addition.
+-/
+def cpopLayer (oldLayer : BitVec (len * w)) (newLayer : BitVec (iterNum * w))
+    (hold : 2 * (iterNum - 1) < len) : BitVec (((len + 1)/2) * w) :=
+  if hlen : len - (iterNum * 2) = 0 then
+    have : ((len + 1)/2) = iterNum := by omega
+    newLayer.cast (by simp [this])
+  else
+    let op1 := oldLayer.extractLsb' ((2 * iterNum) * w) w
+    let op2 := oldLayer.extractLsb' ((2 * iterNum + 1) * w) w
+    let newLayer' := (op1 + op2) ++ newLayer
+    have hcast : w + iterNum * w = (iterNum + 1) * w := by simp [Nat.add_mul]; omega
+    cpopLayer oldLayer (newLayer'.cast hcast) (by omega)
+termination_by len - (iterNum * 2)
+
+/--
+  Given a `BitVec (len * w)` of `len` flattened `w`-long words,
+  construct a binary tree that sums two-by-two the `w`-long words in the previous layer,
+  ultimately returning a single `w`-long word corresponding to the whole addition.
+-/
+def cpopTree (l : BitVec (len * w)) : BitVec w :=
+  if h : len = 0 then 0#w
+  else if h : len = 1 then
+    l.cast (by simp [h])
+  else
+    cpopTree (cpopLayer l 0#(0 * w) (by omega))
+termination_by len
+
+/--
+  Given a bitvector `x : BitVec w`, zero-extend each of its bits to width `w` and
+  construct a parallel prefix sum circuit adding the resulting `w`-long words.
+-/
+def cpopRec (x : BitVec w) : BitVec w :=
+  if hw : 1 < w then
+    let extendedBits := x.extractAndExtend w
+    (cpopTree extendedBits).cast (by simp)
+  else if hw' : 0 < w then
+    x
+  else
+    0#w
+
+/-- Recursive addition of the elements in a flattened bitvec, starting from the `rem`-th element. -/
+private def addRecAux (x : BitVec (l * w)) (rem : Nat) (acc : BitVec w) : BitVec w :=
+  match rem with
+  | 0 => acc
+  | n + 1 => x.addRecAux n (acc + x.extractLsb' (n * w) w)
+
+/-- Recursive addition of the elements in a flattened bitvec. -/
+private def addRec (x : BitVec (l * w)) : BitVec w := addRecAux x l 0#w
+
+theorem getLsbD_extractAndExtendBit {x : BitVec w} :
+    (extractAndExtendBit k len x).getLsbD i =
+    (decide (i = 0) && decide (0 < len) && x.getLsbD k) := by
+  simp only [extractAndExtendBit, truncate_eq_setWidth, getLsbD_setWidth, getLsbD_extractLsb',
+    Nat.lt_one_iff]
+  by_cases hi : i = 0
+  <;> simp [hi]
+
+@[simp]
+private theorem extractAndExtendAux_zero {k len : Nat} {x : BitVec w}
+    {acc : BitVec (k * len)} (heq : w = k) :
+    extractAndExtendAux k len x acc (by omega) = acc.cast (by simp [heq]) := by
+  unfold extractAndExtendAux
+  split
+  · simp
+  · omega
+
+private theorem extractLsb'_extractAndExtendAux {k len : Nat} {x : BitVec w}
+    (acc : BitVec (k * len)) (hle : k ≤ w) :
+    (∀ i (_ : i < k), acc.extractLsb' (i * len) len = (x.extractLsb' i 1).setWidth len) →
+    (extractAndExtendAux k len x acc (by omega)).extractLsb' (i * len) len =
+    (x.extractLsb' i 1).setWidth len := by
+  intros hacc
+  induction hwi : w - k generalizing acc k
+  · case zero =>
+    rw [extractAndExtendAux_zero (by omega)]
+    by_cases hj : i < k
+    · apply hacc
+      exact hj
+    · ext l hl
+      have := mul_le_mul_right (n := k) (m := i) len (by omega)
+      simp [← getLsbD_eq_getElem, getLsbD_extractLsb', hl, getLsbD_setWidth,
+        show w ≤ i + l by omega, getLsbD_of_ge acc (i * len + l) (by omega)]
+  · case succ n' ihn' =>
+    rw [extractAndExtendAux]
+    split
+    · omega
+    · apply ihn'
+      · intros i hi
+        have hcast : len + k * len = (k + 1) * len := by
+            simp [Nat.mul_comm, Nat.mul_add, Nat.add_comm]
+
+        by_cases hi' : i < k
+        · have heq : extractLsb' (i * len) len (BitVec.cast hcast (extractAndExtendBit k len x ++ acc))  =
+              extractLsb' (i * len) len ((extractAndExtendBit k len x ++ acc)) := by
+            ext; simp
+          rw [heq, extractLsb'_append_of_lt hi']
+          apply hacc
+          exact hi'
+        · have heq : extractLsb' (i * len) len (BitVec.cast hcast (extractAndExtendBit k len x ++ acc))   =
+              extractLsb' (i * len) len ((extractAndExtendBit k len x ++ acc)) := by
+            ext; simp
+          rw [heq, extractLsb'_append_of_eq (by omega)]
+          simp [show i = k by omega, extractAndExtendBit]
+      · omega
+
+theorem extractLsb'_cpopLayer {w iterNum i oldLen : Nat} {oldLayer : BitVec (oldLen * w)}
+    {newLayer : BitVec (iterNum * w)} (hold : 2 * (iterNum - 1) < oldLen) :
+    (∀ i (_hi: i < iterNum),
+      newLayer.extractLsb' (i * w) w =
+      oldLayer.extractLsb' ((2 * i) * w) w + (oldLayer.extractLsb' ((2 * i + 1) * w) w)) →
+    extractLsb' (i * w) w (oldLayer.cpopLayer newLayer hold) =
+      extractLsb' (2 * i * w) w oldLayer + extractLsb' ((2 * i + 1) * w) w oldLayer := by
+  intro proof_addition
+  rw [cpopLayer]
+  split
+  · by_cases hi : i < iterNum
+    · simp only [extractLsb'_cast]
+      apply proof_addition
+      exact hi
+    · ext j hj
+      have : iterNum * w ≤ i * w := by refine mul_le_mul_right w (by omega)
+      have : oldLen * w ≤ (2 * i) * w := by refine mul_le_mul_right w (by omega)
+      have : oldLen * w ≤ (2 * i + 1) * w := by refine mul_le_mul_right w (by omega)
+      have hz : extractLsb' (2 * i * w) w oldLayer = 0#w := by
+        ext j hj
+        simp [show oldLen * w ≤ 2 * i * w + j by omega]
+      have hz' : extractLsb' ((2 * i + 1) * w) w oldLayer = 0#w := by
+        ext j hj
+        simp [show oldLen * w ≤ (2 * i + 1) * w + j by omega]
+      simp [show iterNum * w ≤ i * w + j by omega, hz, hz']
+  · generalize hop1 : oldLayer.extractLsb' ((2 * iterNum) * w) w = op1
+    generalize hop2 : oldLayer.extractLsb' ((2 * iterNum + 1) * w) w = op2
+    have hcast : w + iterNum * w = (iterNum + 1) * w := by simp [Nat.add_mul]; omega
+    apply extractLsb'_cpopLayer
+    intros i hi
+    by_cases hlt : i < iterNum
+    · rw [extractLsb'_cast, extractLsb'_append_eq_of_add_le]
+      · apply proof_addition
+        exact hlt
+      · rw [show i * w + w = i * w + 1 * w by omega, ← Nat.add_mul]
+        exact mul_le_mul_right w hlt
+    · rw [extractLsb'_cast, show i = iterNum by omega, extractLsb'_append_eq_left, hop1, hop2]
+termination_by oldLen - 2 * (iterNum + 1 - 1)
+
+theorem getLsbD_cpopLayer {w iterNum: Nat} {oldLayer : BitVec (oldLen * w)}
+    {newLayer : BitVec (iterNum * w)} (hold : 2 * (iterNum - 1) < oldLen) :
+    (∀ i (_hi: i < iterNum),
+          newLayer.extractLsb' (i * w) w =
+          oldLayer.extractLsb' ((2 * i) * w) w + (oldLayer.extractLsb' ((2 * i + 1) * w) w)) →
+    (oldLayer.cpopLayer newLayer hold).getLsbD k =
+      (extractLsb' (2 * ((k - k % w) / w) * w) w oldLayer +
+        extractLsb' ((2 * ((k - k % w) / w) + 1) * w) w oldLayer).getLsbD (k % w) := by
+  intro proof_addition
+  by_cases hw0 : w = 0
+  · subst hw0
+    simp
+  · simp only [← extractLsb'_cpopLayer (hold := by omega) proof_addition,
+      Nat.mod_lt (x := k) (y := w) (by omega), getLsbD_eq_getElem, getElem_extractLsb']
+    congr
+    by_cases hmod : k % w = 0
+    · rw [hmod, Nat.sub_zero, Nat.add_zero, Nat.div_mul_cancel (by omega)]
+    · rw [Nat.div_mul_cancel (by exact dvd_sub_mod k), Nat.sub_add_cancel (by exact mod_le k w)]
+
+@[simp]
+private theorem addRecAux_zero {x : BitVec (l * w)} {acc : BitVec w} :
+    x.addRecAux 0 acc = acc := rfl
+
+@[simp]
+private theorem addRecAux_succ {x : BitVec (l * w)} {n : Nat} {acc : BitVec w} :
+    x.addRecAux (n + 1) acc = x.addRecAux n (acc + extractLsb' (n * w) w x) := rfl
+
+private theorem addRecAux_eq {x : BitVec (l * w)} {n : Nat} {acc : BitVec w} :
+     x.addRecAux n acc = x.addRecAux n 0#w + acc := by
+  induction n generalizing acc
+  · case zero =>
+    simp
+  · case succ n ihn =>
+    simp only [addRecAux_succ, BitVec.zero_add, ihn (acc := extractLsb' (n * w) w x),
+      BitVec.add_assoc, ihn (acc := acc + extractLsb' (n * w) w x), BitVec.add_right_inj]
+    rw [BitVec.add_comm (x := acc)]
+
+private theorem extractLsb'_addRecAux_of_le {x : BitVec (len * w)} (h : r ≤ k):
+    (extractLsb' 0 (k * w) x).addRecAux r 0#w = x.addRecAux r 0#w := by
+  induction r generalizing x len k
+  · case zero =>
+    simp [addRecAux]
+  · case succ diff ihdiff =>
+    simp only [addRecAux_succ, BitVec.zero_add]
+    have hext : diff * w + w ≤ k * w := by
+      simp only [show diff * w + w = (diff + 1) * w by simp [Nat.add_mul]]
+      exact Nat.mul_le_mul_right w h
+    rw [extractLsb'_extractLsb'_of_le hext, addRecAux_eq (x := x),
+        addRecAux_eq (x := extractLsb' 0 (k * w) x), ihdiff (x := x) (by omega) (k := k)]
+
+private theorem extractLsb'_extractAndExtend_eq {i len : Nat} {x : BitVec w} :
+    (extractAndExtend len x).extractLsb' (i * len) len = extractAndExtendBit i len x := by
+  unfold extractAndExtend
+  by_cases hilt : i < w
+  · ext j hj
+    simp [extractLsb'_extractAndExtendAux, extractAndExtendBit]
+  · ext k hk
+    have := Nat.mul_le_mul_right (n := w) (k := len) (m := i) (by omega)
+    simp only [extractAndExtendBit, cast_ofNat, getElem_extractLsb', truncate_eq_setWidth,
+      getElem_setWidth, getLsbD_extractLsb', Nat.lt_one_iff]
+    rw [getLsbD_of_ge, getLsbD_of_ge]
+    · simp
+    · omega
+    · omega
+
+private theorem addRecAux_append_extractLsb' {x : BitVec (len * w)} (ha : 0 < len) :
+    ((x.extractLsb' ((len - 1) * w) w ++
+      x.extractLsb' 0 ((len - 1) * w)).cast (m := len * w) hcast).addRecAux len 0#w =
+    x.extractLsb' ((len - 1) * w) w +
+      (x.extractLsb' 0 ((len - 1) * w)).addRecAux (len - 1) 0#w := by
+  simp only [extractLsb'_addRecAux_of_le (k := len - 1) (r := len - 1) (by omega),
+    BitVec.append_extractLsb'_of_lt (hcast := hcast)]
+  have hsucc := addRecAux_succ (x := x) (acc := 0#w) (n := len - 1)
+  rw [BitVec.zero_add, Nat.sub_one_add_one (by omega)] at hsucc
+  rw [hsucc, addRecAux_eq, BitVec.add_comm]
+
+private theorem Nat.mul_add_le_mul_of_succ_le {a b c : Nat} (h : a + 1 ≤ c) :
+    a * b + b ≤ c * b := by
+  rw [← Nat.succ_mul]
+  exact mul_le_mul_right b h
+
+/--
+  The recursive addition of `w`-long words on two flattened bitvectors `x` and `y` (with different
+  number of words `len` and `len'`, respectively) returns the same value, if we can prove
+  that each `w`-long word in `x` results from the addition of two `w`-long words in `y`,
+  using exactly all `w`-long words in `y`.
+-/
+private theorem addRecAux_eq_of {x : BitVec (len * w)} {y : BitVec (len' * w)}
+    (hlen : len = (len' + 1) / 2) :
+    (∀ (i : Nat) (_h : i < (len' + 1) / 2),
+      extractLsb' (i * w) w x = extractLsb' (2 * i * w) w y + extractLsb' ((2 * i + 1) * w) w y) →
+    x.addRecAux len 0#w = y.addRecAux len' 0#w := by
+  intro hadd
+  induction len generalizing len' y
+  · case zero =>
+    simp [show len' = 0 by omega]
+  · case succ len ih =>
+    have hcast : w + (len + 1 - 1) * w = (len + 1) * w := by
+      simp [Nat.add_mul, Nat.add_comm]
+    have hcast' :  w + (len' - 1) * w = len' * w := by
+      rw [Nat.sub_mul, Nat.one_mul,
+        ← Nat.add_sub_assoc (by refine Nat.le_mul_of_pos_left w (by omega)), Nat.add_comm]
+      simp
+    rw [addRecAux_succ, ← BitVec.append_extractLsb'_of_lt (x := x) (hcast := hcast)]
+    have happ := addRecAux_append_extractLsb' (len := len + 1) (x := x) (hcast := hcast) (by omega)
+    simp only [Nat.add_one_sub_one, addRecAux_succ, BitVec.zero_add] at happ
+    simp only [Nat.add_one_sub_one, BitVec.zero_add, happ]
+    have := Nat.succ_mul (n := len' - 1) (m := w)
+    rw [succ_eq_add_one, Nat.sub_one_add_one (by omega)] at this
+    by_cases hmod : len' % 2 = 0
+    · /- `sum` results from the addition of the two last elements in `y`, `sum = op1 + op2` -/
+      have := Nat.mul_le_mul_right (n := len' - 1 - 1) (m := len' - 1) (k := w) (by omega)
+      have := Nat.succ_mul (n := len' - 1 - 1) (m := w)
+      have hcast'' :  w + (len' - 1 - 1) * w = (len' - 1) * w := by
+        rw [Nat.sub_mul, Nat.one_mul,
+          ← Nat.add_sub_assoc (k := w) (by refine Nat.le_mul_of_pos_left w (by omega))]
+        simp
+      rw [succ_eq_add_one, Nat.sub_one_add_one (by omega)] at this
+      rw [← BitVec.append_extractLsb'_of_lt (x := y) (hcast := hcast'),
+        addRecAux_append_extractLsb' (by omega),
+        ← BitVec.append_extractLsb'_of_lt (x := extractLsb' 0 ((len' - 1) * w) y) (hcast := hcast''),
+        addRecAux_append_extractLsb' (by omega),
+        extractLsb'_extractLsb'_of_le (by exact Nat.mul_add_le_mul_of_succ_le (by omega)),
+        extractLsb'_extractLsb'_of_le (by omega), ← BitVec.add_assoc, hadd (_h := by omega)]
+      congr 1
+      · rw [show len = (len' + 1) / 2 - 1 by omega, BitVec.add_comm]
+        congr <;> omega
+      · apply ih
+        · omega
+        · intros
+          rw [extractLsb'_extractLsb'_of_le (by exact Nat.mul_add_le_mul_of_succ_le (by omega)),
+            extractLsb'_extractLsb'_of_le (by exact Nat.mul_add_le_mul_of_succ_le (by omega)),
+            extractLsb'_extractLsb'_of_le (by exact Nat.mul_add_le_mul_of_succ_le (by omega)),
+            hadd (_h := by omega)]
+    · /- `sum` results from the addition of the last elements in `y` with `0#w` -/
+      have : len' * w ≤ (len' - 1 + 1) * w := by exact mul_le_mul_right w (by omega)
+      rw [← BitVec.append_extractLsb'_of_lt (x := y) (hcast := hcast'),
+        addRecAux_append_extractLsb' (by omega), hadd (_h := by omega),
+        show 2 * len = len' - 1 by omega]
+      congr 1
+      · rw [BitVec.add_right_eq_self]
+        ext k hk
+        simp only [getElem_extractLsb', getElem_zero]
+        apply getLsbD_of_ge y ((len' - 1 + 1) * w + k) (by omega)
+      · apply ih
+        · omega
+        · intros
+          rw [extractLsb'_extractLsb'_of_le (by exact Nat.mul_add_le_mul_of_succ_le (by omega)),
+            extractLsb'_extractLsb'_of_le (by exact Nat.mul_add_le_mul_of_succ_le (by omega)),
+            extractLsb'_extractLsb'_of_le (by exact Nat.mul_add_le_mul_of_succ_le (by omega)),
+            hadd (_h := by omega)]
+
+private theorem getLsbD_extractAndExtend_of_lt {x : BitVec w} (hk : k < v) :
+    (x.extractAndExtend v).getLsbD (pos * v + k) = (extractAndExtendBit pos v x).getLsbD k := by
+  simp [← extractLsb'_extractAndExtend_eq (w := w) (len := v) (i := pos) (x := x)]
+  omega
+
+/--
+  Reading bit `k` of `BitVec.extractAndExtend v x` is the same as reading bit `k % v` of
+  `BitVec.extractAndExtendBit ((k - k % v) / v) v x`, the extended bit at that position of `x`.
+-/
+theorem getLsbD_extractAndExtend {x : BitVec w} (hv : 0 < v) :
+    (BitVec.extractAndExtend v x).getLsbD k =
+    (BitVec.extractAndExtendBit ((k - (k % v)) / v) v x).getLsbD (k % v) := by
+  rw [← getLsbD_extractAndExtend_of_lt (by exact mod_lt k hv)]
+  congr
+  by_cases hmod : k % v = 0
+  · simp only [hmod, Nat.sub_zero, Nat.add_zero]
+    rw [Nat.div_mul_cancel (by omega)]
+  · rw [← Nat.div_eq_sub_mod_div]
+    exact Eq.symm (div_add_mod' k v)
+
+private theorem addRecAux_extractAndExtend_eq_cpopNatRec {x : BitVec w} :
+    (extractAndExtend w x).addRecAux n 0#w = x.cpopNatRec n 0 := by
+  induction n
+  · case zero =>
+    simp
+  · case succ n' ihn' =>
+    rw [cpopNatRec_succ, Nat.zero_add, natCast_eq_ofNat, addRecAux_succ, BitVec.zero_add,
+      addRecAux_eq, cpopNatRec_eq, ihn', ofNat_add, natCast_eq_ofNat, BitVec.add_right_inj,
+      extractLsb'_extractAndExtend_eq]
+    ext k hk
+    simp only [extractAndExtendBit, ← getLsbD_eq_getElem, getLsbD_ofNat, hk, decide_true,
+      Bool.true_and, truncate_eq_setWidth, getLsbD_setWidth, getLsbD_extractLsb', Nat.lt_one_iff]
+    by_cases hk0 : k = 0
+    · simp only [hk0, testBit_zero, decide_true, Nat.add_zero, Bool.true_and]
+      cases x.getLsbD n' <;> simp
+    · simp only [show ¬k = 0 by omega, decide_false, Bool.false_and]
+      symm
+      apply testBit_lt_two_pow ?_
+      have : (x.getLsbD n').toNat ≤ 1 := by
+        cases x.getLsbD n' <;> simp
+      have : 1 < 2 ^ k := by exact Nat.one_lt_two_pow hk0
+      omega
+
+private theorem addRecAux_extractAndExtend_eq_cpop {x : BitVec w} :
+    (extractAndExtend w x).addRecAux w 0#w = x.cpop := by
+  simp only [cpop]
+  apply addRecAux_extractAndExtend_eq_cpopNatRec
+
+private theorem addRecAux_cpopTree {x : BitVec (len * w)} :
+    addRecAux ((cpopTree x).cast (m := 1 * w) (by simp)) 1 0#w = addRecAux x len 0#w := by
+  unfold cpopTree
+  split
+  · case _ h =>
+    subst h
+    simp [addRecAux]
+  · case _ h =>
+    split
+    · case _ h' =>
+      simp only [addRecAux_succ, Nat.zero_mul, BitVec.zero_add, addRecAux_zero, h']
+      ext; simp
+    · rw [addRecAux_cpopTree]
+      apply BitVec.addRecAux_eq_of (x := cpopLayer x 0#(0 * w) (by omega)) (y := x)
+      · rfl
+      · intros j hj
+        simp [extractLsb'_cpopLayer]
+termination_by len
+
+private theorem addRecAux_eq_cpopTree {x : BitVec (len * w)} :
+    x.addRecAux len 0#w = (x.cpopTree).cast (by simp) := by
+  rw [← addRecAux_cpopTree, addRecAux_succ, Nat.zero_mul, BitVec.zero_add, addRecAux_zero]
+  ext k hk
+  simp [← getLsbD_eq_getElem, hk]
+
+theorem cpop_eq_cpopRec {x : BitVec w} :
+    BitVec.cpop x = BitVec.cpopRec x := by
+  unfold BitVec.cpopRec
+  split
+  · simp [← addRecAux_extractAndExtend_eq_cpop, addRecAux_eq_cpopTree (x := extractAndExtend w x)]
+  · split
+    · ext k hk
+      cases hx : x.getLsbD 0
+      <;> simp [hx, cpop, ← getLsbD_eq_getElem, show k = 0 by omega, show w = 1 by omega]
+    · have hw : w = 0 := by omega
+      subst hw
+      simp [of_length_zero]
+
 end BitVec
--- a/src/Init/Data/BitVec/Lemmas.lean
+++ b/src/Init/Data/BitVec/Lemmas.lean
@@ -1198,7 +1198,7 @@ let x' = x.extractLsb' 7 5  =   _ _ 9 8 7
      (decide (0 < len) &&
      (decide (start + len ≤ w) &&
      x.getMsbD (w - (start + len)))) := by
-  simp [BitVec.msb, getMsbD_extractLsb']; rfl
+  simp [BitVec.msb, getMsbD_extractLsb']

@[simp, grind =] theorem getElem_extract {hi lo : Nat} {x : BitVec n} {i : Nat} (h : i < hi - lo + 1) :
    (extractLsb hi lo x)[i] = getLsbD x (lo+i) := by
@@ -1234,7 +1234,7 @@ let x' = x.extractLsb' 7 5  =   _ _ 9 8 7

@[simp, grind =] theorem msb_extractLsb {hi lo : Nat} {x : BitVec w} :
    (extractLsb hi lo x).msb = (decide (max hi lo < w) && x.getMsbD (w - 1 - max hi lo)) := by
-  simp [BitVec.msb]; rfl
+  simp [BitVec.msb]

 theorem extractLsb'_eq_extractLsb {w : Nat} (x : BitVec w) (start len : Nat) (h : len > 0) :
    x.extractLsb' start len = (x.extractLsb (len - 1 + start) start).cast (by omega) := by
@@ -2784,7 +2784,14 @@ theorem msb_append {x : BitVec w} {y : BitVec v} :
@[simp] theorem append_zero_width (x : BitVec w) (y : BitVec 0) : x ++ y = x := by
  ext i ih
  rw [getElem_append] -- Why does this not work with `simp [getElem_append]`?
-  simp; rfl
+  simp
+
+theorem append_of_zero_width (x : BitVec w) (y : BitVec v) (h : w = 0) :
+    (x ++ y) = y.cast (by simp [h]) := by
+  ext i ih
+  subst h
+  simp [← getLsbD_eq_getElem, getLsbD_append]
+  omega

 set_option backward.isDefEq.respectTransparency false in
@[grind =]
@@ -3013,6 +3020,34 @@ theorem extractLsb'_append_extractLsb'_eq_extractLsb' {x : BitVec w} (h : start
  congr 1
  omega

+theorem append_extractLsb'_of_lt {x : BitVec (x_len * w)} :
+    (x.extractLsb' ((x_len - 1) * w) w ++ x.extractLsb' 0 ((x_len - 1) * w)).cast hcast = x := by
+  ext i hi
+  simp only [getElem_cast, getElem_append, getElem_extractLsb', Nat.zero_add, dite_eq_ite]
+  rw [← getLsbD_eq_getElem, ite_eq_left_iff, Nat.not_lt]
+  intros
+  simp only [show (x_len - 1) * w + (i - (x_len - 1) * w) = i by omega]
+
+
+theorem extractLsb'_append_of_lt {x : BitVec (k * w)} {y : BitVec w} (hlt : i < k) :
+    extractLsb' (i * w) w (y ++ x) = extractLsb' (i * w) w x := by
+  ext j hj
+  simp only [← getLsbD_eq_getElem, getLsbD_extractLsb', hj, decide_true, getLsbD_append,
+    Bool.true_and, ite_eq_left_iff, Nat.not_lt]
+  intros h
+  by_cases hw0 : w = 0
+  · subst hw0
+    simp
+  · have : i * w ≤ (k - 1) * w := Nat.mul_le_mul_right w (by omega)
+    have h' : i * w + j < (k - 1 + 1) * w := by simp [Nat.add_mul]; omega
+    rw [Nat.sub_one_add_one (by omega)] at h'
+    omega
+
+theorem extractLsb'_append_of_eq {x : BitVec (k * w)} {y : BitVec w} (heq : i = k) :
+    extractLsb' (i * w) w (y ++ x) = y := by
+  ext j hj
+  simp [← getLsbD_eq_getElem, getLsbD_append, hj, heq]
+
 /-- Combine adjacent `~~~ (extractLsb _)'` operations into a single `~~~ (extractLsb _)'`. -/
 theorem not_extractLsb'_append_not_extractLsb'_eq_not_extractLsb' {x : BitVec w} (h : start₂ = start₁ + len₁) :
    (~~~ (x.extractLsb' start₂ len₂) ++ ~~~ (x.extractLsb' start₁ len₁)) =
@@ -5292,7 +5327,6 @@ theorem and_one_eq_setWidth_ofBool_getLsbD {x : BitVec w} :
 theorem replicate_zero {x : BitVec w} : x.replicate 0 = 0#0 := by
  simp [replicate]

-set_option backward.isDefEq.respectTransparency false in
@[simp, grind =]
 theorem replicate_one {w : Nat} {x : BitVec w} :
    (x.replicate 1) = x.cast (by rw [Nat.mul_one]) := by
@@ -5344,7 +5378,6 @@ theorem append_assoc {x₁ : BitVec w₁} {x₂ : BitVec w₂} {x₃ : BitVec w
 theorem append_assoc' {x₁ : BitVec w₁} {x₂ : BitVec w₂} {x₃ : BitVec w₃} :
    (x₁ ++ (x₂ ++ x₃)) = ((x₁ ++ x₂) ++ x₃).cast (by omega) := by simp [append_assoc]

-set_option backward.isDefEq.respectTransparency false in
 theorem replicate_append_self {x : BitVec w} :
    x ++ x.replicate n = (x.replicate n ++ x).cast (by omega) := by
  induction n with
--- a/src/Init/Data/Bool.lean
+++ b/src/Init/Data/Bool.lean
@@ -629,6 +629,7 @@ export Bool (cond_eq_if cond_eq_ite xor and or not)
 This should not be turned on globally as an instance because it degrades performance in Mathlib,
 but may be used locally.
 -/
+@[implicit_reducible]
 def boolPredToPred : Coe (α → Bool) (α  → Prop) where
  coe r := fun a => Eq (r a) true

@@ -636,7 +637,7 @@ def boolPredToPred : Coe (α → Bool) (α  → Prop) where
 This should not be turned on globally as an instance because it degrades performance in Mathlib,
 but may be used locally.
 -/
-@[expose, instance_reducible] def boolRelToRel : Coe (α → α → Bool) (α → α → Prop) where
+@[expose, implicit_reducible] def boolRelToRel : Coe (α → α → Bool) (α → α → Prop) where
  coe r := fun a b => Eq (r a b) true

 /-! ### subtypes -/
--- a/src/Init/Data/ByteArray/Basic.lean
+++ b/src/Init/Data/ByteArray/Basic.lean
@@ -469,5 +469,3 @@ def prevn : Iterator → Nat → Iterator

 end Iterator
 end ByteArray
-
-instance : ToString ByteArray := ⟨fun bs => bs.toList.toString⟩
--- a/src/Init/Data/Char/Basic.lean
+++ b/src/Init/Data/Char/Basic.lean
@@ -129,6 +129,14 @@ The ASCII digits are the following: `0123456789`.
@[inline] def isDigit (c : Char) : Bool :=
  c.val ≥ '0'.val && c.val ≤ '9'.val

+/--
+Returns `true` if the character is an ASCII hexadecimal digit.
+
+The ASCII hexadecimal digits are the following: `0123456789abcdefABCDEF`.
+-/
+@[inline] def isHexDigit (c : Char) : Bool :=
+  c.isDigit || (c.val ≥ 'a'.val && c.val ≤ 'f'.val) || (c.val ≥ 'A'.val && c.val ≤ 'F'.val)
+
 /--
 Returns `true` if the character is an ASCII letter or digit.

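A minimal sketch of how the new predicate behaves, assuming nothing beyond `Char.isDigit`, `String.toList`, and `List.all`. The name `isHexDigitSketch` is hypothetical: it is a local copy of the body added above, so the snippet does not depend on `Char.isHexDigit` already being present in the toolchain at hand.

```lean
/- Hypothetical local copy of the predicate added in this hunk. -/
def isHexDigitSketch (c : Char) : Bool :=
  c.isDigit || (c.val ≥ 'a'.val && c.val ≤ 'f'.val) || (c.val ≥ 'A'.val && c.val ≤ 'F'.val)

/- Every character of "1aF" is an ASCII hexadecimal digit. -/
#eval "1aF".toList.all isHexDigitSketch   -- true

/- 'x' and 'G' are outside the accepted ranges. -/
#eval "0x1G".toList.all isHexDigitSketch  -- false
```

The check is purely range-based on code points, so non-ASCII digits and letters are rejected.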
--- a/src/Init/Data/Char/Lemmas.lean
+++ b/src/Init/Data/Char/Lemmas.lean
@@ -50,7 +50,7 @@ instance ltTrans : Trans (· < · : Char → Char → Prop) (· < ·) (· < ·)
  trans := Char.lt_trans

 -- This instance is useful while setting up instances for `String`.
-@[instance_reducible]
+@[implicit_reducible]
 def notLTTrans : Trans (¬ · < · : Char → Char → Prop) (¬ · < ·) (¬ · < ·) where
  trans h₁ h₂ := by simpa using Char.le_trans (by simpa using h₂) (by simpa using h₁)

@@ -62,7 +62,7 @@ instance ltTrichotomous : Std.Trichotomous (· < · : Char → Char → Prop) wh
  trichotomous _ _ h₁ h₂ := Char.le_antisymm (by simpa using h₂) (by simpa using h₁)

@[deprecated ltTrichotomous (since := "2025-10-27")]
-def notLTAntisymm : Std.Antisymm (¬ · < · : Char → Char → Prop) where
+theorem notLTAntisymm : Std.Antisymm (¬ · < · : Char → Char → Prop) where
  antisymm := Char.ltTrichotomous.trichotomous

 instance ltAsymm : Std.Asymm (· < · : Char → Char → Prop) where
@@ -73,7 +73,7 @@ instance leTotal : Std.Total (· ≤ · : Char → Char → Prop) where

 -- This instance is useful while setting up instances for `String`.
@[deprecated ltAsymm (since := "2025-08-01")]
-def notLTTotal : Std.Total (¬ · < · : Char → Char → Prop) where
+theorem notLTTotal : Std.Total (¬ · < · : Char → Char → Prop) where
  total := fun x y => by simpa using Char.le_total y x

@[simp] theorem ofNat_toNat (c : Char) : Char.ofNat c.toNat = c := by
--- a/src/Init/Data/Fin/Fold.lean
+++ b/src/Init/Data/Fin/Fold.lean
@@ -164,7 +164,7 @@ theorem foldlM_add [Monad m] [LawfulMonad m] (f : α → Fin (n + k) → m α) :
    simp
  | succ k ih =>
    funext x
-    simp [foldlM_succ_last, ← Nat.add_assoc, ih]; rfl
+    simp [foldlM_succ_last, ← Nat.add_assoc, ih]

 /-! ### foldrM -/

@@ -222,7 +222,7 @@ theorem foldrM_add [Monad m] [LawfulMonad m] (f : Fin (n + k) → α → m α) :
    simp
  | succ k ih =>
    funext x
-    simp [foldrM_succ_last, ← Nat.add_assoc, ih]; rfl
+    simp [foldrM_succ_last, ← Nat.add_assoc, ih]

 /-! ### foldl -/

@@ -268,7 +268,7 @@ theorem foldl_add (f : α → Fin (n + m) → α) (x) :
        (foldl n (fun x i => f x (i.castLE (Nat.le_add_right n m))) x):= by
  induction m with
  | zero => simp
-  | succ m ih => simp [foldl_succ_last, ih, ← Nat.add_assoc]; rfl
+  | succ m ih => simp [foldl_succ_last, ih, ← Nat.add_assoc]

 theorem foldl_eq_foldlM (f : α → Fin n → α) (x) :
    foldl n f x = (foldlM (m := Id) n (pure <| f · ·) x).run := by
@@ -321,7 +321,7 @@ theorem foldr_add (f : Fin (n + m) → α → α) (x) :
        (foldr m (fun i => f (i.natAdd n)) x) := by
  induction m generalizing x with
  | zero => simp
-  | succ m ih => simp [foldr_succ_last, ih, ← Nat.add_assoc]; rfl
+  | succ m ih => simp [foldr_succ_last, ih, ← Nat.add_assoc]

 theorem foldr_eq_foldrM (f : Fin n → α → α) (x) :
    foldr n f x = (foldrM (m := Id) n (pure <| f · ·) x).run := by
--- a/src/Init/Data/Fin/Iterate.lean
+++ b/src/Init/Data/Fin/Iterate.lean
@@ -69,7 +69,7 @@ private theorem hIterateFrom_elim {P : Nat → Sort _}(Q : ∀(i : Nat), P i →
    have g : ¬ (i < n) := by simp at p; simp [p]
    have r : Q n (_root_.cast (congrArg P p) s) :=
      @Eq.rec Nat i (fun k eq => Q k (_root_.cast (congrArg P eq) s)) init n p
-    simp only [g, r, dite_false]
+    simp only [g, dite_false]; exact r
  | succ j inv =>
    unfold hIterateFrom
    have d : Nat.succ i + j = n := by simp [Nat.succ_add]; exact p
--- a/src/Init/Data/Fin/Lemmas.lean
+++ b/src/Init/Data/Fin/Lemmas.lean
@@ -123,7 +123,7 @@ For example, for `x : Fin k` and `n : Nat`,
 it causes `x < n` to be elaborated as `x < ↑n` rather than `↑x < n`,
 silently introducing wraparound arithmetic.
 -/
-@[expose, instance_reducible]
+@[expose, implicit_reducible]
 def instNatCast (n : Nat) [NeZero n] : NatCast (Fin n) where
  natCast a := Fin.ofNat n a

@@ -145,7 +145,7 @@ This is not a global instance, but may be activated locally via `open Fin.IntCas

 See the doc-string for `Fin.NatCast.instNatCast` for more details.
 -/
-@[expose, instance_reducible]
+@[expose, implicit_reducible]
 def instIntCast (n : Nat) [NeZero n] : IntCast (Fin n) where
  intCast := Fin.intCast

--- a/src/Init/Data/FloatArray/Basic.lean
+++ b/src/Init/Data/FloatArray/Basic.lean
@@ -9,6 +9,7 @@ prelude
 public import Init.Data.Float
 import Init.Ext
 public import Init.GetElem
+public import Init.Data.ToString.Extra

 public section
 universe u
--- a/src/Init/Data/Format/Basic.lean
+++ b/src/Init/Data/Format/Basic.lean
@@ -414,7 +414,7 @@ Renders a `Format` to a string.
 -/
 def pretty (f : Format) (width : Nat := defWidth) (indent : Nat := 0) (column := 0) : String :=
  let act : StateM State Unit := prettyM f width indent
-  State.out <| act (State.mk "" column) |>.snd
+  State.out <| act.run (State.mk "" column) |>.snd

 end Format

--- a/src/Init/Data/Hashable.lean
+++ b/src/Init/Data/Hashable.lean
@@ -6,7 +6,7 @@ Authors: Leonardo de Moura
 module

 prelude
-public import Init.Data.String.PosRaw
+import Init.Data.Array.Basic
 public import Init.Data.UInt.Basic

 public section
@@ -15,9 +15,6 @@ universe u
 instance : Hashable Nat where
  hash n := UInt64.ofNat n

-instance : Hashable String.Pos.Raw where
-  hash p := UInt64.ofNat p.byteIdx
-
 instance [Hashable α] [Hashable β] : Hashable (α × β) where
  hash | (a, b) => mixHash (hash a) (hash b)

--- a/src/Init/Data/Int/Pow.lean
+++ b/src/Init/Data/Int/Pow.lean
@@ -118,16 +118,19 @@ theorem toNat_pow_of_nonneg {x : Int} (h : 0 ≤ x) (k : Nat) : (x ^ k).toNat =
  | succ k ih =>
    rw [Int.pow_succ, Int.toNat_mul (Int.pow_nonneg h) h, ih, Nat.pow_succ]

-protected theorem sq_nonnneg (m : Int) : 0 ≤ m ^ 2 := by
+protected theorem sq_nonneg (m : Int) : 0 ≤ m ^ 2 := by
  rw [Int.pow_succ, Int.pow_one]
  cases m
  · apply Int.mul_nonneg <;> simp
  · apply Int.mul_nonneg_of_nonpos_of_nonpos <;> exact negSucc_le_zero _

+@[deprecated Int.sq_nonneg (since := "2026-03-13")]
+protected theorem sq_nonnneg (m : Int) : 0 ≤ m ^ 2 := Int.sq_nonneg m
+
 protected theorem pow_nonneg_of_even {m : Int} {n : Nat} (h : n % 2 = 0) : 0 ≤ m ^ n := by
  rw [← Nat.mod_add_div n 2, h, Nat.zero_add, Int.pow_mul]
  apply Int.pow_nonneg
-  exact Int.sq_nonnneg m
+  exact Int.sq_nonneg m

 protected theorem neg_pow {m : Int} {n : Nat} : (-m)^n = (-1)^(n % 2) * m^n := by
  rw [Int.neg_eq_neg_one_mul, Int.mul_pow]
--- a/src/Init/Data/Iterators/Basic.lean
+++ b/src/Init/Data/Iterators/Basic.lean
@@ -315,7 +315,7 @@ of another state. Having this proof bundled up with the step is important for te
 See `IterM.Step` and `Iter.Step` for the concrete choice of the plausibility predicate.
 -/
@[expose]
-def PlausibleIterStep (IsPlausibleStep : IterStep α β → Prop) := Subtype IsPlausibleStep
+abbrev PlausibleIterStep (IsPlausibleStep : IterStep α β → Prop) := Subtype IsPlausibleStep

 /--
 Match pattern for the `yield` case. See also `IterStep.yield`.
@@ -379,6 +379,8 @@ class Iterator (α : Type w) (m : Type w → Type w') (β : outParam (Type w)) w
  -/
  step : (it : IterM (α := α) m β) → m (Shrink <| PlausibleIterStep <| IsPlausibleStep it)

+attribute [reducible] Iterator.IsPlausibleStep
+
 section Monadic

 /-- The constructor has been renamed. -/
@@ -424,7 +426,6 @@ theorem IterM.toIter_mk {α β} {it : α} :
 Asserts that certain step is plausibly the successor of a given iterator. What "plausible" means
 is up to the `Iterator` instance but it should be strong enough to allow termination proofs.
 -/
-@[expose]
 abbrev IterM.IsPlausibleStep {α : Type w} {m : Type w → Type w'} {β : Type w} [Iterator α m β] :
    IterM (α := α) m β → IterStep (IterM (α := α) m β) β → Prop :=
  Iterator.IsPlausibleStep (α := α) (m := m)
@@ -493,7 +494,7 @@ Asserts that certain step is plausibly the successor of a given iterator. What "
 is up to the `Iterator` instance but it should be strong enough to allow termination proofs.
 -/
@[expose]
-def Iter.IsPlausibleStep {α : Type w} {β : Type w} [Iterator α Id β]
+abbrev Iter.IsPlausibleStep {α : Type w} {β : Type w} [Iterator α Id β]
    (it : Iter (α := α) β) (step : IterStep (Iter (α := α) β) β) : Prop :=
  it.toIterM.IsPlausibleStep (step.mapIterator Iter.toIterM)

@@ -549,7 +550,7 @@ The type of the step object returned by `Iter.step`, containing an `IterStep`
 and a proof that this is a plausible step for the given iterator.
 -/
@[expose]
-def Iter.Step {α : Type w} {β : Type w} [Iterator α Id β] (it : Iter (α := α) β) :=
+abbrev Iter.Step {α : Type w} {β : Type w} [Iterator α Id β] (it : Iter (α := α) β) :=
  PlausibleIterStep (Iter.IsPlausibleStep it)

 /--
--- a/src/Init/Data/Iterators/Combinators.lean
+++ b/src/Init/Data/Iterators/Combinators.lean
@@ -6,6 +6,7 @@ Authors: Paul Reichert
 module

 prelude
+public import Init.Data.Iterators.Combinators.Append
 public import Init.Data.Iterators.Combinators.Monadic
 public import Init.Data.Iterators.Combinators.FilterMap
 public import Init.Data.Iterators.Combinators.FlatMap
--- a/src/Init/Data/Iterators/Combinators/Append.lean
+++ b/src/Init/Data/Iterators/Combinators/Append.lean
@@ -0,0 +1,79 @@
+/-
+Copyright (c) 2026 Lean FRO, LLC. All rights reserved.
+Released under Apache 2.0 license as described in the file LICENSE.
+Authors: Paul Reichert
+-/
+module
+
+prelude
+public import Init.Data.Iterators.Combinators.Monadic.Append
+
+public section
+
+namespace Std
+open Std.Iterators Std.Iterators.Types
+
+/--
+Given two iterators `it₁` and `it₂`, `it₁.append it₂` is an iterator that first outputs all values
+of `it₁` in order and then all values of `it₂` in order.
+
+**Marble diagram:**
+
+```text
+it₁                 ---a----b---c--⊥
+it₂                                 --d--e--⊥
+it₁.append it₂      ---a----b---c-----d--e--⊥
+```
+
+**Termination properties:**
+
+* `Finite` instance: only if `it₁` and `it₂` are finite
+* `Productive` instance: only if `it₁` and `it₂` are productive
+
+Note: If `it₁` is not finite, then `it₁.append it₂` can be productive while `it₂` is not.
+The standard library does not provide a `Productive` instance for this case.
+
+**Performance:**
+
+This combinator incurs an additional O(1) cost with each output of `it₁` and `it₂`.
+-/
+@[inline, expose]
+def Iter.append {α₁ : Type w} {α₂ : Type w} {β : Type w}
+    [Iterator α₁ Id β] [Iterator α₂ Id β]
+    (it₁ : Iter (α := α₁) β) (it₂ : Iter (α := α₂) β) :
+    Iter (α := Append α₁ α₂ Id β) β :=
+  (it₁.toIterM.append it₂.toIterM).toIter
+
+/--
+This combinator is only useful for advanced use cases.
+
+Given an iterator `it₂`, returns an iterator that behaves exactly like `it₂` but is of the same
+type as `it₁.append it₂` (after `it₁` has been exhausted).
+This is useful for constructing intermediate states of the append iterator.
+
+**Marble diagram:**
+
+```text
+it₂                                 --a--b--⊥
+Iter.Intermediate.appendSnd α₁ it₂  --a--b--⊥
+```
+
+**Termination properties:**
+
+* `Finite` instance: only if `it₂` and iterators of type `α₁` are finite
+* `Productive` instance: only if `it₂` and iterators of type `α₁` are productive
+
+Note: If iterators of type `α₁` are not finite, then `appendSnd α₁ it₂` can be productive while `it₂` is not.
+The standard library does not provide a `Productive` instance for this case.
+
+**Performance:**
+
+This combinator incurs an additional O(1) cost with each output of `it₂`.
+-/
+@[inline, expose]
+def Iter.Intermediate.appendSnd {α₂ : Type w} {β : Type w}
+    [Iterator α₂ Id β] (α₁ : Type w) (it₂ : Iter (α := α₂) β) :
+    Iter (α := Append α₁ α₂ Id β) β :=
+  (IterM.Intermediate.appendSnd α₁ it₂.toIterM).toIter
+
+end Std
--- a/src/Init/Data/Iterators/Combinators/Monadic.lean
+++ b/src/Init/Data/Iterators/Combinators/Monadic.lean
@@ -6,6 +6,7 @@ Authors: Paul Reichert
 module

 prelude
+public import Init.Data.Iterators.Combinators.Monadic.Append
 public import Init.Data.Iterators.Combinators.Monadic.FilterMap
 public import Init.Data.Iterators.Combinators.Monadic.FlatMap
 public import Init.Data.Iterators.Combinators.Monadic.Take
--- a/src/Init/Data/Iterators/Combinators/Monadic/Append.lean
+++ b/src/Init/Data/Iterators/Combinators/Monadic/Append.lean
@@ -0,0 +1,261 @@
+/-
+Copyright (c) 2026 Lean FRO, LLC. All rights reserved.
+Released under Apache 2.0 license as described in the file LICENSE.
+Authors: Paul Reichert
+-/
+module
+
+prelude
+public import Init.Data.Iterators.Consumers.Monadic.Loop
+public import Init.Classical
+import Init.Data.Option.Lemmas
+import Init.ByCases
+import Init.Omega
+
+public section
+
+/-!
+This module provides the iterator combinator `IterM.append`.
+-/
+
+namespace Std
+
+variable {α : Type w} {m : Type w → Type w'} {β : Type w}
+
+/--
+The internal state of the `IterM.append` iterator combinator.
+-/
+inductive Iterators.Types.Append (α₁ α₂ : Type w) (m : Type w → Type w') (β : Type w) where
+  | fst : IterM (α := α₁) m β → IterM (α := α₂) m β → Append α₁ α₂ m β
+  | snd : IterM (α := α₂) m β → Append α₁ α₂ m β
+
+open Std.Iterators Std.Iterators.Types
+
+/--
+Given two iterators `it₁` and `it₂`, `it₁.append it₂` is an iterator that first outputs all values
+of `it₁` in order and then all values of `it₂` in order.
+
+**Marble diagram:**
+
+```text
+it₁                 ---a----b---c--⊥
+it₂                                 --d--e--⊥
+it₁.append it₂      ---a----b---c-----d--e--⊥
+```
+
+**Termination properties:**
+
+* `Finite` instance: only if `it₁` and `it₂` are finite
+* `Productive` instance: only if `it₁` and `it₂` are productive
+
+Note: If `it₁` is not finite, then `it₁.append it₂` can be productive while `it₂` is not.
+The standard library does not provide a `Productive` instance for this case.
+
+**Performance:**
+
+This combinator incurs an additional O(1) cost with each output of `it₁` and `it₂`.
+-/
+@[inline, expose]
+def IterM.append [Iterator α₁ m β] [Iterator α₂ m β]
+    (it₁ : IterM (α := α₁) m β) (it₂ : IterM (α := α₂) m β) :=
+  (⟨Iterators.Types.Append.fst it₁ it₂⟩ : IterM m β)
+
+/--
+This combinator is only useful for advanced use cases.
+
+Given an iterator `it₂`, `IterM.Intermediate.appendSnd α₁ it₂` returns an iterator that behaves
+exactly like `it₂` but has the same type as `it₁.append it₂` (after `it₁` has been exhausted).
+This is useful for constructing intermediate states of the append iterator.
+
+**Marble diagram:**
+
+```text
+it₂                                  --a--b--⊥
+IterM.Intermediate.appendSnd α₁ it₂  --a--b--⊥
+```
+
+**Termination properties:**
+
+* `Finite` instance: only if `it₂` and iterators of type `α₁` are finite
+* `Productive` instance: only if `it₂` and iterators of type `α₁` are productive
+
+Note: If iterators of type `α₁` are not finite, then `appendSnd α₁ it₂` can be productive
+while `it₂` is not. The standard library does not provide a `Productive` instance for this case.
+
+**Performance:**
+
+This combinator incurs an additional O(1) cost with each output of `it₂`.
+-/
+@[inline, expose]
+def IterM.Intermediate.appendSnd [Iterator α₂ m β] (α₁ : Type w) (it₂ : IterM (α := α₂) m β) :=
+  (⟨Iterators.Types.Append.snd (α₁ := α₁) it₂⟩ : IterM m β)
+
+namespace Iterators.Types
+
+inductive Append.PlausibleStep [Iterator α₁ m β] [Iterator α₂ m β] :
+    IterM (α := Append α₁ α₂ m β) m β → IterStep (IterM (α := Append α₁ α₂ m β) m β) β → Prop where
+  | fstYield {it₁ : IterM (α := α₁) m β} {it₂ : IterM (α := α₂) m β} :
+    it₁.IsPlausibleStep (.yield it₁' out) → PlausibleStep (it₁.append it₂) (.yield (it₁'.append it₂) out)
+  | fstSkip {it₁ : IterM (α := α₁) m β} {it₂ : IterM (α := α₂) m β} :
+    it₁.IsPlausibleStep (.skip it₁') → PlausibleStep (it₁.append it₂) (.skip (it₁'.append it₂))
+  | fstDone {it₁ : IterM (α := α₁) m β} {it₂ : IterM (α := α₂) m β} :
+    it₁.IsPlausibleStep .done → PlausibleStep (it₁.append it₂) (.skip (IterM.Intermediate.appendSnd α₁ it₂))
+  | sndYield {it₂ : IterM (α := α₂) m β} :
+    it₂.IsPlausibleStep (.yield it₂' out) →
+      PlausibleStep (IterM.Intermediate.appendSnd α₁ it₂) (.yield (IterM.Intermediate.appendSnd α₁ it₂') out)
+  | sndSkip {it₂ : IterM (α := α₂) m β} :
+    it₂.IsPlausibleStep (.skip it₂') →
+      PlausibleStep (IterM.Intermediate.appendSnd α₁ it₂) (.skip (IterM.Intermediate.appendSnd α₁ it₂'))
+  | sndDone {it₂ : IterM (α := α₂) m β} :
+    it₂.IsPlausibleStep .done → PlausibleStep (IterM.Intermediate.appendSnd α₁ it₂) .done
+
+@[inline]
+instance Append.instIterator [Monad m] [Iterator α₁ m β] [Iterator α₂ m β] :
+    Iterator (Append α₁ α₂ m β) m β where
+  IsPlausibleStep := Append.PlausibleStep
+  step
+    | ⟨.fst it₁ it₂⟩ => do
+      match (← it₁.step).inflate with
+      | .yield it₁' out h => return .deflate <| .yield (it₁'.append it₂) out (.fstYield h)
+      | .skip it₁' h => return .deflate <| .skip (it₁'.append it₂) (.fstSkip h)
+      | .done h => return .deflate <| .skip (IterM.Intermediate.appendSnd α₁ it₂) (.fstDone h)
+    | ⟨.snd it₂⟩ => do
+      match (← it₂.step).inflate with
+      | .yield it₂' out h => return .deflate <| .yield (IterM.Intermediate.appendSnd α₁ it₂') out (.sndYield h)
+      | .skip it₂' h => return .deflate <| .skip (IterM.Intermediate.appendSnd α₁ it₂') (.sndSkip h)
+      | .done h => return .deflate <| .done (.sndDone h)
+
+instance Append.instIteratorLoop {n : Type x → Type x'} [Monad m] [Monad n]
+    [Iterator α₁ m β] [Iterator α₂ m β] :
+    IteratorLoop (Append α₁ α₂ m β) m n :=
+  .defaultImplementation
+
+section Finite
+
+variable {α₁ : Type w} {α₂ : Type w} {m : Type w → Type w'} {β : Type w}
+
+variable (α₁ α₂ m β) in
+def Append.Rel [Monad m] [Iterator α₁ m β] [Iterator α₂ m β] [Finite α₁ m] [Finite α₂ m] :
+    IterM (α := Append α₁ α₂ m β) m β → IterM (α := Append α₁ α₂ m β) m β → Prop :=
+  InvImage
+    (Prod.Lex
+      (Option.lt (InvImage IterM.TerminationMeasures.Finite.Rel IterM.finitelyManySteps))
+      (InvImage IterM.TerminationMeasures.Finite.Rel IterM.finitelyManySteps))
+    (fun it => match it.internalState with
+      | .fst it₁ it₂ => (some it₁, it₂)
+      | .snd it₂ => (none, it₂))
+
+theorem Append.rel_of_fst [Monad m] [Iterator α₁ m β] [Iterator α₂ m β]
+    [Finite α₁ m] [Finite α₂ m] {it₁ it₁' : IterM (α := α₁) m β} {it₂ : IterM (α := α₂) m β}
+    (h : it₁'.finitelyManySteps.Rel it₁.finitelyManySteps) :
+    Append.Rel α₁ α₂ m β (it₁'.append it₂) (it₁.append it₂) := by
+  exact Prod.Lex.left _ _ h
+
+theorem Append.rel_fstDone [Monad m] [Iterator α₁ m β] [Iterator α₂ m β]
+    [Finite α₁ m] [Finite α₂ m] {it₁ : IterM (α := α₁) m β} {it₂ : IterM (α := α₂) m β} :
+    Append.Rel α₁ α₂ m β (IterM.Intermediate.appendSnd α₁ it₂) (it₁.append it₂) := by
+  exact Prod.Lex.left _ _ trivial
+
+theorem Append.rel_of_snd [Monad m] [Iterator α₁ m β] [Iterator α₂ m β]
+    [Finite α₁ m] [Finite α₂ m] {it₂ it₂' : IterM (α := α₂) m β}
+    (h : it₂'.finitelyManySteps.Rel it₂.finitelyManySteps) :
+    Append.Rel α₁ α₂ m β (IterM.Intermediate.appendSnd α₁ it₂') (IterM.Intermediate.appendSnd α₁ it₂) := by
+  exact Prod.Lex.right _ h
+
+def Append.instFinitenessRelation [Monad m] [Iterator α₁ m β] [Iterator α₂ m β]
+    [Finite α₁ m] [Finite α₂ m] :
+    FinitenessRelation (Append α₁ α₂ m β) m where
+  Rel := Append.Rel α₁ α₂ m β
+  wf := by
+    apply InvImage.wf
+    refine ⟨fun (a, b) => Prod.lexAccessible (WellFounded.apply ?_ a) (WellFounded.apply ?_) b⟩
+    · exact Option.wellFounded_lt <| InvImage.wf _ WellFoundedRelation.wf
+    · exact InvImage.wf _ WellFoundedRelation.wf
+  subrelation {it it'} h := by
+    obtain ⟨step, h, h'⟩ := h
+    cases h' <;> cases h
+    case fstYield =>
+      apply Append.rel_of_fst
+      exact IterM.TerminationMeasures.Finite.rel_of_yield ‹_›
+    case fstSkip =>
+      apply Append.rel_of_fst
+      exact IterM.TerminationMeasures.Finite.rel_of_skip ‹_›
+    case fstDone =>
+      exact Append.rel_fstDone
+    case sndYield =>
+      apply Append.rel_of_snd
+      exact IterM.TerminationMeasures.Finite.rel_of_yield ‹_›
+    case sndSkip =>
+      apply Append.rel_of_snd
+      exact IterM.TerminationMeasures.Finite.rel_of_skip ‹_›
+
+@[no_expose]
+public instance Append.instFinite [Monad m] [Iterator α₁ m β] [Iterator α₂ m β]
+    [Finite α₁ m] [Finite α₂ m] : Finite (Append α₁ α₂ m β) m :=
+  .of_finitenessRelation instFinitenessRelation
+
+end Finite
+
+section Productive
+
+variable {α₁ : Type w} {α₂ : Type w} {m : Type w → Type w'} {β : Type w}
+
+variable (α₁ α₂ m β) in
+def Append.ProductiveRel [Monad m] [Iterator α₁ m β] [Iterator α₂ m β]
+    [Productive α₁ m] [Productive α₂ m] :
+    IterM (α := Append α₁ α₂ m β) m β → IterM (α := Append α₁ α₂ m β) m β → Prop :=
+  InvImage
+    (Prod.Lex
+      (Option.lt (InvImage IterM.TerminationMeasures.Productive.Rel IterM.finitelyManySkips))
+      (InvImage IterM.TerminationMeasures.Productive.Rel IterM.finitelyManySkips))
+    (fun it => match it.internalState with
+      | .fst it₁ it₂ => (some it₁, it₂)
+      | .snd it₂ => (none, it₂))
+
+theorem Append.productiveRel_of_fst [Monad m] [Iterator α₁ m β] [Iterator α₂ m β]
+    [Productive α₁ m] [Productive α₂ m] {it₁ it₁' : IterM (α := α₁) m β}
+    {it₂ : IterM (α := α₂) m β}
+    (h : it₁'.finitelyManySkips.Rel it₁.finitelyManySkips) :
+    Append.ProductiveRel α₁ α₂ m β (it₁'.append it₂) (it₁.append it₂) := by
+  exact Prod.Lex.left _ _ h
+
+theorem Append.productiveRel_fstDone [Monad m] [Iterator α₁ m β] [Iterator α₂ m β]
+    [Productive α₁ m] [Productive α₂ m] {it₁ : IterM (α := α₁) m β}
+    {it₂ : IterM (α := α₂) m β} :
+    Append.ProductiveRel α₁ α₂ m β (IterM.Intermediate.appendSnd α₁ it₂) (it₁.append it₂) := by
+  exact Prod.Lex.left _ _ trivial
+
+theorem Append.productiveRel_of_snd [Monad m] [Iterator α₁ m β] [Iterator α₂ m β]
+    [Productive α₁ m] [Productive α₂ m] {it₂ it₂' : IterM (α := α₂) m β}
+    (h : it₂'.finitelyManySkips.Rel it₂.finitelyManySkips) :
+    Append.ProductiveRel α₁ α₂ m β
+      (IterM.Intermediate.appendSnd α₁ it₂') (IterM.Intermediate.appendSnd α₁ it₂) := by
+  exact Prod.Lex.right _ h
+
+private def Append.instProductivenessRelation [Monad m] [Iterator α₁ m β] [Iterator α₂ m β]
+    [Productive α₁ m] [Productive α₂ m] :
+    ProductivenessRelation (Append α₁ α₂ m β) m where
+  Rel := Append.ProductiveRel α₁ α₂ m β
+  wf := by
+    apply InvImage.wf
+    refine ⟨fun (a, b) => Prod.lexAccessible (WellFounded.apply ?_ a) (WellFounded.apply ?_) b⟩
+    · exact Option.wellFounded_lt <| InvImage.wf _ WellFoundedRelation.wf
+    · exact InvImage.wf _ WellFoundedRelation.wf
+  subrelation {it it'} h := by
+    cases h
+    case fstSkip =>
+      apply Append.productiveRel_of_fst
+      exact IterM.TerminationMeasures.Productive.rel_of_skip ‹_›
+    case fstDone =>
+      exact Append.productiveRel_fstDone
+    case sndSkip =>
+      apply Append.productiveRel_of_snd
+      exact IterM.TerminationMeasures.Productive.rel_of_skip ‹_›
+
+instance Append.instProductive [Monad m] [Iterator α₁ m β] [Iterator α₂ m β]
+    [Productive α₁ m] [Productive α₂ m] : Productive (Append α₁ α₂ m β) m :=
+  .of_productivenessRelation instProductivenessRelation
+
+end Productive
+
+end Std.Iterators.Types
--- a/src/Init/Data/Iterators/Combinators/Monadic/FlatMap.lean
+++ b/src/Init/Data/Iterators/Combinators/Monadic/FlatMap.lean
@@ -362,8 +362,7 @@ def Flatten.instProductivenessRelation [Monad m] [Iterator α m (IterM (α := α
    case innerDone =>
      apply Flatten.productiveRel_of_right₂

-@[no_expose]
-public def Flatten.instProductive [Monad m] [Iterator α m (IterM (α := α₂) m β)] [Iterator α₂ m β]
+public theorem Flatten.instProductive [Monad m] [Iterator α m (IterM (α := α₂) m β)] [Iterator α₂ m β]
    [Finite α m] [Productive α₂ m] : Productive (Flatten α α₂ β m) m :=
  .of_productivenessRelation instProductivenessRelation

--- a/src/Init/Data/Iterators/Consumers/Loop.lean
+++ b/src/Init/Data/Iterators/Consumers/Loop.lean
@@ -35,7 +35,7 @@ A `ForIn'` instance for iterators. Its generic membership relation is not easy t
 so this is not marked as `instance`. This way, more convenient instances can be built on top of it
 or future library improvements will make it more comfortable.
 -/
-@[always_inline, inline]
+@[always_inline, inline, expose, implicit_reducible]
 def Iter.instForIn' {α : Type w} {β : Type w} {n : Type x → Type x'} [Monad n]
    [Iterator α Id β] [IteratorLoop α Id n] :
    ForIn' n (Iter (α := α) β) β ⟨fun it out => it.IsPlausibleIndirectOutput out⟩ where
@@ -53,7 +53,7 @@ instance (α : Type w) (β : Type w) (n : Type x → Type x') [Monad n]
 /--
 An implementation of `for h : ... in ... do ...` notation for partial iterators.
 -/
-@[always_inline, inline]
+@[always_inline, inline, expose, implicit_reducible]
 def Iter.Partial.instForIn' {α : Type w} {β : Type w} {n : Type x → Type x'} [Monad n]
    [Iterator α Id β] [IteratorLoop α Id n] :
    ForIn' n (Iter.Partial (α := α) β) β ⟨fun it out => it.it.IsPlausibleIndirectOutput out⟩ where
@@ -71,7 +71,7 @@ instance (α : Type w) (β : Type w) (n : Type x → Type x') [Monad n]
 A `ForIn'` instance for iterators that is guaranteed to terminate after finitely many steps.
 It is not marked as an instance because the membership predicate is difficult to work with.
 -/
-@[always_inline, inline]
+@[always_inline, inline, expose, implicit_reducible]
 def Iter.Total.instForIn' {α : Type w} {β : Type w} {n : Type x → Type x'} [Monad n]
    [Iterator α Id β] [IteratorLoop α Id n] [Finite α Id] :
    ForIn' n (Iter.Total (α := α) β) β ⟨fun it out => it.it.IsPlausibleIndirectOutput out⟩ where
--- a/src/Init/Data/Iterators/Consumers/Monadic/Loop.lean
+++ b/src/Init/Data/Iterators/Consumers/Monadic/Loop.lean
@@ -159,7 +159,7 @@ This is the default implementation of the `IteratorLoop` class.
 It simply iterates through the iterator using `IterM.step`. For certain iterators, more efficient
 implementations are possible and should be used instead.
 -/
-@[always_inline, inline, expose]
+@[always_inline, inline, expose, implicit_reducible]
 def IteratorLoop.defaultImplementation {α : Type w} {m : Type w → Type w'} {n : Type x → Type x'}
    [Monad n] [Iterator α m β] :
    IteratorLoop α m n where
@@ -211,7 +211,7 @@ theorem IteratorLoop.wellFounded_of_productive {α β : Type w} {m : Type w →
 /--
 This `ForIn'`-style loop construct traverses a finite iterator using an `IteratorLoop` instance.
 -/
-@[always_inline, inline]
+@[always_inline, inline, expose, implicit_reducible]
 def IteratorLoop.finiteForIn' {m : Type w → Type w'} {n : Type x → Type x'}
    {α : Type w} {β : Type w} [Iterator α m β] [IteratorLoop α m n] [Monad n]
    (lift : ∀ γ δ, (γ → n δ) → m γ → n δ) :
@@ -224,7 +224,7 @@ A `ForIn'` instance for iterators. Its generic membership relation is not easy t
 so this is not marked as `instance`. This way, more convenient instances can be built on top of it
 or future library improvements will make it more comfortable.
 -/
-@[always_inline, inline]
+@[always_inline, inline, expose, implicit_reducible]
 def IterM.instForIn' {m : Type w → Type w'} {n : Type w → Type w''}
    {α : Type w} {β : Type w} [Iterator α m β] [IteratorLoop α m n] [Monad n]
    [MonadLiftT m n] :
@@ -239,7 +239,7 @@ instance IterM.instForInOfIteratorLoop {m : Type w → Type w'} {n : Type w →
  instForInOfForIn'

 /-- Internal implementation detail of the iterator library. -/
-@[always_inline, inline]
+@[always_inline, inline, expose, implicit_reducible]
 def IterM.Partial.instForIn' {m : Type w → Type w'} {n : Type w → Type w''}
    {α : Type w} {β : Type w} [Iterator α m β] [IteratorLoop α m n] [MonadLiftT m n] [Monad n] :
    ForIn' n (IterM.Partial (α := α) m β) β ⟨fun it out => it.it.IsPlausibleIndirectOutput out⟩ where
@@ -247,7 +247,7 @@ def IterM.Partial.instForIn' {m : Type w → Type w'} {n : Type w → Type w''}
    haveI := @IterM.instForIn'; forIn' it.it init f

 /-- Internal implementation detail of the iterator library. -/
-@[always_inline, inline]
+@[always_inline, inline, expose, implicit_reducible]
 def IterM.Total.instForIn' {m : Type w → Type w'} {n : Type w → Type w''}
    {α : Type w} {β : Type w} [Iterator α m β] [IteratorLoop α m n] [MonadLiftT m n] [Monad n]
    [Finite α m] :